Thread-pool-cpp - High performance C++11 thread pool




  • It is highly scalable and fast.
  • It is header only.
  • No external dependencies, only standard library needed.
  • It implements both work-stealing and work-distribution balancing strategies.
  • It implements a cooperative scheduling strategy.

Example run: posting a job to this thread pool is much faster than posting one to a boost::asio-based thread pool.

Benchmark job reposting
***thread pool cpp***
reposted 1000001 in 61.6754 ms
reposted 1000001 in 62.0187 ms
reposted 1000001 in 62.8785 ms
reposted 1000001 in 70.2714 ms
***asio thread pool***
reposted 1000001 in 1381.58 ms
reposted 1000001 in 1390.35 ms
reposted 1000001 in 1391.84 ms
reposted 1000001 in 1393.19 ms

See benchmark/benchmark.cpp for benchmark code.

All code except MPMCBoundedQueue is under MIT license.

  • Make library easy to install, put everything into `tp` namespace, add optional `boost::future` support

    • Moved everything to "./include/thread_pool"
    • Main include file is now: <thread_pool/thread_pool.hpp>
    • Everything is now in the tp namespace
    • Added optional boost::future compatibility with THREAD_POOL_USE_BOOST macro
    • Added make install command to CMakeLists.txt
    • Adapted tests and benchmarks to new changes
    • Use header guards in place of #pragma once
    • Prepend THREAD_POOL_ to all header guards
    • Automatic clang-format on every file
    opened by vittorioromeo 17
  • [HOLD MERGE PLEASE] ThreadPool compile time template parameter; CMake + CI cross platform work

    @inkooboo , @SuperV1234 : Please see below and share any comments. Once this looks good, I can run some cross platform tests, we can merge, and I will look at adding basic CI tests (this can be run by hunter for each release by default if the package is accompanied with an example build test).

    • add TSettings template parameter to ThreadPool to specify task_size at compile time, and for forward API stability, updated internal code...
    • include updates from discussion in
    • add explicit project(thread-pool-cpp VERSION 1.1.0) : 1.0.0 used in previous hunter release
    • add cmake platform checks for thread_local and various fallbacks (FATAL_ERROR if none found)
    • use internal ATTRIBUTE_TLS macro for portability (see above): this allows thread-pool-cpp to be functional on many incomplete c++11 platforms w/ lambda initialization extensions
    • add THREAD_POOL_CPP_BUILD_{TEST,BENCHMARKS} to manage tests
    • minor compiler error/warning fixes
    • bump cmake_minimum_required(VERSION 3.3) for more modern cmake policy defaults
    • add cmake package config installation for clean find_package use (see example below)
    • remove boost dependency from example, per discussion with @inkooboo
    • add ThreadPool process() api back for packaged_task interface
    • manage tests with CTest (can deploy-on-success, etc)
    Test project /Users/dhirvonen/devel/elucideye/drishti/src/3rdparty/thread-pool-cpp/_builds/xcode
        Start 1: FixedFunctionTest
    1/2 Test #1: FixedFunctionTest ................   Passed    0.00 sec
        Start 2: ThreadPoolTest
    2/2 Test #2: ThreadPoolTest ...................   Passed    0.01 sec

    The current installation path and package config files look like this:

    tree _install/xcode/  <== `i.e., /usr/local, etc`
    ├── include
    │   └── thread_pool
    │       ├── fixed_function.hpp
    │       ├── mpsc_bounded_queue.hpp
    │       ├── thread_pool.hpp
    │       └── worker.hpp
    └── lib
        └── cmake
            └── thread-pool-cpp
                ├── thread-pool-cppConfig.cmake
                ├── thread-pool-cppConfigVersion.cmake
                └── thread-pool-cppTargets.cmake

    With this configuration, the project can be installed and included using:

    find_package(thread-pool-cpp CONFIG REQUIRED)
    target_link_libraries(my_exe thread-pool-cpp::thread-pool-cpp)

    It will also support automatic package management through hunter if used in a project w/ a top level HunterGate() command with one additional line. The previous include path addition allows both direct/submodule use and post install (package) use with similar #include syntax:

    if(...) # direct/submodule use; the condition is elided in the original
      add_library(thread-pool-cpp::thread-pool-cpp ALIAS thread-pool-cpp)
    else() # use hunter
      find_package(thread-pool-cpp CONFIG REQUIRED)
    endif()
    target_link_libraries(my_exe thread-pool-cpp::thread-pool-cpp)
    opened by headupinclouds 7
  • Feature request: post tasks to front of queue (LIFO)?

    I need to prefer new tasks over old tasks in the queue. Is it possible to change the mpmc queue to post to the front of the queue, or to add a new method post_front()?
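
A minimal sketch of what a post_front() could look like, assuming a mutex-guarded std::deque instead of the library's bounded MPMC queue (a ring-buffer queue of that kind generally has no cheap front insertion). TaskDeque, post_back(), post_front() and try_pop() are hypothetical names for illustration and are not part of thread-pool-cpp:

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical task queue, NOT part of thread-pool-cpp.
// Trades the lock-free design for a simple lock so both ends are usable.
class TaskDeque {
public:
    using Task = std::function<void()>;

    // Normal FIFO posting: the task runs after everything already queued.
    void post_back(Task task) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_tasks.push_back(std::move(task));
    }

    // LIFO-style posting: the task becomes the next one to be popped.
    void post_front(Task task) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_tasks.push_front(std::move(task));
    }

    // Takes the task at the front; returns false when the queue is empty.
    bool try_pop(Task& out) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_tasks.empty()) {
            return false;
        }
        out = std::move(m_tasks.front());
        m_tasks.pop_front();
        return true;
    }

private:
    std::mutex m_mutex;
    std::deque<Task> m_tasks;
};
```

Posting tasks 1 and 2 with post_back() and task 3 with post_front() makes a consumer see 3, 1, 2. The lock is the price of the extra flexibility, so this would likely be slower than the lock-free queue under contention.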

    opened by emmenlau 6
  • Question - Sample & clear example

    Dear Andrey

    Do you have a manual or sample showing how to use your thread pool implementation? I cannot find anything here about how to use it, its architecture, the options the pool offers, or the exact behavior of each.


    opened by mohsenomidi 5
  • Idling Performance

    I wanted to share some results relating to the performance of this thread pool when all member threads are idle. These results are caused by the waiting loop in threads that are idle.

    [Figures: idle CPU usage for the GCC and MSVC builds (thread-pool-cpp_idle_gcc, thread-pool-cpp_idle_msvc)]

    Though it's probably not a great idea to run 500 threads on a machine with 8 logical cores, a large number of threads can be needed for blocking I/O operations, and the idle CPU usage overhead can burden client applications and drain their battery life.
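
One common way to avoid idle CPU usage is to let idle workers block on a condition variable instead of polling. The following is a minimal sketch of that idea, not the library's worker: BlockingWorker and its members are hypothetical names, and a mutex-guarded std::deque stands in for the lock-free queue.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical single-thread worker that sleeps while it has nothing to do.
class BlockingWorker {
public:
    BlockingWorker() : m_thread([this] { run(); }) {}

    // Requests a stop, then waits for the thread; queued tasks still run first.
    ~BlockingWorker() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stopping = true;
        }
        m_wakeup.notify_one();
        m_thread.join();
    }

    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_tasks.push_back(std::move(task));
        }
        m_wakeup.notify_one();  // wake the worker if it is asleep
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(m_mutex);
        for (;;) {
            // Blocks in the kernel without spinning until there is work or a stop request.
            m_wakeup.wait(lock, [this] { return m_stopping || !m_tasks.empty(); });
            if (m_tasks.empty()) {
                return;  // stop requested and the queue is drained
            }
            std::function<void()> task = std::move(m_tasks.front());
            m_tasks.pop_front();
            lock.unlock();  // run the task without holding the lock
            task();
            lock.lock();
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_wakeup;
    std::deque<std::function<void()>> m_tasks;
    bool m_stopping = false;
    std::thread m_thread;  // declared last so the members above exist before the thread starts
};
```

The trade-off is latency: waking a blocked thread costs more than a spinning thread noticing new work, so a hybrid (spin briefly, then block) is a frequent compromise.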

    opened by SeverTopan 5
  • change the implementation of thread_pool_options to a .cpp file

    I need to use ThreadPoolOptions to set the thread count and the queue size. Since ThreadPool's constructor only accepts a ThreadPoolOptions, I get a "multiple definition of" error when linking my program. Could you move the implementation of ThreadPoolOptions to a separate .cpp file, or add a constructor to ThreadPool that takes the queue size and thread count?
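
For a header-only library, an alternative to a separate .cpp file is to mark the out-of-class member definitions inline. The sketch below assumes the linker error comes from non-inline definitions in a header included by several translation units; PoolOptions and its members are hypothetical stand-ins, not the library's actual ThreadPoolOptions.

```cpp
#include <cstddef>

// Hypothetical options class as it might appear in a header.
class PoolOptions {
public:
    PoolOptions();
    void setThreadCount(std::size_t count);
    std::size_t threadCount() const;

private:
    std::size_t m_thread_count;
};

// `inline` allows the same definition to appear in every translation unit
// that includes the header without violating the one-definition rule.
inline PoolOptions::PoolOptions() : m_thread_count(1) {}

inline void PoolOptions::setThreadCount(std::size_t count) {
    m_thread_count = count;
}

inline std::size_t PoolOptions::threadCount() const {
    return m_thread_count;
}
```

Member functions defined inside the class body are implicitly inline, so moving the definitions into the class would have the same effect.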

    opened by MatthewButterfly 4
  • c++11 support

    I am interested in using this with c++11 projects. It seems the only feature requiring c++14 is the capture-by-move initialization for the packaged_task lambda assignment here:

    I'll need to familiarize myself with this. It seems a workaround might be possible in c++11 based on this document ("evil_wrapper"):

    I'll take a look. Any suggestions are welcome.
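
One possible C++11 workaround, different from the "evil_wrapper" approach mentioned above, is to hold the move-only std::packaged_task in a std::shared_ptr and capture the pointer by copy. This is a sketch under that assumption; make_copyable_job is a hypothetical helper name, not something the library provides.

```cpp
#include <functional>
#include <future>
#include <memory>
#include <utility>

// Wraps a move-only packaged_task in a copyable closure using only C++11.
// The closure shares ownership of the task instead of moving it in.
template <typename R>
std::function<void()> make_copyable_job(std::packaged_task<R()> task) {
    auto shared = std::make_shared<std::packaged_task<R()>>(std::move(task));
    return [shared] { (*shared)(); };
}
```

A caller would obtain the future first, then hand the wrapped job to whatever executes it:

    std::packaged_task<int()> t([] { return 42; });
    std::future<int> r = t.get_future();
    std::function<void()> job = make_copyable_job(std::move(t));

The cost is one heap allocation per task, which a C++14 init-capture avoids.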

    opened by headupinclouds 3
  • license?


    This looks really nice. I'm interested in adding this as a package to the hunter project (ruslo/hunter#251), but I don't see a license anywhere. Are you amenable to a simplified BSD type license (or similar)? Thanks.

    opened by headupinclouds 3
  • inline thread_id() does not play with multiple compilation units (GCC 4.9)

    The decl of thread_id as inline static causes issues with multiple compilation units under GCC 4.9, for example:

    In the test directory create a new file getWorkerIdForCurrentThread.cpp:

    #include <worker.hpp>
    size_t getWorkerIdForCurrentThread() { return *thread_id(); }
    size_t getWorkerIdForCurrentThread2() { return Worker::getWorkerIdForCurrentThread(); }

    Declare the new functions at the top of thread_pool.t.cpp:

    size_t getWorkerIdForCurrentThread();
    size_t getWorkerIdForCurrentThread2();

    Update doTest("post job"), add the following line in the lambda after std::packaged_task<int()> ....:

    printf("\nThread id(1): %lu, id(2): %lu, id(3): %lu, id(4): %lu\n", 
      Worker::getWorkerIdForCurrentThread(), *thread_id(), 
      getWorkerIdForCurrentThread(), getWorkerIdForCurrentThread2());

    Build and run thread_pool.t.cpp as usual, the output will show the new line:

    Thread id(1): 7, id(2): 7, id(3): 4294967295, id(4): 7

    Compile with -O3 and we get even worse output (due to aggressive inlining):

    Thread id(1): 7, id(2): 7, id(3): 4294967295, id(4): 4294967295

    In essence getWorkerIdForCurrentThread.cpp contains another instance of tss_id per thread since it was inlined by GCC and is never initialized to a valid value by the thread pool. Now when we call these functions we always see -1ul regardless of the thread we call from.

    This limits the usefulness of the thread pool - at best you have to be extremely careful what you call and where when you have multiple CPPs.
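
A sketch of one way to get a single per-thread id shared by all translation units, assuming the compiler supports the C++11 thread_local keyword: drop `static` so the inline function has external linkage. The names thread_id and tss_id follow the description above; the body is an assumption, not the library's code.

```cpp
#include <cstddef>
#include <thread>

// `inline` without `static`: the function has external linkage, so the
// function-local thread_local variable is one object per thread for the
// whole program rather than one per translation unit.
inline std::size_t* thread_id() {
    static thread_local std::size_t tss_id = static_cast<std::size_t>(-1);
    return &tss_id;
}

// Demonstration helper: sets the id on a fresh thread and reads it back there.
inline std::size_t id_seen_on_new_thread(std::size_t id) {
    std::size_t seen = 0;
    std::thread worker([&seen, id] {
        *thread_id() = id;    // what a pool worker would do at start-up
        seen = *thread_id();  // any code on this thread now sees the same value
    });
    worker.join();
    return seen;
}
```

Threads that never set the value keep seeing the -1 sentinel, which is how non-pool threads can be recognized.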

    opened by craigminihan 3
  • Explain getWorker

    Hi, please explain why getWorker is written this way. Pushing a batch of tasks from a single thread seems to be a fairly frequent use case. I understand the idea of locality, but at first glance it doesn't seem to work: if we push a task from a thread that belongs to the thread pool, that does not necessarily mean the thread's id equals its index in the worker array.

    Maybe something like this would work better?

    auto idx = current thread id;
    if (last_idx == idx) {
         return last_worker;
    }
    for (worker : workers) {
         if (idx == worker.idx) {
             last_idx = idx;
             last_worker = worker;
             return worker;
         }
    }
    // Logic with m_next_worker

    Or something like this

    thread_local worker_index{-1}; // set in worker::start
    if (worker_index != -1) {
         return workers[worker_index];
    }
    // Logic with m_next_worker
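
The second idea can be sketched as compilable C++. Everything here is a hypothetical stand-in (Worker, Pool, t_worker_index), the fallback is assumed to be a plain round-robin over m_next_worker, and the pool is assumed to have at least one worker:

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the pool's worker type.
struct Worker {
    std::size_t index;
};

// Index of the worker that owns the current thread.
// The sentinel means "not a pool thread"; a real worker would set this when it starts.
thread_local std::size_t t_worker_index = static_cast<std::size_t>(-1);

class Pool {
public:
    // Assumes worker_count > 0.
    explicit Pool(std::size_t worker_count) : m_next_worker(0) {
        for (std::size_t i = 0; i < worker_count; ++i) {
            m_workers.push_back(Worker{i});
        }
    }

    Worker& getWorker() {
        // Called from a pool thread: stay on the caller's own worker for locality.
        if (t_worker_index < m_workers.size()) {
            return m_workers[t_worker_index];
        }
        // Called from outside the pool: spread tasks round-robin.
        std::size_t next = m_next_worker.fetch_add(1, std::memory_order_relaxed);
        return m_workers[next % m_workers.size()];
    }

private:
    std::vector<Worker> m_workers;
    std::atomic<std::size_t> m_next_worker;
};
```

A single global thread_local index is ambiguous if a program runs several pools, since a thread of one pool would be treated as a worker of another; storing a pool pointer next to the index would be one way around that.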
    opened by MBkkt 2
  • Task may drop when thread pool destruct

    When the thread pool is destroyed, it calls each worker's stop method, which sets the worker's running flag to false. The worker's thread procedure exits as soon as the flag is false, regardless of whether tasks remain in its queue.

    Could the workers wait for all tasks to finish when the thread pool is destroyed?

    opened by machunleilei 2
  • ThreadPool must execute all tasks posted to it

    Currently, ThreadPool's destructor does not wait for all posted tasks to complete. Worker relies on m_running_flag to stay in its loop, and this flag is set to false as soon as the ThreadPool destructor is invoked, so the threads exit without processing the tasks remaining in the queue.

    One solution is to remove m_running_flag from the worker and instead add a poison task to the queue. Each worker that reads the poison task exits, but enqueues it again before exiting so that the other workers can exit in the same manner.

    ThreadPool could also provide a wait() function that adds the poison task, and the destructor should call wait().

    opened by snawaz 19
  • Wrong sequence Destructor / Constructor and task is not processed

    Hi there,

    I'm trying to use this thread-pool "LIB" and, before using it, I wanted to run some stress tests on it to identify its limits, robustness and reliability.

    I wrote a really short main function with a small number of threads (1) and a small queue size (2). The behavior I see is really strange: some destructors appear to be called before the constructor, and the same input appears to be handled by several tasks.

    Here is the small main.cpp :

    #include <iostream>
    #include <regex>
    #include <vector>
    #include <signal.h>
    #include <unistd.h>
    // #include "logger.h"
    #include <future>
    #include <utility>
    #include "thread_pool.hpp"
    static volatile bool         g_theEnd = false ;
    #define DEBUG(...)   { printf(__VA_ARGS__) ; printf("\n") ; }
    #define INFO(...)    { printf(__VA_ARGS__) ; printf("\n") ; }
    #define WARNING(...) { printf(__VA_ARGS__) ; printf("\n") ; }
    #define ERROR(...)   { printf(__VA_ARGS__) ; printf("\n") ; }
    #define LOG_INIT(a)
    #define SET_LOG_LEVEL(a)
    #define LOG_END()
    void cleanup(void)
    {
        // Ask the threads to give up
        g_theEnd = true ;
        INFO("Stop test-thread-pool") ;
        LOG_END() ;
    }
    class NgapMessageDecode
    {
    public:
        NgapMessageDecode(int fd) : m_fd(fd)
        {
            DEBUG("Constructor %.8ld, fd=%.2d", this, fd) ;
        }
        virtual ~NgapMessageDecode()
        {
            DEBUG("Destructor  %.8ld, fd=%.2d", this, m_fd) ;
        }
        void operator()()
        {
            DEBUG("Decode %.2d, this=%ld, thread=%.8ld", m_fd, this, pthread_self()) ;
            // sleep(1) ;
        }
        // std::promise<void> *    m_waiter ;
        int                     m_fd ;
    } ;
    int main(int argc, char * argv[])
    {
        LOG_INIT("test-thread-pool") ;
        INFO("%s", "") ;
        INFO("Start test-thread-pool") ;
        tp::ThreadPoolOptions   threadPoolOption ;
        threadPoolOption.setThreadCount(1) ;
        threadPoolOption.setQueueSize(2) ;
        tp::ThreadPool  threadPool(threadPoolOption);
        for(int i=0; i<100; i++)
        {
            try
            {
                // (the statements that create and post the task are missing from the source)
            }
            catch(std::runtime_error & e)
            {
                std::cout << e.what() << std::endl ;
            }
        }
        sleep(1) ;
        cleanup() ;
        return 0 ;
    }

    The output is :

    Start test-thread-pool
    Constructor 140734462742560, fd=00
    Destructor  140734462742176, fd=00
    Destructor  140734462742560, fd=00
    Constructor 140734462742560, fd=01
    Destructor  140734462742176, fd=01
    Destructor  140734462742560, fd=01
    Constructor 140734462742560, fd=02
    Destructor  140734462742560, fd=02
    thread pool queue is full
    Constructor 140734462742560, fd=03
    Destructor  140734462742560, fd=03
    thread pool queue is full
    Constructor 140734462742560, fd=04
    Destructor  140734462742560, fd=04
    thread pool queue is full

    How can the destructor on line 2 of the output (Destructor 140734462742176, fd=00) be called before the constructor?

    Moreover, this same destructor refers to the object with fd==00, whereas the corresponding constructor below refers to fd==01.

    So how do I get the sequence shown in the first three lines, Constructor / Destructor / Destructor, all with the same fd==00?
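
One likely reading of that output is that the functor is copied when it is posted: the implicit copy constructor logs nothing, so only the copy's destructor shows up, at a different address but with the same fd. The sketch below reproduces the pattern without the library. Traced, post_and_drop and trace_one_post are hypothetical names, and post_and_drop only imitates a post() that takes its handler by value.

```cpp
#include <string>

// Records construction ('C'), copy construction ('K') and destruction ('D').
struct Traced {
    explicit Traced(std::string& log) : m_log(&log) { *m_log += 'C'; }
    Traced(const Traced& other) : m_log(other.m_log) { *m_log += 'K'; }
    ~Traced() { *m_log += 'D'; }
    void operator()() const {}

    std::string* m_log;
};

// Hypothetical stand-in for a pool's post(): receives a copy of the handler
// and drops it, as a pool might when the task is done or the queue is full.
template <typename Handler>
void post_and_drop(Handler handler) {
    (void)handler;
}

// Returns the event log for constructing one functor and posting it once.
std::string trace_one_post() {
    std::string log;
    {
        Traced job(log);
        post_and_drop(job);  // copies job; the copy is destroyed before job is
    }
    return log;
}
```

trace_one_post() yields "CKDD": one logged construction, one copy, two destructions. If the copy constructor did not log, the visible sequence would be Constructor / Destructor / Destructor, matching the first three lines of the output.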

    So, I probably missed something or at least misunderstood how I'm supposed to use this "LIB" but I can't find out what's the good practice :/

    Branch used: master. I can't use the round-robin-stealing branch because it relies on std::exchange, which is defined in C++14, and I must stay on C++11.

    Commands to build the main application listed above:

    g++ -c   -DLOG -I<path_to_thread-pool-cpp>/include/thread_pool -Wall -Wextra -g -std=c++11 -o main.o main.cpp
    g++ -g   -o test-thread-pool main.o -lpthread

    Any kind of help would be appreciated.

    opened by CyrilleBenard 6
  • Why does the function with arguments have errors?

    error C2660 “std::packaged_task<int (int)>::operator ()”: The function does not take zero arguments

    #include "thread_pool.hpp"
    #include <thread>
    #include <future>
    #include <functional>
    #include <memory>
    #include <iostream>
    using namespace std;
    int f(int j)
    {
    	return j;
    }
    int main(int argc, char **argv)
    {
    	tp::ThreadPool pool;
    	std::packaged_task<int(int)> t(f);
    	std::future<int> r = t.get_future();
    	//while (1);
    	return 0;
    }
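
The error message suggests the task is invoked with no arguments, so a std::packaged_task<int(int)> cannot be called that way. A minimal sketch of one fix, assuming the pool calls handlers as handler(): bind the argument up front so the task has the signature int(). make_bound_task is a hypothetical helper name.

```cpp
#include <functional>
#include <future>

int f(int j) {
    return j;
}

// Binds the argument in advance so the resulting task takes no arguments.
std::packaged_task<int()> make_bound_task(int arg) {
    return std::packaged_task<int()>(std::bind(f, arg));
}
```

The future is obtained from the bound task exactly as before, with `std::future<int> r = t.get_future();`, and a lambda such as `[arg] { return f(arg); }` works equally well in place of std::bind.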
    opened by mirro187 16
  • CMake option to disable tests compilation

    First of all, thank you for making this library installable. I use ExternalProject_Add() for all libraries; it compiles the library in an isolated environment and installs it in the specified directory (if you are interested in more details I can explain).

    What I want is a way to disable tests when building the library.

    Wrap this block in an if

    like this

    then I will be able to control that block with -DENABLE_TESTS=ON/OFF

    opened by 01e9 0
Andrey Kubarkov