Understanding and implementing a thread pool

Table of contents

1. The concept of a thread pool

2. Advantages of a thread pool

3. Application scenarios of a thread pool

4. Implementing the thread pool

5. Thread pool demonstration


1. The concept of a thread pool

A thread pool is a pattern for using threads.

Too many threads add scheduling overhead, which hurts cache locality and overall performance.

A thread pool maintains a fixed set of threads that wait for a manager to hand them tasks, which they can then execute concurrently.

2. Advantages of a thread pool

  • A thread pool avoids the cost of creating and destroying a thread for every short-lived task
  • A thread pool keeps the processor cores fully used while preventing excessive scheduling

Note: The number of threads in the pool should depend on the available resources, such as the number of processors and processor cores, memory, and network sockets
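As a rough illustration of sizing the pool from the hardware, the sketch below derives a worker count from the number of hardware threads. The helper name SuggestedThreadCount and its fallback value are assumptions made for this example; a real server would also weigh memory and I/O behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical helper: choose a worker count from the hardware concurrency hint.
// std::thread::hardware_concurrency() may return 0 when the value is unknown,
// so a caller-supplied fallback is used in that case.
inline std::size_t SuggestedThreadCount(std::size_t fallback = 4)
{
    assert(fallback > 0);
    const unsigned int hw = std::thread::hardware_concurrency();
    return hw == 0 ? fallback : static_cast<std::size_t>(hw);
}
```

The result could then be passed to the pool's factory function in place of a hard-coded thread count.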

3. Application scenarios of a thread pool

  1. Workloads that need many threads but whose tasks each finish quickly. A web server handling page requests is a good fit: each task is small, but the number of tasks is huge. For long-running tasks, such as a Telnet session, the benefit is much smaller, because the session lasts far longer than the time it takes to create a thread
  2. Performance-critical applications, for example a server that must respond quickly to client requests
  3. Applications that receive large bursts of requests but must not let the server create a correspondingly large number of threads. Without a thread pool, a burst of client requests creates a burst of threads. Although the maximum thread count of most operating systems is rarely the limit in theory, creating many threads in a short time can exhaust memory and cause errors

4. Implementing the thread pool

 The thread pool is essentially a producer-consumer model built from a task queue and several threads

  • The threads in the pool fetch tasks from the task queue and process them
  • The pool provides a PushTask() interface that an external thread (here, the main thread) uses to push tasks into the task queue

Log module

A complete log entry carries at least a log level and a timestamp. Ideally it also supports user-defined content (the message itself, file name, line number, and so on)

#pragma once
#include <iostream>
#include <string>
#include <cstdio>
#include <cstdarg>
#include <ctime>

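// Log levels
#define DEBUG   0
#define NORMAL  1
#define WARNING 2
#define ERROR   3
#define FATAL   4
// static: keeps the table local to each source file that includes this header
static const char *gLevelMap[] = {
    "DEBUG",
    "NORMAL",
    "WARNING",
    "ERROR",
    "FATAL"
};

// inline: avoids multiple-definition errors when several source files include this header
inline void LogMessage(int level, const char *format, ...)
{
#ifndef DEBUG_SHOW
    if(level== DEBUG) return;
#endif
    if(level < DEBUG || level > FATAL) return; // ignore unknown levels instead of indexing out of range
    // Standard part: level and timestamp
    char stdBuffer[1024];
    const time_t timestamp = time(nullptr);
    struct tm local_time;
    localtime_r(&timestamp, &local_time); // thread-safe variant of localtime()
    snprintf(stdBuffer, sizeof stdBuffer, "[%s] [%d-%d-%d-%d-%d-%d] ", gLevelMap[level], 
        local_time.tm_year + 1900, local_time.tm_mon + 1, local_time.tm_mday, local_time.tm_hour, local_time.tm_min, local_time.tm_sec);

    // User-defined part: the caller's formatted message
    char logBuffer[1024]; 
    va_list args;
    va_start(args, format);
    vsnprintf(logBuffer, sizeof logBuffer, format, args);
    va_end(args);
    printf("%s%s\n", stdBuffer, logBuffer);
}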
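The file name and line number mentioned above can be captured at the call site with a macro. The following is only a sketch: FormatWithLocation and LOG_AT are hypothetical names that are not part of Log.hpp, and the helper returns the formatted string instead of printing it so that it can be checked in isolation.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of Log.hpp): builds "[file:line] message".
inline std::string FormatWithLocation(const char *file, int line, const char *format, ...)
{
    char message[1024];
    va_list args;
    va_start(args, format);
    std::vsnprintf(message, sizeof message, format, args);
    va_end(args);

    char out[1280];
    std::snprintf(out, sizeof out, "[%s:%d] %s", file, line, message);
    return out;
}

// Hypothetical macro: __FILE__ and __LINE__ expand where the macro is used,
// so every log entry records its own call site.
#define LOG_AT(...) FormatWithLocation(__FILE__, __LINE__, __VA_ARGS__)
```

A call such as LOG_AT("x=%d", 3) yields the message prefixed with the source file and line of that call; the same prefix could be added to the standard part of LogMessage().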

Task module

The thread pool stores tasks one at a time; the class below wraps a single task

Whatever the task type is, the task class must be a functor, so that processing a task only requires calling its operator()

#pragma once
#include <iostream>
#include <string>
#include <functional>
#include "Log.hpp"

typedef std::function<int(int, int)> fun_t;
class Task
{
public:
    Task():_x(0), _y(0) {} // default constructor is required because Routine() declares "T task;"
    Task(int x, int y, fun_t func):_x(x), _y(y), _func(func) {}
    void operator ()(const std::string &name) {
        LogMessage(NORMAL, "%s finished: %d+%d=%d", name.c_str(), _x, _y, _func(_x, _y));
    }
public:
    int _x;
    int _y;
    fun_t _func;
};

Thread module

The raw system-call interface is cumbersome, so the thread module wraps it. This simplifies the calls and makes the code more readable and reusable

#pragma once
#include <iostream>
#include <string>
#include <vector>
#include <cstdio>
#include <pthread.h>

// Data handed to the thread routine: the user argument plus the thread's name
class ThreadDate
{
public:
    void* _args;
    std::string _name;
};

typedef void*(*func_t)(void*);
class Thread
{
public:
    Thread(size_t num,func_t callback,void* args): _function(callback)
    {
        char nameBuffer[64];
        snprintf(nameBuffer, sizeof nameBuffer, "Thread-%zu", num); // %zu matches size_t
        _threadDate._name = nameBuffer;
        _threadDate._args = args;
    }
    ~Thread() {}
    void Start() { pthread_create(&_tid, nullptr, _function, (void*)&_threadDate); }
    void Join() { pthread_join(_tid, nullptr); }
    std::string Name() { return _threadDate._name; } // the name is stored in _threadDate
private:
    func_t _function;
    ThreadDate _threadDate;
    pthread_t _tid;
};

Why does the thread pool need a mutex and a condition variable?

  • The task queue is a critical resource that several execution flows access at the same time, so a mutex is needed to protect it
  • A pool thread can only take a task if the queue contains one, so it must check whether the queue is empty first. If the queue is empty, the thread should wait and be woken only once a task arrives, which requires a condition variable
  • When an external thread pushes a task, some pool threads may be waiting, so after adding the task it must wake a thread that is waiting on the condition variable

Notes:

  • A wakeup may be spurious, or a broadcast may wake every waiting thread even though only a few of them can actually get a task. A woken thread must therefore check the condition again, so the test for an empty task queue should use while instead of if
  • pthread_cond_broadcast() wakes all threads waiting on the condition variable. If only one task was pushed, all of them race for the task queue but only one thread gets the task. Waking many threads at once can destabilize the system, which is known as the thundering herd effect. It is therefore better to wake a single waiting thread with pthread_cond_signal()
  • Once a thread has taken a task from the queue, the task belongs to that thread alone. The task should be processed after the lock is released, not before: processing may take a while, so it does not belong in the critical section
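The three notes can be condensed into a small sketch. The IntQueue, Push, Pop, WorkerArgs and SumTwo names are hypothetical and exist only for this illustration; the real pool appears later in the article.

```cpp
#include <cassert>
#include <pthread.h>
#include <queue>

// Hypothetical minimal queue used only for this illustration.
struct IntQueue
{
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    std::queue<int> items;
};

// Producer side: push under the lock, then wake a single waiter with signal.
inline void Push(IntQueue &q, int value)
{
    pthread_mutex_lock(&q.mutex);
    q.items.push(value);
    pthread_cond_signal(&q.cond);
    pthread_mutex_unlock(&q.mutex);
}

// Consumer side: the while loop re-checks the condition after every wakeup,
// so a spurious wakeup simply goes back to waiting.
inline int Pop(IntQueue &q)
{
    pthread_mutex_lock(&q.mutex);
    while (q.items.empty())
        pthread_cond_wait(&q.cond, &q.mutex);
    int value = q.items.front();
    q.items.pop();
    pthread_mutex_unlock(&q.mutex);
    return value; // the caller processes the value outside the critical section
}

struct WorkerArgs
{
    IntQueue *queue;
    int sum;
};

// Thread routine: takes two values and adds them after the lock is released.
inline void *SumTwo(void *arg)
{
    WorkerArgs *w = static_cast<WorkerArgs *>(arg);
    const int a = Pop(*w->queue);
    const int b = Pop(*w->queue);
    w->sum = a + b;
    return nullptr;
}
```

Starting SumTwo with pthread_create() and then pushing two values from another thread lets the worker block until each value arrives.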

Why must the thread routine in the thread pool be a static method?

pthread_create() needs a routine for the new thread to execute. That routine takes exactly one parameter of type void* and returns void*

If Routine is an ordinary member function of the class, its first parameter is the hidden this pointer. It looks as if it takes one parameter, but it actually takes two, so it cannot be passed to pthread_create() directly and the code does not compile.

A static member function belongs to the class rather than to an object, so it has no hidden this pointer. Declaring Routine static leaves it with a single void* parameter, as required

However, a static member function cannot access non-static members directly, so the this pointer of the current object is passed to Routine as its argument when the thread is created. Inside Routine, the non-static members are then reached through that pointer.
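A minimal sketch of this pattern, using a hypothetical Counter class rather than the article's ThreadPool:

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical class for illustration only.
class Counter
{
public:
    // A static member function has no hidden this pointer, so its type is
    // void *(*)(void *), which is what pthread_create() expects.
    static void *Routine(void *args)
    {
        Counter *self = static_cast<Counter *>(args); // recover the object
        self->_value += 1;                            // reach non-static members through it
        return nullptr;
    }

    // Starts one thread on Routine with this as its argument and waits for it.
    // Returns false if the thread could not be created.
    bool RunOnce()
    {
        pthread_t tid;
        if (pthread_create(&tid, nullptr, Routine, this) != 0)
            return false;
        pthread_join(tid, nullptr);
        return true;
    }

    int Value() const { return _value; }

private:
    int _value = 0;
};
```

Removing the static keyword from Routine makes the pthread_create() call fail to compile, which is the error described above.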

Lazy singleton pattern

There should be only one thread pool instance in the whole project, which the singleton pattern guarantees. The lazy variant is used here. Its core idea is deferred creation: the instance is built on first use, which can speed up server startup

// Illustrative code
template <typename T>
class Singleton 
{
    static T* inst;
public:
    static T* GetInstance() {
        if (inst == NULL) {
            inst = new T();
        }     
        return inst;
    }
};

The unique instance is created only when GetInstance() is first called. This version has a serious problem, though: it is not thread-safe. If several threads make the first call to GetInstance() at the same time, more than one T object may be created

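#include <atomic>
#include <mutex>

template <class T>
class Singleton 
{
    static std::atomic<T*> inst; // atomic, so the unlocked first check is not a data race
    static std::mutex lock;
public:
    static T* GetInstance() {
        T* p = inst.load(std::memory_order_acquire);
        if (p == nullptr) { // first check without the lock: lowers lock contention and improves performance
            std::lock_guard<std::mutex> guard(lock); // the mutex ensures new runs only once, even with many threads
            p = inst.load(std::memory_order_relaxed);
            if (p == nullptr) { // second check: another thread may have created the instance meanwhile
                p = new T();
                inst.store(p, std::memory_order_release);
            } 
        } 
        return p;
    }
};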

Why check for the null pointer twice? Why not simply write it like this?

template <class T>
class Singleton 
{
    static T* inst;
    static std::mutex lock;
public:
    static T* GetInstance() {
        lock.lock();
        if (inst == NULL) {
            inst = new T();
        } 
        lock.unlock();
        return inst;
    }
};

Because after the first thread has created the instance, the function is still called many times, no longer to create the instance but only to obtain its address. With the single-check version, every one of those calls takes and releases the lock, so threads contend for it and waste time. The outer check skips the lock entirely once the instance exists, which improves performance
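For comparison, here is a sketch of a different technique from the one this article uses: since C++11, the initialization of a function-local static is guaranteed to happen exactly once, even when several threads make the first call concurrently, so no explicit lock or null check is needed. LocalStaticSingleton and Config are hypothetical names.

```cpp
#include <cassert>

// Alternative technique, not the article's approach: a function-local static.
template <class T>
class LocalStaticSingleton
{
public:
    static T &GetInstance()
    {
        static T inst; // constructed on the first call only
        return inst;
    }
};

// Hypothetical type used to show that every call returns the same object.
struct Config
{
    int value = 0;
};
```

The trade-off is that the instance is created with T's default constructor, so this form does not directly fit a pool whose constructor takes the thread count as an argument.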

Lock guard module

RAII-style locking keeps a mutex from staying locked when an exception is thrown inside the critical section: the guard releases the lock automatically when it goes out of scope

#pragma once
#include <iostream>
#include <pthread.h>

class Mutex
{
public:
    Mutex(pthread_mutex_t *mtx):_pmtx(mtx) {}
    void Lock() { pthread_mutex_lock(_pmtx); }
    void UnLock() { pthread_mutex_unlock(_pmtx); }
    ~Mutex() {}
private:
    pthread_mutex_t *_pmtx;
};

// RAII-style locking: lock in the constructor, unlock in the destructor
class LockGuard
{
public:
    LockGuard(pthread_mutex_t *mtx):_mutex(mtx) { _mutex.Lock(); }
    ~LockGuard() { _mutex.UnLock(); }
private:
    Mutex _mutex;
};
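The benefit shows up in functions with several exit paths. In the sketch below, ScopedLock is a compact stand-in for the LockGuard above so that the example compiles on its own, and AddIfPositive is a hypothetical function.

```cpp
#include <cassert>
#include <pthread.h>

// Compact stand-in for LockGuard so this example is self-contained.
class ScopedLock
{
public:
    explicit ScopedLock(pthread_mutex_t *mtx) : _mtx(mtx) { pthread_mutex_lock(_mtx); }
    ~ScopedLock() { pthread_mutex_unlock(_mtx); }
    ScopedLock(const ScopedLock &) = delete;
    ScopedLock &operator=(const ScopedLock &) = delete;

private:
    pthread_mutex_t *_mtx;
};

// Hypothetical function with two exit paths; neither needs an explicit unlock.
inline bool AddIfPositive(pthread_mutex_t *mtx, int &total, int delta)
{
    ScopedLock guard(mtx);
    if (delta <= 0)
        return false; // the destructor unlocks on this early return
    total += delta;
    return true;      // and on the normal return
}
```

After either return the mutex is free again, so a later pthread_mutex_trylock() on it succeeds.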

Thread pool implementation

#pragma once
#include <iostream>
#include <vector>
#include <string>
#include <queue>
#include <unistd.h>
#include "Thread.hpp"
#include "LockGuard.hpp"
#include "Log.hpp"

// Lazy singleton
const int g_threadNum = 3;
template<class T>
class ThreadPool
{
public: // helpers used by the static Routine() function
    pthread_mutex_t *GetMutex() { return &_mutex; }
    bool isEmpty() { return _taskQueue.empty(); }
    void WaitCond() { pthread_cond_wait(&_cond, &_mutex); }
    T GetTask() {
        T task = _taskQueue.front();
        _taskQueue.pop();
        return task;
    }

public:
    // Several threads may request the singleton at the same time
    static ThreadPool<T>* GetThreadPool(int num = g_threadNum)
    {
        if(nullptr == pool_ptr) {
            {
                LockGuard lockguard(&_init_mutex);
                if(nullptr == pool_ptr) {
                    pool_ptr = new ThreadPool<T>(num);
                }
            }
        }
        return pool_ptr;
    }

    static void* Routine(void* args) {
        ThreadDate* thread_date = (ThreadDate*)args;
        ThreadPool<T>* thread_pool = (ThreadPool<T>*)thread_date->_args;
        while(true) {
            T task;
            {
                LockGuard lockguard(thread_pool->GetMutex());
                while(thread_pool->isEmpty()) thread_pool->WaitCond();
                task = thread_pool->GetTask();
            }
            task(thread_date->_name); // functor call, made after the lock has been released
        }
    }
    void PushTask(const T& task)
    {
        LockGuard lockguard(&_mutex);
        _taskQueue.push(task);
        pthread_cond_signal(&_cond);
    }
    void Run()
    {
        for(auto& iter : _threads) {
            iter->Start(); 
            LogMessage(DEBUG, "%s %s", iter->Name().c_str(), "started successfully");
        }
    }
    ~ThreadPool()
    {
        for (auto &iter : _threads) {
            iter->Join();
            delete iter;
        }
        pthread_mutex_destroy(&_mutex);
        pthread_cond_destroy(&_cond);
    }

private:
    ThreadPool(int threadNum):_num(threadNum) {
        pthread_mutex_init(&_mutex, nullptr);
        pthread_cond_init(&_cond, nullptr);
        for (size_t i = 1; i <= _num; i++) { // size_t matches the type of _num
            _threads.push_back(new Thread(i, Routine, this));
        }
    }
    ThreadPool(const ThreadPool<T>& others) = delete;
    ThreadPool<T>& operator= (const ThreadPool<T>& others) = delete;

private:
    std::vector<Thread*> _threads;
    size_t _num;
    std::queue<T> _taskQueue;
private:
    pthread_mutex_t _mutex;
    pthread_cond_t _cond;
private:
    static ThreadPool<T>* pool_ptr; // the single, lazily created instance
    static pthread_mutex_t _init_mutex;
};

template<typename T>
ThreadPool<T>* ThreadPool<T>::pool_ptr = nullptr;
template<typename T>
pthread_mutex_t ThreadPool<T>::_init_mutex = PTHREAD_MUTEX_INITIALIZER;

5. Thread pool demonstration

Main thread logic

The main thread keeps pushing tasks to the task queue; the threads in the pool then take tasks from the queue and process them

#include <cstdlib>
#include "ThreadPool.hpp"
#include "Task.hpp"

int main()
{
    ThreadPool<Task>* threadPool = ThreadPool<Task>::GetThreadPool(5);
    threadPool->Run();
    while(true)
    {
        // Producing a task takes some time
        int x = rand()%100 + 1;
        usleep(7721);
        int y = rand()%30 + 1;
        Task task(x, y, [](int x, int y)->int{
            return x + y;
        });

        LogMessage(NORMAL, "Task created: %d+%d=?", x, y);
        // Push the task to the thread pool
        threadPool->PushTask(task);
        sleep(1);
    }
    return 0;
}

Six threads exist right after the program starts: the main thread and the five pool threads that process tasks

The five threads process tasks in a repeating order. The main thread pushes one task per second, so only one of the five threads gets it while the others stay in the condition variable's wait queue. When that thread finishes its task, the task queue is empty again, so it joins the tail of the wait queue. The next push wakes the thread at the head of the wait queue, which processes its task and then also goes to the tail. As a result, the five threads handle tasks in a fixed rotation (4-1-3-5-2)

Note: To let the thread pool handle other kinds of tasks in the future, you only need to provide a new task class with a suitable operator() method.
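As an illustration of such a task class, the sketch below defines a hypothetical UpperTask. It assumes the pool's requirements as they appear in Routine(): the type must be default-constructible and copyable, and it must be callable with the thread name. The operator() here returns its result so the class can be checked in isolation; Routine() ignores the return value, so a real task would typically log it instead.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical task type: upper-cases a string on behalf of a worker thread.
class UpperTask
{
public:
    UpperTask() {} // needed because Routine() declares "T task;"
    explicit UpperTask(const std::string &text) : _text(text) {}

    // Same call shape the pool uses: task(thread_name).
    std::string operator()(const std::string &name) const
    {
        std::string upper = _text;
        for (char &c : upper)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        return name + ": " + upper;
    }

private:
    std::string _text;
};
```

With this class, the pool would be obtained as ThreadPool<UpperTask>::GetThreadPool() and fed through PushTask() exactly as in the main function above.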

Origin blog.csdn.net/GG_Bruse/article/details/129616793