Boost library multithreaded programming (Thread): thread operations, mutexes, condition variables

1 Create a thread

Just as the std::fstream class represents a file, the boost::thread class represents an executable thread. The default constructor creates an instance that represents the current thread of execution. An overloaded constructor takes a function object that takes no arguments and returns no value. This constructor creates a new executable thread that calls that function object.

At first glance, the traditional C approach to creating threads may seem more useful than this design, because a C thread-creation function accepts a void* pointer through which data can be passed to the new thread. However, since the Boost thread library uses function objects instead of function pointers, the function objects themselves can carry the data the thread needs. This approach is more flexible and type-safe. Combined with a library like Boost.Bind, it lets you pass any amount of data to the newly created thread.
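The idea of binding data into a no-argument callable can be sketched as follows. This is only an illustration: it uses the C++11 standard library (std::thread, std::bind, std::ref) instead of boost::thread and Boost.Bind so that it builds without Boost, and add_into and sum_on_thread are hypothetical names that do not come from the article.

```cpp
#include <cassert>
#include <functional>
#include <thread>

// Hypothetical worker: adds two numbers and stores the sum in `out`.
void add_into(int a, int b, int& out)
{
    out = a + b;
}

// Binds all three arguments up front, so the thread receives a callable
// that takes no arguments and returns nothing.
int sum_on_thread(int a, int b)
{
    int result = 0;
    // std::ref passes `result` by reference; without it, bind would copy it.
    std::thread worker(std::bind(&add_into, a, b, std::ref(result)));
    worker.join();   // after join(), the write to `result` is visible here
    assert(result == a + b);
    return result;
}
```

Calling sum_on_thread(2, 3) runs the addition on a separate thread and returns the sum once that thread has been joined.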

Currently, thread objects created by the Boost thread library are not very powerful. In fact, they support only two operations. Thread objects can be compared using == and != to determine whether they represent the same thread, and you can call boost::thread::join to wait for a thread to finish executing. Some other threading libraries let you do more with a thread (such as setting its priority, or even canceling it). However, because it is not trivial to add these operations to a portable interface, how to add them to the Boost thread library is still under discussion.

Listing 1 shows one of the simplest uses of the boost::thread class. The newly created thread simply prints a greeting to std::cout, and the main function waits for the thread to finish before ending.

#include <boost/thread/thread.hpp>
#include <iostream>

void hello()
{
        std::cout <<
        "Hello world, I'm a thread!"
        << std::endl;
}

int main(int argc, char* argv[])
{
        boost::thread thrd(&hello);
        thrd.join();
        return 0;
}

2 Mutex

Anyone who has written a multithreaded program knows the importance of preventing different threads from accessing shared data at the same time. If one thread changes data in a shared area while another thread is reading it, the result is undefined. To prevent this, special primitive types and operations are used. The most basic of these is the mutex (short for mutual exclusion). A mutex allows only one thread to access the shared area at a time. When a thread wants to access the shared area, the first thing it does is lock the mutex. If another thread has already locked the mutex, the thread must wait until that thread unlocks it, which ensures that only one thread can access the shared area at a time.

There are many variants of the mutex concept. The Boost thread library supports two broad categories: the simple mutex and the recursive mutex. With a simple mutex, if the same thread locks the mutex twice, a deadlock occurs, which means that every thread waiting for the mutex waits forever. With a recursive mutex, a single thread can lock the mutex multiple times, and it must of course unlock it the same number of times before other threads can lock it.
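A minimal sketch of why a recursive mutex is sometimes needed, assuming the C++11 std::recursive_mutex as a stand-in for boost::recursive_mutex (tree_mutex and depth_sum are hypothetical names used only for illustration). A function that locks and then calls itself re-locks the same mutex on the same thread:

```cpp
#include <cassert>
#include <mutex>

// Hypothetical lock guarding some shared structure.
std::recursive_mutex tree_mutex;

// Every level of recursion locks the mutex again on the same thread.
// With a non-recursive mutex the second lock would be an error
// (typically a deadlock).
int depth_sum(int depth)
{
    assert(depth >= 0);
    std::lock_guard<std::recursive_mutex> guard(tree_mutex);
    if (depth == 0)
        return 0;
    // The guard above is still held while the nested call locks again.
    return depth + depth_sum(depth - 1);
}
```

Each guard unlocks once when its scope ends, so the mutex is released as many times as it was locked.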

Within these two categories of mutexes, there are several variants of how a thread acquires the lock. A thread can lock a mutex in three ways:

Wait until no other thread holds the mutex.
Return immediately if another thread holds the mutex.
Wait until no other thread holds the mutex, or until a timeout expires.
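The three ways of locking listed above can be sketched with the C++11 std::timed_mutex, used here as an assumed stand-in for the Boost mutex types so the example builds without Boost. LockResults, resource_mutex and probe_locked_mutex are hypothetical names, and the 20 ms timeout is an arbitrary choice.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

std::timed_mutex resource_mutex;

struct LockResults { bool try_ok; bool timed_ok; bool blocking_ok; };

LockResults probe_locked_mutex()
{
    LockResults r = {true, true, false};

    resource_mutex.lock();                        // the calling thread holds the mutex
    std::thread other([&r] {
        // Way 2: return immediately because another thread holds the mutex.
        r.try_ok = resource_mutex.try_lock();
        if (r.try_ok) resource_mutex.unlock();
        // Way 3: wait, but give up when the timeout expires.
        r.timed_ok = resource_mutex.try_lock_for(std::chrono::milliseconds(20));
        if (r.timed_ok) resource_mutex.unlock();
    });
    other.join();
    resource_mutex.unlock();

    std::thread third([&r] {
        // Way 1: wait until no other thread holds the mutex (it is free now).
        resource_mutex.lock();
        r.blocking_ok = true;
        resource_mutex.unlock();
    });
    third.join();

    assert(r.blocking_ok);
    return r;
}
```

While the calling thread holds the mutex, both non-blocking attempts in the second thread fail; once it is released, the plain blocking lock succeeds.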

It might seem that the best type of mutex is a recursive mutex that supports all three forms of locking. However, each variant comes at a price, so the Boost thread library lets you choose the most efficient mutex type for each need. The library provides six mutex types, listed here from most to least efficient:

boost::mutex,
boost::try_mutex,
boost::timed_mutex,
boost::recursive_mutex,
boost::recursive_try_mutex,  
boost::recursive_timed_mutex

A deadlock can occur if a mutex is locked and never unlocked. This is a common mistake, and the Boost thread library is designed to make it impossible (or at least difficult). Users of the library cannot lock and unlock a mutex directly. Instead, each mutex class defines, through typedefs, types that perform the locking and unlocking using RAII (Resource Acquisition Is Initialization). This is also known as the Scoped Lock pattern. To construct one of these types, pass in a reference to a mutex. The constructor locks the mutex, and the destructor unlocks it. C++ guarantees that the destructor will always be called, so the mutex is unlocked correctly even if an exception is thrown.

This method ensures correct use of mutexes. However, it is important to note that although the Scoped Lock pattern guarantees that the mutex is unlocked, it does not guarantee that the shared resources are still in a valid state after an exception is thrown. So, just as in a single-threaded program, you must make sure that exceptions do not leave the program in an abnormal state. Also, a lock object must not be passed to another thread, because the state it maintains is not protected against such use.
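Both points above (the lock is released on an exception, but the shared data is not rolled back) can be illustrated with a small sketch. It assumes the C++11 std::mutex and std::lock_guard as stand-ins for boost::mutex and boost::mutex::scoped_lock; data_mutex, shared_value, update_or_throw and value_after_failed_update are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

std::mutex data_mutex;
int shared_value = 0;

// Modifies the shared value under the lock, then fails for negative input.
void update_or_throw(int v)
{
    std::lock_guard<std::mutex> guard(data_mutex);   // constructor locks
    shared_value = v;                                // state is already changed...
    if (v < 0)
        throw std::invalid_argument("negative value"); // ...when the exception leaves
}   // guard's destructor unlocks on both the normal and the exceptional path

int value_after_failed_update()
{
    try {
        update_or_throw(-1);
        assert(false);   // not reached: the call above always throws
    } catch (const std::invalid_argument&) {
        // The guard has already been destroyed, so the mutex is unlocked.
    }
    // If the mutex had been left locked, this lock could not be acquired.
    std::lock_guard<std::mutex> guard(data_mutex);
    return shared_value;   // still holds the half-finished update
}
```

The function returns the value written before the exception, which shows that unlocking is automatic while restoring a consistent state remains the programmer's job.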

Listing 2 gives the simplest example of using boost::mutex. Two new threads are created, each of which loops 10 times, printing its id and the current loop count to std::cout, while the main function waits for both threads to finish before ending. std::cout is a shared resource, so each thread uses a global mutex to ensure that only one thread writes to it at a time.

Many readers may have noticed that passing data to the thread in Listing 2 requires writing a function object by hand. Although this example is simple, it would be tedious to write code like this every time. Fortunately, there is an easy solution. A function library lets you create a new function by binding an existing function to the data it should be called with. Listing 3 shows how to use the Boost.Bind library to simplify the code in Listing 2 so that you don't have to write these function objects by hand.

#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <iostream>

boost::mutex io_mutex;

struct count
{
        count(int id) : id(id) { }
        
        void operator()()
        {
                for (int i = 0; i < 10; ++i)
                {
                        boost::mutex::scoped_lock
                        lock(io_mutex);
                        std::cout << id << ": "
                        << i << std::endl;
                }
        }
        
        int id;
};

int main(int argc, char* argv[])
{
        boost::thread thrd1(count(1));
        boost::thread thrd2(count(2));
        thrd1.join();
        thrd2.join();
        return 0;
}

Listing 3 is the same as Listing 2, except that Boost.Bind is used to carry the data into the thread, which avoids writing a function object by hand.

#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/bind.hpp>
#include <iostream>

boost::mutex io_mutex;

void count(int id)
{
        for (int i = 0; i < 10; ++i)
        {
                boost::mutex::scoped_lock
                lock(io_mutex);
                std::cout << id << ": " <<
                i << std::endl;
        }
}

int main(int argc, char* argv[])
{
        boost::thread thrd1(
        boost::bind(&count, 1));
        boost::thread thrd2(
        boost::bind(&count, 2));
        thrd1.join();
        thrd2.join();
        return 0;
}

3 Condition variables

Sometimes locking a shared resource in order to use it is not enough, because the resource is only usable in certain states. For example, a thread that wants to read data from a stack must wait for data to be pushed if the stack is empty. A mutex alone is not enough for this kind of synchronization. Another form of synchronization, the condition variable, can be used in this case.

A condition variable is always used together with a mutex and a shared resource. The thread first locks the mutex and then checks whether the shared resource is in a state in which it can be used. If not, the thread waits on the condition variable. For this to work, the mutex must be unlocked during the wait so that other threads can access the shared resource and change its state. The mutex must also be locked again when the thread returns from the wait. When another thread changes the state of the shared resource, it notifies the threads waiting on the condition variable, which allows them to return from the wait.
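The lock, check, wait and notify sequence described above can be sketched in a few lines. This is a minimal illustration that assumes the C++11 std::condition_variable and std::unique_lock in place of boost::condition and boost::mutex::scoped_lock; state_mutex, state_changed, ready, payload and the three functions are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex state_mutex;
std::condition_variable state_changed;
bool ready = false;   // the "state" of the shared resource
int payload = 0;      // the shared resource itself

int wait_for_payload()
{
    std::unique_lock<std::mutex> lock(state_mutex);   // 1. lock the mutex
    while (!ready)                                     // 2. check the state
        state_changed.wait(lock);                      // 3. unlock, sleep, re-lock on wakeup
    return payload;                                    // the mutex is held again here
}

void publish(int value)
{
    {
        std::lock_guard<std::mutex> lock(state_mutex);
        payload = value;                               // change the state under the lock
        ready = true;
    }
    state_changed.notify_one();                        // wake a waiting thread
}

int demo_handoff()
{
    int received = 0;
    std::thread consumer([&received] { received = wait_for_payload(); });
    publish(42);
    consumer.join();
    assert(received == 42);
    return received;
}
```

The wait sits inside a while loop because a thread can return from the wait without the state having changed, so the condition has to be checked again after every wakeup.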

Listing 4 is a simple example using boost::condition. It defines a class that implements a bounded buffer, a fixed-size first-in, first-out container. The buffer is thread-safe because it uses a boost::mutex. put and get use a condition variable to make the calling thread wait for the state it needs to complete the operation. Two threads are created: one puts 100 integers into the buffer and the other takes them out. The bounded buffer can hold only 10 integers at a time, so the two threads must periodically wait for each other. To verify this, put and get write diagnostic messages to std::cout. Finally, when the two threads are finished, the main function ends.

#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition.hpp>
#include <iostream>

const int BUF_SIZE = 10;
const int ITERS = 100;

boost::mutex io_mutex;

class buffer
{
        public:
        typedef boost::mutex::scoped_lock
        scoped_lock;
        
        buffer()
        : p(0), c(0), full(0)
        {
        }
        
        void put(int m)
        {
                scoped_lock lock(mutex);
                if (full == BUF_SIZE)
                {
                        {
                                boost::mutex::scoped_lock
                                lock(io_mutex);
                                std::cout <<
                                "Buffer is full. Waiting..."
                                << std::endl;
                        }
                        while (full == BUF_SIZE)
                                cond.wait(lock);
                }
                buf[p] = m;
                p = (p+1) % BUF_SIZE;
                ++full;
                cond.notify_one();
        }
        
        int get()
        {
                scoped_lock lk(mutex);
                if (full == 0)
                {
                        {
                                boost::mutex::scoped_lock
                                lock(io_mutex);
                                std::cout <<
                                "Buffer is empty. Waiting..."
                                << std::endl;
                        }
                        while (full == 0)
                                cond.wait(lk);
                }
                int i = buf[c];
                c = (c+1) % BUF_SIZE;
                --full;
                cond.notify_one();
                return i;
        }
        
        private:
        boost::mutex mutex;
        boost::condition cond;
        unsigned int p, c, full;
        int buf[BUF_SIZE];
};

buffer buf;

void writer()
{
        for (int n = 0; n < ITERS; ++n)
        {
                {
                        boost::mutex::scoped_lock
                        lock(io_mutex);
                        std::cout << "sending: "
                        << n << std::endl;
                }
                buf.put(n);
        }
}

void reader()
{
        for (int x = 0; x < ITERS; ++x)
        {
                int n = buf.get();
                {
                        boost::mutex::scoped_lock
                        lock(io_mutex);
                        std::cout << "received: "
                        << n << std::endl;
                }
        }
}

int main(int argc, char* argv[])
{
        boost::thread thrd1(&reader);
        boost::thread thrd2(&writer);
        thrd1.join();
        thrd2.join();
        return 0;
}

4 Thread local storage

Many functions are not reentrant. This means that it is not safe for one thread to call the function while another thread is already executing it. A non-reentrant function typically keeps static data between successive calls or returns a pointer to static data. For example, std::strtok is not reentrant because it uses a static variable to hold the string that is being split into tokens.

There are two ways to make a non-reentrant function reentrant. The first is to change the interface so that the static data is replaced by pointers or references supplied by the caller. For example, POSIX defines strtok_r, a reentrant variant of std::strtok that replaces the static data with an extra char** parameter. This method is simple and gives the best possible performance. However, it changes the public interface, which means that calling code must be changed as well. The second approach leaves the public interface unchanged and replaces the static data with thread local storage (sometimes called thread-specific storage).
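The two approaches can be compared side by side in a small sketch. It is only an illustration: next_ticket_r and next_ticket are hypothetical functions, and the second one uses the C++11 thread_local keyword as an assumed stand-in for library-provided thread local storage.

```cpp
#include <cassert>
#include <thread>

// Approach 1: change the interface so the caller owns the state,
// in the same spirit as the extra parameter of strtok_r.
int next_ticket_r(int* state)
{
    assert(state != 0);
    return ++*state;
}

// Approach 2: keep the interface, but give every thread its own copy
// of what used to be a single static variable.
int next_ticket()
{
    thread_local int state = 0;
    return ++state;
}

// Advances the calling thread's counter twice, then asks a new thread
// for its first ticket.
int first_ticket_on_new_thread()
{
    next_ticket();
    next_ticket();
    int seen = 0;
    std::thread t([&seen] { seen = next_ticket(); });
    t.join();
    return seen;   // the new thread starts from its own zero-initialized counter
}
```

The new thread receives ticket 1 regardless of how often the calling thread has already called next_ticket, because each thread increments its own counter.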

The Boost thread library provides the smart pointer boost::thread_specific_ptr to access thread local storage. The first time each thread uses an instance of this smart pointer, its value is NULL, so the thread must first check whether it is empty and assign a value to it. The Boost thread library guarantees that the data held in thread local storage is cleaned up when the thread ends.

Listing 5 is a simple example using boost::thread_specific_ptr. Two threads are created, each of which initializes its thread local storage and then loops 10 times, incrementing the value pointed to by the smart pointer and writing it to std::cout (which is synchronized with a mutex because std::cout is a shared resource). The main thread waits for the two threads to finish before exiting. The output of this example shows that each thread works on its own instance of the data, even though both use the same boost::thread_specific_ptr.

#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/tss.hpp>
#include <iostream>

boost::mutex io_mutex;
boost::thread_specific_ptr<int> ptr;

struct count
{
        count(int id) : id(id) { }
        
        void operator()()
        {
                // First use on this thread: the pointer starts out null.
                if (ptr.get() == 0)
                        ptr.reset(new int(0));
                
                for (int i = 0; i < 10; ++i)
                {
                        ++(*ptr);
                        boost::mutex::scoped_lock
                        lock(io_mutex);
                        std::cout << id << ": "
                        << *ptr << std::endl;
                }
        }
        
        int id;
};

int main(int argc, char* argv[])
{
        boost::thread thrd1(count(1));
        boost::thread thrd2(count(2));
        thrd1.join();
        thrd2.join();
        return 0;
}

5 Routines that run only once

One problem remains unresolved: how to make initialization (such as running a constructor) thread-safe. For example, when an application needs a single global object, order-of-initialization issues usually mean that the object is obtained through a function that returns a static object, and the function must ensure that the static object is constructed the first time it is called. The problem is that if multiple threads call this function at the same time, the constructor of the static object may be called multiple times, which is an error.

The solution to this problem is the so-called "once routine". A once routine is executed only once in an application. If multiple threads try to run it at the same time, only one of them actually does so, and the other threads wait for it to complete. To ensure that it is executed only once, the routine is called indirectly through another function, which is passed a pointer to the routine and a special flag indicating whether the routine has already been called. This flag is initialized statically, which ensures that it is initialized at compile time rather than at runtime, so there is no problem of multiple threads initializing it at the same time. The Boost thread library provides boost::call_once to support once routines, and defines the flag type boost::once_flag and the macro BOOST_ONCE_INIT that initializes the flag.
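The single global object scenario described above can be sketched with a once routine. The example assumes the C++11 std::call_once and std::once_flag as stand-ins for boost::call_once and boost::once_flag. Note that std::call_once takes the flag as its first argument and needs no initializer macro. Config, create_config, global_config and init_calls_after_race are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

struct Config { int value; };      // hypothetical global object

std::once_flag config_once;        // records whether create_config has run
Config* config_instance = nullptr;
int init_calls = 0;                // counts how often the once routine runs

// The once routine: constructs the single global object.
void create_config()
{
    ++init_calls;
    config_instance = new Config{7};   // intentionally kept alive until program exit
}

// Accessor that any thread may call at any time.
Config& global_config()
{
    std::call_once(config_once, create_config);   // runs create_config at most once
    assert(config_instance != nullptr);
    return *config_instance;
}

// Lets two threads race to be the first caller and reports how many
// times the once routine actually ran.
int init_calls_after_race()
{
    std::thread a([] { global_config(); });
    std::thread b([] { global_config(); });
    a.join();
    b.join();
    return init_calls;
}
```

Whichever thread arrives first runs create_config, and the other one waits until the object exists, so the routine runs exactly once.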

Listing 6 is an example that uses boost::call_once. It defines a global integer with an initial value of 0 and a global boost::once_flag instance initialized with BOOST_ONCE_INIT. The main function creates two threads, both of which try to initialize the global integer by calling boost::call_once with a function that increments it by 1. The main function waits for the two threads to finish and writes the final result to std::cout. The final value of 1 shows that the operation was indeed executed only once.

#include <boost/thread/thread.hpp>
#include <boost/thread/once.hpp>
#include <iostream>

int i = 0;
boost::once_flag flag =
BOOST_ONCE_INIT;

void init()
{
        ++i;
}

void thread()
{
        boost::call_once(&init, flag);
}

int main(int argc, char* argv[])
{
        boost::thread thrd1(&thread);
        boost::thread thrd2(&thread);
        thrd1.join();
        thrd2.join();
        std::cout << i << std::endl;
        return 0;
}