C++ Concurrent Programming (4): Problems with shared data, using mutexes to protect shared data, deadlock

Sharing data between threads

Reference blogs

Sharing data between threads - using mutexes to protect shared data

[c++11] Multi-threaded programming (4) - deadlock (Dead Lock)

c++ multithreading deadlock

C++ deadlock and its solution

The problem with sharing data

Imagine that you and a friend share an apartment for a while, and the apartment has only one kitchen and one bathroom. You can't both use the bathroom at the same time (unless you are unusually close), and it is inconvenient when your friend occupies it for a long time just when you happen to need it too. Similarly, even if a combination oven lets you cook at the same time, if one person is baking sausages while the other is baking a cake, the result will not be good. And anyone who has shared an office space knows the trouble: your work isn't finished, but someone borrows the tool you need, or your half-finished work is modified by others without permission.

The same goes for threads. If data is shared between threads, we need to follow rules: which thread accesses which data, and in what way; and once the data is changed, when and how any other threads involved should be notified. Although data can be shared easily among threads in the same process, this is not an unqualified advantage; sometimes it is a serious liability. Improper use of shared data is one of the biggest sources of concurrency-related bugs, with consequences far worse than sausage-flavored cake.

Problematic race conditions

A typical scenario that induces a problematic race condition is an operation that must modify two or more distinct pieces of data, such as the two node pointers updated when removing an element from a doubly linked list. Because two separate pieces of data cannot both be changed in a single instruction, another thread may access them at the unlucky moment when only one of the two has been updated. Race conditions are often hard to detect and reproduce because the window in which they can occur is so short. If the modification is performed as an uninterrupted stream of CPU instructions, the problem is unlikely to manifest in any single run, even while other threads access the data concurrently; it arises only when the instructions happen to interleave in a particular order. The probability of that ordering occurring grows as system load increases and as the operation is performed more often. True to the saying "the roof leaks just on the rainy night", these problems appear under the most inopportune circumstances. Problematic race conditions are also "picky" about when they show themselves: they often disappear completely when the application runs in a debug environment, because the debugging tool perturbs the program's internal timing, even if only slightly.

Protect shared data with mutexes

Before accessing shared data, a developer can lock a mutex that protects the data, and unlock it after the access is complete. The thread library then guarantees that once a thread has locked the shared data through a particular mutex, all other threads must wait until the data is unlocked before they can access it.

Locking and unlocking with lock() and unlock()

In C++, a mutex is created by instantiating std::mutex, locked by calling the member function lock(), and unlocked by unlock()

In practice, the two calls must be used in pairs: once lock() is called in a function, unlock() must be called on every path out of the function.

#include <iostream>
#include <mutex>
using namespace std;

mutex my_mutex;
int a = 1;
bool func()
{
    my_mutex.lock();
    if (!a)
    {
        cout << "a = " << a << endl;
        my_mutex.unlock();    // must unlock before this early return
        return false;
    }
    my_mutex.unlock();
    return true;
}

Note that the code above has to call unlock before each return; forgetting the call inside the if branch would leave the mutex locked forever

RAII std::lock_guard

The C++ standard library provides std::lock_guard, an RAII class template for mutexes: it locks the supplied mutex in its constructor and unlocks it in its destructor, thus ensuring that a locked mutex will always be correctly unlocked

mutex my_mutex;
int a = 1;
bool func()
{
    lock_guard<mutex> my_guard(my_mutex);
    // my_mutex.lock();
    if (!a)
    {
        cout << "a = " << a << endl;
        // my_mutex.unlock();
        return false;
    }
    // my_mutex.unlock();
    return true;
}

The lock can also be released early by limiting the scope of the lock_guard

mutex my_mutex;
int a = 1;
bool func()
{
    // lock_guard<mutex> my_guard(my_mutex);
    // my_mutex.lock();
    if (!a)
    {
        {
            lock_guard<mutex> my_guard(my_mutex);    // unlocked at the end of this block
            cout << "a = " << a << endl;
        }
        // my_mutex.unlock();
        return false;
    }
    // my_mutex.unlock();
    return true;
}

Object-oriented design guideline: put the mutex and the protected data together as members of a class

Specifically, the mutex and the data to be protected should be defined as private members of the class, and every member function should lock the mutex when called and unlock it at the end, so as to guarantee the data cannot be corrupted

Reality is not always so ideal. Be aware that if a member function returns a pointer or reference to the protected data, data corruption becomes possible: callers can access the data directly through that reference or pointer, bypassing the mutex's protection entirely. Therefore, if a class uses a mutex to protect its own data members, its developer must design the interface very carefully, ensuring that the mutex guards every access to the data and leaves no back door

class some_data
{
    int a;
    std::string b;
public:
    void do_something();
};

class data_wrapper
{
private:
    some_data data;
    std::mutex m;
public:
    template<typename Function>
    void process_data(Function func)
    {
        std::lock_guard<std::mutex> l(m);
        func(data);    // 1 pass the "protected" data to the user-supplied function
    }
};

some_data* unprotected;

void malicious_function(some_data& protected_data)
{
    unprotected = &protected_data;
}

data_wrapper x;
void foo()
{
    x.process_data(malicious_function);    // 2 pass in a malicious function
    unprotected->do_something();           // 3 access the protected data without protection
}

process_data itself looks fine, but calling the user-supplied func means foo can bypass the protection mechanism: it passes in malicious_function, which smuggles out a pointer to the data, and then calls do_something without the mutex locked

The C++ standard library provides no protection against this behavior, so keep one rule firmly in mind: never pass pointers or references to protected data outside the scope of the mutex lock
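As a contrast with data_wrapper, here is a sketch of an interface that respects this rule. The class below (illustrative, not from the original example) only ever hands out a copy of the protected data, so no pointer or reference can escape the lock's scope:

```cpp
#include <mutex>
#include <string>

// A sketch of a "no back door" interface: all access goes through the
// mutex, and get() returns a copy rather than a reference.
class safe_data
{
    std::mutex m;
    std::string value;   // the protected data
public:
    void set(const std::string& v)
    {
        std::lock_guard<std::mutex> lk(m);
        value = v;
    }
    std::string get()    // returns a copy, not a reference
    {
        std::lock_guard<std::mutex> lk(m);
        return value;
    }
};
```

The cost of this design is copying; for large objects, a member function that applies a caller-supplied operation under the lock (as process_data does) can be used instead, but then the caller must not smuggle references out.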

Deadlock

First, a classic joke that captures what a deadlock is:

Interviewer: "If you can explain clearly what deadlock is, I will send you an offer"
Candidate: "If you can send me an offer, I will tell you what deadlock is"

A deadlock is a scenario in which a pair of threads each need to perform an operation that starts by locking their own mutex, and each needs the other to release the mutex it holds. Neither thread can make progress, because each is waiting for the other to release its mutex. Deadlock arises easily whenever a single operation requires locking two or more mutexes

For example, suppose thread A locks a first and then b, while another thread B acquires the locks in the opposite order, b first and then a. As shown below:

(figure: thread A holds lock a and waits for lock b; thread B holds lock b and waits for lock a)

In the situation shown in the figure, thread A holds lock a and is waiting for lock b, which is already held, so it cannot proceed until b is released; meanwhile thread B locked b first and is waiting for a to be released. Thread A waits for thread B, and thread B waits for thread A: a deadlock

#include <iostream>
#include <thread>
#include <string>
#include <mutex>
#include <fstream>
using namespace std;

class LogFile {
    std::mutex _mu;
    std::mutex _mu2;
    ofstream f;
public:
    LogFile() {
        f.open("log.txt");
    }
    ~LogFile() {
        f.close();
    }
    void shared_print(string msg, int id) {
        std::lock_guard<std::mutex> guard(_mu);
        std::lock_guard<std::mutex> guard2(_mu2);
        f << msg << id << endl;
        cout << msg << id << endl;
    }
    void shared_print2(string msg, int id) {
        std::lock_guard<std::mutex> guard(_mu2);
        std::lock_guard<std::mutex> guard2(_mu);
        f << msg << id << endl;
        cout << msg << id << endl;
    }
};

void function_1(LogFile& log) {
    for (int i = 0; i > -100; i--)
        log.shared_print2(string("From t1: "), i);
}

int main()
{
    LogFile log;
    std::thread t1(function_1, std::ref(log));

    for (int i = 0; i < 100; i++)
        log.shared_print(string("From main: "), i);

    t1.join();
    return 0;
}

void function_1(LogFile& log) {
    
    
    for(int i=0; i>-100; i--)
        log.shared_print2(string("From t1: "), i);
}

int main()
{
    
    
    LogFile log;
    std::thread t1(function_1, std::ref(log));

    for(int i=0; i<100; i++)
        log.shared_print(string("From main: "), i);

    t1.join();
    return 0;
}

After running this, you will find that the program gets stuck: a deadlock. At run time, something like the following can happen:

Thread A              Thread B
_mu.lock()          _mu2.lock()
   // deadlock         // deadlock
_mu2.lock()         _mu.lock()

Common Deadlock Situations

1. Forgetting to release the lock

mutex _mutex;
void func()
{
    _mutex.lock();
    if (xxx)
        return;    // bug: returns without calling unlock()
    _mutex.unlock();
}

2. A single thread locking the same mutex twice

mutex _mutex;
void func()
{
    _mutex.lock();
    // do something....
    _mutex.unlock();
}

void data_process()
{
    _mutex.lock();
    func();    // deadlock: this thread already holds _mutex
    _mutex.unlock();
}

3. Two threads acquiring multiple locks in opposite orders

mutex _mutex1;
mutex _mutex2;

void process1() {
    _mutex1.lock();
    _mutex2.lock();
    // do something1...
    _mutex2.unlock();
    _mutex1.unlock();
}

void process2() {
    _mutex2.lock();
    _mutex1.lock();
    // do something2...
    _mutex1.unlock();
    _mutex2.unlock();
}

4. Circular lock acquisition

/*
*             A   -  B
*             |      |
*             C   -  D
*/

Solutions to deadlock

1. For mutexes whose addresses can be compared, always lock the one with the smaller address first

If we are forced to acquire multiple locks and std::lock cannot be used, the guideline is to guarantee that every thread acquires the locks in the same order

if (&_mu < &_mu2) {
    _mu.lock();
    _mu2.lock();
}
else {
    _mu2.lock();
    _mu.lock();
}

2. Try to hold only one mutex at a time

{
    std::lock_guard<std::mutex> guard(_mu2);
    // do something
    f << msg << id << endl;
}
{
    std::lock_guard<std::mutex> guard2(_mu);
    cout << msg << id << endl;
}

3. Do not call user-supplied code inside the region protected by a mutex, because that code may lock other mutexes

{
    std::lock_guard<std::mutex> guard(_mu2);
    user_function();    // never do this!!!
    f << msg << id << endl;
}

4. To lock multiple mutexes at the same time, use std::lock()

std::lock(_mu, _mu2);
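std::lock acquires all the mutexes passed to it using a deadlock-avoidance algorithm, so the order in which the arguments are listed no longer matters. A minimal sketch (the names _mu, _mu2, counter, and safe_print are illustrative): the lock_guards are constructed with std::adopt_lock so they take ownership of the already-locked mutexes and release them on scope exit.

```cpp
#include <mutex>

std::mutex _mu;
std::mutex _mu2;
int counter = 0;   // illustrative shared state

void safe_print()
{
    // Lock both mutexes atomically with respect to deadlock:
    // std::lock either gets both or neither.
    std::lock(_mu, _mu2);
    // adopt_lock: the guards assume ownership of locks already held.
    std::lock_guard<std::mutex> g1(_mu, std::adopt_lock);
    std::lock_guard<std::mutex> g2(_mu2, std::adopt_lock);
    ++counter;
}
```

In C++17 the same pattern collapses into a single line, `std::scoped_lock lk(_mu, _mu2);`, which locks both with the same deadlock-avoidance algorithm and unlocks on destruction.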

5. Use hierarchical locks

Use hierarchical locks: wrap the mutex in a class that assigns each lock a hierarchy value, and always acquire locks in order from higher to lower value

Whenever code tries to perform a locking operation, the wrapper checks whether the thread already holds a lock from the same or a lower level; if so, locking the current mutex is forbidden
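hierarchical_mutex is not a standard-library type; the example below assumes an implementation along the lines of the well-known one from "C++ Concurrency in Action". A minimal sketch:

```cpp
#include <mutex>
#include <stdexcept>
#include <climits>

// Each mutex carries a hierarchy value; a thread may only lock a mutex
// whose value is strictly lower than the lowest one it currently holds.
class hierarchical_mutex
{
    std::mutex internal_mutex;
    unsigned long const hierarchy_value;
    unsigned long previous_hierarchy_value;
    // Per-thread record of the current hierarchy level (no lock held: ULONG_MAX).
    static thread_local unsigned long this_thread_hierarchy_value;

    void check_for_hierarchy_violation() const
    {
        if (this_thread_hierarchy_value <= hierarchy_value)
            throw std::logic_error("mutex hierarchy violated");
    }
    void update_hierarchy_value()
    {
        previous_hierarchy_value = this_thread_hierarchy_value;
        this_thread_hierarchy_value = hierarchy_value;
    }
public:
    explicit hierarchical_mutex(unsigned long value)
        : hierarchy_value(value), previous_hierarchy_value(0) {}

    void lock()
    {
        check_for_hierarchy_violation();
        internal_mutex.lock();
        update_hierarchy_value();
    }
    void unlock()
    {
        this_thread_hierarchy_value = previous_hierarchy_value;
        internal_mutex.unlock();
    }
    bool try_lock()
    {
        check_for_hierarchy_violation();
        if (!internal_mutex.try_lock())
            return false;
        update_hierarchy_value();
        return true;
    }
};

thread_local unsigned long
    hierarchical_mutex::this_thread_hierarchy_value(ULONG_MAX);
```

Because the class provides lock(), unlock(), and try_lock(), it satisfies the Lockable requirements and can be used with std::lock_guard, as the following example does.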

hierarchical_mutex high_level_mutex(10000);
hierarchical_mutex low_level_mutex(5000);

int do_low_level_stuff();

int low_level_func() {
  std::lock_guard<hierarchical_mutex> lk(low_level_mutex);
  return do_low_level_stuff();
}

void high_level_stuff(int some_param);

void high_level_func() {
  std::lock_guard<hierarchical_mutex> lk(high_level_mutex);
  high_level_stuff(low_level_func());
}

void thread_a() {
  high_level_func();
}

hierarchical_mutex other_mutex(100);
void do_other_stuff();

void other_stuff() {
  high_level_func();
  do_other_stuff();
}

void thread_b() {
  std::lock_guard<hierarchical_mutex> lk(other_mutex);
  other_stuff();
}

thread_a obeys the hierarchy rule, but thread_b does not. Notice that thread_a calls high_level_func, locking the high-level high_level_mutex, and then calls low_level_func, which locks the lower-level low_level_mutex. That matches the rule stated above: lock the higher level first, then the lower level

thread_b is not so fortunate. It first locks other_mutex, whose level is only 100, and then high_level_func tries to lock the much higher-level high_level_mutex. This violates the hierarchy, so an error occurs: an exception may be thrown, or the program may be terminated outright

Origin blog.csdn.net/Solititude/article/details/131738758