3.1 Problems with sharing data between threads

On the whole, all problems with sharing data between threads stem from modifying that data. If all shared data is read-only, there is no problem, because the data read by one thread is unaffected by whether another thread is reading the same data. However, if data is shared between threads and one or more threads start modifying it, there is a lot of potential for trouble. In that case, you must take care to ensure that every thread still sees the data in a consistent state.

One concept that is widely used to help programmers reason about their code is the invariant: a statement that is always true of a particular data structure, such as "This variable contains the number of items in the list." These invariants are often broken during an update, especially when the data structure is complex or when the update requires modifying more than one value.

Consider a doubly linked list, where each node holds a pointer to both the next node and the previous node in the list. One of the invariants is that if you follow the "next" pointer from one node (A) to another node (B), the "previous" pointer of that node (B) points back to the first node (A). To delete a node from the list, the nodes on both sides must be updated to point to each other. Once one of them has been updated, the invariant is broken until the node on the other side has been updated too; when the update is complete, the invariant holds again.

The steps for deleting an entry from such a list are shown in Figure 3.1.
1) Identify the node (N) to be deleted.
2) Update the link from the previous node of N to N to point to the subsequent node of N.
3) Update the link from the next node of N to N to point to the previous node of N.
4) Delete node N.

As you can see, between steps 2 and 3 the links in one direction are inconsistent with the links in the opposite direction, and the invariant is broken.
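The window between the two link updates can be made visible in a small single-threaded sketch. The `Node` type and the helper names below are hypothetical, chosen only for illustration; the function assumes the node being removed has a neighbour on each side and leaves the actual destruction of the node (step 4) to the caller.

```cpp
#include <cassert>

// Hypothetical minimal node type, for illustration only.
struct Node {
    int value = 0;
    Node* prev = nullptr;
    Node* next = nullptr;
};

// Helper that links two adjacent nodes in both directions.
void link_nodes(Node& left, Node& right) {
    left.next = &right;
    right.prev = &left;
}

// The invariant for the link leaving `a`:
// if a->next is b, then b->prev must point back to a.
bool forward_link_consistent(const Node* a) {
    return a->next == nullptr || a->next->prev == a;
}

// Unlinks n following steps 1 to 3 of the list above.
// Assumes n is neither the head nor the tail of the list.
// Returns whether the invariant held between steps 2 and 3.
bool unlink_and_report_midpoint(Node* n) {
    Node* before = n->prev;   // step 1: n is the node to delete
    Node* after = n->next;

    before->next = after;     // step 2: forward link now skips n

    // At this point after->prev still points at n, so the invariant is broken.
    bool held_midway = forward_link_consistent(before);

    after->prev = before;     // step 3: backward link now skips n
    assert(forward_link_consistent(before));  // invariant restored

    // Step 4 (destroying n) is left to the caller in this sketch.
    return held_midway;
}
```

In a single thread the broken state is harmless because nothing else can observe it. The trouble described next comes from another thread running while the list is in that intermediate state.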

The simplest potential problem with modifying data that is shared between threads is that of broken invariants. If you do nothing special to prevent it, and one thread reads the doubly linked list while another is deleting a node, it is quite possible for the reading thread to see the list with a node only partially removed (because only one of the links has been changed, as after step 2 in Figure 3.1), so the invariant is broken. The consequences of a broken invariant vary. If the other thread is only reading the list items from left to right in the figure, it will skip the node being deleted. If, on the other hand, the other thread is trying to delete the rightmost node in the figure, it might permanently corrupt the data structure and eventually crash the program. Whatever the outcome, this is an example of one of the most common causes of bugs in concurrent code: a race condition.

Race conditions

Suppose you are buying a ticket to watch a movie at a cinema. If it is a large cinema, there will be multiple cashiers taking payments, so more than one person can buy tickets at the same time. If someone else is buying tickets for the same movie at another cashier's desk, the seats available to you depend on whether the other person or you actually books first. This difference could be critical if only a few seats are left. It can quite literally be a race to see who gets the last ticket. This is an example of a race condition: which seat you get (or even whether you get a ticket at all) depends on the relative order of the two purchases.

In concurrency, a race condition is anything where the outcome depends on the relative order in which operations are executed on two or more threads. The threads race to perform their respective operations. Most of the time this is quite benign, because all possible outcomes are acceptable, even though they may change with different relative orderings. For example, if two threads are both adding items to a queue for processing, it generally does not matter which item is added first, provided the invariants of the system are maintained. Problems arise when a race condition leads to broken invariants, as with the doubly linked list just mentioned. When talking about concurrency, the term race condition is usually used to mean a problematic race condition. Benign race conditions are not so interesting and are not a cause of bugs. The C++ standard also defines the term data race to mean the specific type of race condition that arises from concurrent modification of a single object. Data races cause the dreaded undefined behavior.
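A benign race of the kind described above might look like the following sketch. `WorkQueue` and `fill_from_two_threads` are hypothetical names made up for this illustration, and the mutex is there only to keep the example free of data races. The order of the items depends on how the threads are scheduled, but the number of items does not.

```cpp
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical work queue, for illustration only.
class WorkQueue {
public:
    void push(int item) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(item);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;
    std::queue<int> items_;
};

// Two producers race to add items to the same queue.
// Which item lands first varies from run to run; the final count does not.
std::size_t fill_from_two_threads(WorkQueue& queue, int per_thread) {
    auto producer = [&queue, per_thread](int base) {
        for (int i = 0; i < per_thread; ++i) {
            queue.push(base + i);
        }
    };

    std::thread first(producer, 0);
    std::thread second(producer, 1000000);
    first.join();
    second.join();
    return queue.size();
}
```

Every interleaving leaves the queue holding exactly `2 * per_thread` items, so every outcome is acceptable even though the ordering is not fixed.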

Problematic race conditions typically occur where completing an operation requires modifying two or more distinct pieces of data, such as the two link pointers in the example. Because the operation must touch two separate pieces of data, they must be modified in separate instructions, and another thread could access the data structure when only one of those instructions has completed. Race conditions are often hard to find and hard to reproduce because the window of opportunity is small. If the modifications are done as consecutive CPU instructions, the chance of the problem showing up on any one run is very small, even if the data structure is being accessed concurrently by another thread. As the load on the system increases, and the number of times the operation is performed increases, the chance of a problematic execution sequence also increases. It is almost inevitable that such problems will surface at the most inconvenient time. Because race conditions are generally timing-sensitive, they often disappear entirely when the application is run under a debugger, which affects the program's timing, even if only slightly.

If you are writing multithreaded programs, race conditions can easily become the bane of your life. A great deal of the complexity in writing concurrent software comes from avoiding problematic race conditions.

Avoiding problematic race conditions

There are several ways to deal with problematic race conditions. The simplest option is to wrap your data structure with a protection mechanism that ensures only the thread actually performing a modification can see the intermediate states where the invariants are broken. From the point of view of other threads accessing the data structure, such a modification either has not started yet or has already completed. The C++ standard library provides several such mechanisms, which are described in this chapter.
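As a rough sketch of this first option, the class below guards the "count equals number of items" invariant from earlier with a `std::mutex`. `CountedList` and `readers_always_see_invariant` are hypothetical names for illustration; `std::forward_list` is used because it has no `size()` member, so the separate counter is a real second piece of data that must be kept in step.

```cpp
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <mutex>
#include <thread>

// Hypothetical list that tracks its own length, for illustration only.
class CountedList {
public:
    void push_front(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_front(value);  // invariant broken: count_ is one behind
        ++count_;                  // invariant restored before the lock is released
    }

    // True if the stored count matches the actual number of nodes.
    bool invariant_holds() const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto actual = std::distance(items_.begin(), items_.end());
        return count_ == static_cast<std::size_t>(actual);
    }

    std::size_t count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_;
    }

private:
    mutable std::mutex mutex_;
    std::forward_list<int> items_;
    std::size_t count_ = 0;
};

// One thread modifies the list while another repeatedly checks the invariant.
// Returns true if the reader never observed the broken intermediate state.
bool readers_always_see_invariant(CountedList& list, int n) {
    bool all_ok = true;  // written only by the reader thread, read after join

    std::thread writer([&list, n] {
        for (int i = 0; i < n; ++i) list.push_front(i);
    });
    std::thread reader([&list, &all_ok, n] {
        for (int i = 0; i < n; ++i) {
            if (!list.invariant_holds()) all_ok = false;
        }
    });

    writer.join();
    reader.join();
    return all_ok;
}
```

Because every member function takes the same lock, the moment between `push_front` and `++count_` is never visible to another thread: from outside, each insertion has either not started or has fully completed.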

Another option is to modify the design of the data structure and its invariants so that a modification is carried out as a series of indivisible changes, each of which preserves the invariants. This is generally called lock-free programming, and it is difficult to get right. If you work at this level, the subtleties of the memory model and of identifying which threads can potentially see which set of values can get complicated.
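A very small taste of the idea, using the cinema example from earlier: if the whole state is a single `std::atomic<int>`, a purchase can be made as one indivisible change with a compare-and-exchange loop. `BoxOffice` and `sell_from_cashiers` are hypothetical names for this sketch, which relies on the default sequentially consistent memory ordering. Real lock-free data structures, with more than one value to keep consistent, are considerably harder than this.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical box office with a fixed number of seats, for illustration only.
class BoxOffice {
public:
    explicit BoxOffice(int seats) : remaining_(seats) {}

    // Tries to take one seat. The invariant "remaining_ >= 0" holds at every step.
    bool try_buy() {
        int seen = remaining_.load();
        while (seen > 0) {
            // Succeeds only if remaining_ still equals `seen`; on failure,
            // `seen` is refreshed with the current value and the loop retries.
            if (remaining_.compare_exchange_weak(seen, seen - 1)) {
                return true;
            }
        }
        return false;  // sold out
    }

    int remaining() const { return remaining_.load(); }

private:
    std::atomic<int> remaining_;
};

// Several cashier threads sell from the same box office.
// Returns the total number of tickets sold.
int sell_from_cashiers(BoxOffice& office, int cashiers, int attempts_each) {
    std::atomic<int> sold{0};
    std::vector<std::thread> threads;

    for (int c = 0; c < cashiers; ++c) {
        threads.emplace_back([&office, &sold, attempts_each] {
            for (int i = 0; i < attempts_each; ++i) {
                if (office.try_buy()) ++sold;
            }
        });
    }
    for (auto& t : threads) t.join();
    return sold.load();
}
```

Which cashier sells the last ticket still depends on timing, so the race remains, but no interleaving can sell the same seat twice or drive the count below zero.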

Another way to deal with race conditions is to handle updates to the data structure as a transaction, just as updates to a database are done within a transaction. The required series of data modifications and reads is stored in a transaction log and then committed in a single step. If the commit cannot proceed because the data structure has been modified by another thread, the transaction is restarted. This is called software transactional memory (STM), and it is an active area of research at the time of writing. It will not be covered in this book, because there is no direct support for STM in C++. However, the basic idea of doing something privately and then committing it in a single step is one I will come back to later.

The most basic mechanism for protecting shared data provided by the C++ standard is the mutex, so let's look at that first.

Origin blog.csdn.net/qq_36314864/article/details/132184562