Multithreading technology in Qt

1. The origin of the thread concept

1.1 Single-core CPU

In the early single-core CPU era there was no concept of a thread, only the process. The operating system, as one large piece of software, coordinates the orderly work of the various hardware components (CPU, memory, hard disk, network card, and so on). Even before dual-core CPUs appeared, Windows could already let you write a document in Word while listening to music. With only a single CPU in the whole system to execute every computing task, how could two processes appear to be "running at the same time"? Time-slice round-robin scheduling!

Note the keyword "round-robin". The operating system assigns each process a time slice: the amount of time the process gets to run each time the CPU picks it. When the time is up, whether or not the process has finished, the operating system forcibly hands the CPU over to another process. Why do this? With only a single-core CPU and no round-robin mechanism, which should run: the document-editing process or the music-playing process? Whichever one is chosen, the other would never execute at all and would simply sit there, not running. If instead the CPU works on the Word process for a moment and then the music process for a moment, each process does pause repeatedly, but the CPU switches so quickly that a human never perceives the stutter. The two processes genuinely appear to be "running in parallel".

(Figure: time slices allocated in rotation to the Word, QQ, and NetEase Cloud Music processes.)

As shown in the figure above, each small square is a time slice of roughly 100 ms. Suppose Word, QQ, and NetEase Cloud Music are all running at the same time. The CPU first works on the Word process; once its 100 ms are up, it is forced to switch to the QQ process, then after another 100 ms to the NetEase Cloud Music process, then back to Word after another 100 ms, and so on in rotation. Now pick out just the Word slices and look at them separately: if each time slice is small enough, then at the speed of human perception the separate runs blur into one continuous run, creating the illusion that "the CPU is only processing Word". As chip technology has advanced, CPUs have become faster and faster, and more and more processes can run simultaneously while still feeling smooth.

1.2 Multi-core CPU

As more and more processes ran, people found that creating, destroying, and switching processes carried a large overhead in time and space, so the industry urgently needed a lightweight alternative to reduce that cost. Thread technology, the "lightweight process", became widespread in the 1980s; in the same era, symmetric multiprocessing (SMP, Symmetrical Multi-Processing) hardware made it possible for several processors to execute such threads truly in parallel. Switching between threads is much cheaper than switching between processes: each process owns its own complete virtual address space, while a thread belongs to a process and shares that address space with the process's other threads. A thread can therefore largely reuse the resources its process already owns instead of requesting new ones, so scheduling it costs far less.

(Figure: one QQ process time slice subdivided among chat, file-transfer, and UI-refresh threads.)

Take the QQ chat program as an example. So far we have discussed how different processes take turns running smoothly; now focus on what happens inside a single process. Before threading technology existed, when the CPU picked the QQ process, should it handle chatting or refreshing the interface? If it handled only chat, the interface would not refresh and would appear frozen. With threads, the 100 ms the CPU gives QQ each time can be subdivided: 30 ms for chat, 40 ms for file transfers, and the remaining 30 ms for refreshing the interface, so every component appears to run "in parallel". From this we can extract two scenarios where multithreading applies:

  • Increasing processing speed by exploiting multiple CPU cores.
  • Keeping the GUI responsive while other computing tasks run in the background.

1.3 Thread life cycle

Here is a brief look at a thread's journey from creation to exit. First the thread is "created" and waits for the CPU to schedule it. When the CPU gets to it: if the thread must wait for some other event before it can proceed, it is in the "blocked" state; if not, it can "run", which is to say it occupies a time slice. When its time slice is used up, the thread moves to the "ready" state and waits for its next slice to arrive. Once all its tasks are complete, the thread enters the "exit" state, and the operating system releases the resources that were allocated to it.

2. Challenges in accessing data

2.1 Interrupt operation

Since there are interrupt operations during the time slice rotation, very interesting phenomena will occur when accessing certain data. To better understand reentrancy and thread safety, let’s start with their origins.

Beginning in the mid-1960s, computer systems entered their third generation, and highly integrated, fully functional CPUs flooded into the market. Besides faster CPUs, computers of this period gained interrupt hardware, input/output channels, and the like. With these technologies developing rapidly, letting multiple processes share the computer's hardware became the central topic of operating-system research, and time-slice round-robin scheduling solves that problem well. An operating system with this capability is called a "multitasking operating system"; Windows is a familiar example, and the operating systems in embedded devices use similar techniques. Their common trait is that the CPU works on process A for a while and process B for a while, and switching between tasks involves interrupt operations.

The following code gives an intuitive feel for what an interrupt operation is like:

#include <QCoreApplication>
#include <QDebug>
#include <iostream>
#include <csignal>
#include <unistd.h>

using namespace std;

void signalHandler(int signum)
{
    // Runs whenever the process receives the registered signal
    qDebug() << "Received signal (" << signum << ").";
}

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    signal(SIGINT, signalHandler);  // Ctrl+C now triggers signalHandler()
    while (1) {
        qDebug() << "Go to sleep...";
        sleep(1);  // POSIX sleep() from <unistd.h>
    }
    return a.exec();  // never reached: the loop above runs forever
}

 

The code above is very simple. For the operating system to interrupt a process, it must deliver an interrupt signal to that process. The call signal(SIGINT, signalHandler) registers signalHandler() to be executed whenever a SIGINT (one kind of interrupt signal) arrives. While the process is running, pressing Ctrl+C manually generates that interrupt signal; signalHandler() then runs and prints the "Received signal..." message.

Interrupting operations will cause the reentrant problem we will talk about next.

2.2 Reentrancy problem

The reentrancy problem was born in multitasking environments, before multithreading even existed. It comes up again and again in embedded systems and real-time operating systems. For example, the MCU (microcontroller unit) in some hardware product is wired to many sensors: accelerometer, ambient-light sensor, gyroscope, and so on. These sensors are public resources that any process may read, and their data is held in global variables, represented here by global_value.

Suppose the product's programmer reads these sensors using timers and interrupts, and writes two programs (process A and process B) that both modify the global variable global_value. Process A needs to modify global_value five times in a row, while process B assigns it a value just once. The following situation can arise: just as process A finishes its third modification, an interrupt signal occurs and the CPU is forcibly scheduled over to process B. After process B changes global_value, the CPU returns to process A, which resumes from where it left off and performs its fourth modification. But the global_value that process A now faces is no longer the value it left behind, so this code is not reentrant. If the code lives inside a function, that function is called a non-reentrant function.

Unexpected results can appear even when a single process accesses a global variable, as the following code shows: whether or not an interrupt occurs during normal execution changes the outcome.

#include <QCoreApplication>
#include <QDebug>
#include <iostream>
#include <csignal>
#include <unistd.h>

using namespace std;
int global_value = 0;

void signalHandler(int signum)
{
    int i = 0;
    while (i++ < 5) {
        ++global_value;  // modifies shared global state
        qDebug() << "Global value is " << global_value;
        sleep(1);        // leaves a window for Ctrl+C to re-enter this function
    }
}

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    signal(SIGINT, signalHandler);
    signalHandler(2);  // call it directly once; press Ctrl+C while it sleeps
                       // to re-enter it and watch global_value jump
    qDebug() << "The value is " << global_value;

    return a.exec();
}

 

Therefore, a reentrant function is one that can be interrupted in the middle of execution and whose internal data remains correct, under any circumstances, when execution resumes at the breakpoint; equivalently, it can safely be entered again before an earlier call has finished.

2.3 Thread safety

Computing has since entered the multithreaded era. Because threads are lightweight processes, the interrupts generated when switching between threads cause exactly the same problem.

As mentioned above, unlike processes, the threads of a process share that process's address space, and the context is saved before and after each switch. The question is: if the context is saved, why does the surrounding environment still change? Because the context saved at an interrupt covers only a small amount of state, such as the return address and registers; global variables, static variables, caches, and the like used inside the function are not protected. If those values change while the function is suspended, the results when it resumes at the breakpoint may be unpredictable.

Strictly speaking, reentrancy concerns a single flow of control entering the same function again before an earlier call has finished (for example, via an interrupt or signal handler), while thread safety concerns multiple threads calling into the same code at the same time.

Faced with these problems in accessing shared data, access clearly has to be serialized: thread A must complete steps 1, 2, and 3 as an atomic unit before thread B may perform the same steps. The usual way to achieve this is to add mutex locks, semaphores, and other thread-synchronization primitives.

3. Multi-threaded operations provided by Qt and their applicable scenarios

3.1 Thread class

First is the QThread class, the foundation of all the thread classes; it provides many low-level APIs for manipulating threads, and each QThread object represents one thread. There are two common ways to use it to run a piece of code in a new thread: (1) call QObject::moveToThread() to move a QObject into a newly started QThread object, so that the time-consuming work inside that QObject executes in the new thread; or (2) subclass QThread, override its run() function, and put the time-consuming code there. There is also the QThreadStorage class for keeping per-thread data; it is an auxiliary class, and whether you use it depends on the design of the product. For details, see "Qt Multi-threaded Programming: Knocking on the Door of QThread Class".

We said above that "there is a large time and space overhead in the creation, cancellation and switching of processes", which is why the lightweight thread emerged. To cut system overhead even further, people came up with the idea of not destroying a thread that has finished all its tasks, but keeping it in a "standby" state waiting for new time-consuming work to come in. Qt provides the QThreadPool and QRunnable classes to reuse threads this way: put the time-consuming work into the run() function of a QRunnable subclass, then hand the QRunnable object as a whole to a QThreadPool object. For details, see "Qt Multi-Threaded Programming: Reducing Thread Overhead".

To speed up everyday coding, we do not always want to reach for a low-level class like QThread. If every encounter with the "challenges of multithreaded data access" described above required adding a mutex by hand, the workload would grow quickly. Qt therefore provides the QtConcurrent module, whose many high-level functions handle common parallel-computing patterns; its biggest appeal is that very low-level operations such as mutex locking are already encapsulated. In addition, the QFuture, QFutureWatcher, and QFutureSynchronizer classes provide supporting operations. For details, see "Qt Advanced Functions for Multi-Threaded Programming".

3.2 Solve problems encountered when accessing shared resources

At this point we can easily start new threads for time-consuming work; next we must solve the core problems of multithreaded programming. There are two main ideas: (1) since simultaneous access is not feasible, let threads queue up and take turns in an orderly way, which is called "synchronizing threads"; (2) since the culprit behind the chaos is the interrupt mechanism, prevent the operation from being interrupted at all, which is called an "atomic operation".

Synchronizing threads means letting multiple threads handle the same variable in an orderly fashion, without rushing or crowding; sometimes thread A simply has to wait for thread B. The principle of forcing threads to wait for one another is called mutual exclusion, a common technique for protecting shared resources. QMutex is the basic class that provides it: thread A locks the mutex when it begins accessing a global variable, and until A finishes and unlocks, thread B cannot touch that variable. In addition, auxiliary classes such as QReadWriteLock, QSemaphore, and QWaitCondition improve multithreading efficiency. For details, see "Synchronizing Threads in Qt Multi-Threaded Programming".

The other way to tame "multithreaded data access" is the atomic operation: an operation that the thread-scheduling mechanism cannot interrupt; once started, it runs to completion with no switch in between, so it needs no thread synchronization at all. The machinery behind it is low-level, mostly implemented with special processor instructions, and the C language had no standard atomic operations at all until C11 added <stdatomic.h>. Qt provides classes for atomic operations, such as QAtomicInteger and QAtomicPointer. For details, see "Atomic Operations in Qt Multi-Threaded Programming".

3.3 Applicable scenarios for different thread classes

Although Qt provides three families of thread classes (see "3.1 Thread class" above), their applicable scenarios differ. In the following articles I will explain them one by one, and finally summarize them in a single article. For details, see "Where should these Qt multi-threaded classes be used".

Origin blog.csdn.net/weiweiqiao/article/details/133531155