Multi-threaded programming notes

1. Thread concept

A thread, sometimes called a lightweight process (LWP), is the smallest unit of program execution flow. A standard thread consists of a thread ID, the current instruction pointer (PC), a register set, and a stack. A thread is an entity within a process and is the basic unit that the system schedules and dispatches independently. A thread owns no system resources of its own beyond the few that are essential for it to run, but it shares all the resources owned by the process with the other threads of that process. One thread can create and cancel another, and multiple threads in the same process can execute concurrently. Because threads constrain one another, their execution is intermittent. A thread has three basic states: ready, blocked, and running. In the ready state the thread has everything it needs to run and is waiting for a processor; in the running state the thread occupies a processor and is executing; in the blocked state the thread is waiting for an event (such as a semaphore) and logically cannot run. Every program has at least one thread. If the program has only one thread, that thread is the program itself.

A thread is a single sequential flow of control within a program. It is a relatively independent, schedulable execution unit inside a process, and the basic unit the system uses to schedule and dispatch the CPU. Running multiple threads at the same time in a single program to complete different tasks is called multithreading.

2. Create a thread

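int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);

pthread_create returns 0 on success. On failure it returns a positive error number (one of the values errno can take) instead of setting errno.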
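As a minimal sketch of thread creation and its error convention, the following starts one thread and checks the value returned by pthread_create. The names double_it and run_in_thread are made up for this illustration.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Start routine: receives the pointer passed as the last argument of
 * pthread_create. Here it doubles the integer it points to. */
static void *double_it(void *arg)
{
    int *value = arg;
    *value *= 2;
    return NULL;
}

/* Hypothetical helper: runs double_it in a new thread and returns the
 * doubled value, or -1 if the thread could not be created. */
int run_in_thread(int input)
{
    pthread_t tid;
    int value = input;

    int err = pthread_create(&tid, NULL, double_it, &value);
    if (err != 0) {
        /* The error number is the return value; errno is not set. */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return -1;
    }
    pthread_join(tid, NULL);  /* wait, so that value is safe to read */
    return value;
}
```

Calling run_in_thread(21) from main should yield 42 when the thread is created successfully.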

3. Terminate the thread

void pthread_exit(void *retval);

Calling pthread_exit is similar to returning from the start function. The difference is that pthread_exit can be called from any function that the start function calls, directly or indirectly. If the main thread calls pthread_exit instead of calling exit or executing return, the other threads keep running normally. The parameter retval sets the return value of the exiting thread.

int pthread_cancel(pthread_t thread);

By default, pthread_cancel only sends a cancellation request to the thread. The thread actually stops only when it reaches a cancellation point, which is typically a call to a function such as sleep or read.

A thread can disable cancellation for itself with the following function:

int pthread_setcancelstate(int state, int *oldstate);

Among them, state can be the following values:

PTHREAD_CANCEL_ENABLE
PTHREAD_CANCEL_DISABLE

By default, state is PTHREAD_CANCEL_ENABLE, which means the thread can be cancelled. If state is set to PTHREAD_CANCEL_DISABLE, the thread cannot be cancelled; a cancellation request sent in the meantime stays pending until cancellation is enabled again.

A thread can also be made to exit immediately, without waiting to reach a cancellation point. To do so, set the thread's cancellation type to asynchronous by calling the following function inside the thread:

int pthread_setcanceltype(int type, int *oldtype);

Among them, the value of type is as follows:

PTHREAD_CANCEL_ASYNCHRONOUS
PTHREAD_CANCEL_DEFERRED

The default cancellation type is PTHREAD_CANCEL_DEFERRED. If you want the thread to exit immediately after receiving the cancellation request, set type to PTHREAD_CANCEL_ASYNCHRONOUS.
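The interaction between the cancel state and cancellation points can be sketched as follows. This is only an illustration under the default deferred type; cancellable_worker, cancel_demo and critical_done are hypothetical names. The worker protects a short region by disabling cancellation, then loops over cancellation points until the request takes effect.

```c
#include <pthread.h>
#include <unistd.h>

static int critical_done;  /* written by the worker, read after the join */

static void *cancellable_worker(void *arg)
{
    int old_state, ignored;
    (void)arg;

    /* A cancellation request that arrives in this region stays pending. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    critical_done = 1;
    pthread_setcancelstate(old_state, &ignored);

    /* Deferred cancellation: the thread stops at a cancellation point.
     * sleep() is one; pthread_testcancel() is an explicit one. */
    for (;;) {
        pthread_testcancel();
        sleep(1);
    }
}

/* Returns 1 if the worker finished its protected region and was then
 * cancelled, -1 if the thread could not be created. */
int cancel_demo(void)
{
    pthread_t tid;
    void *retval = NULL;

    if (pthread_create(&tid, NULL, cancellable_worker, NULL) != 0)
        return -1;
    pthread_cancel(tid);         /* only requests cancellation */
    pthread_join(tid, &retval);  /* wait until it has taken effect */

    /* A cancelled thread's exit value is PTHREAD_CANCELED. */
    return critical_done == 1 && retval == PTHREAD_CANCELED;
}
```

Because the request stays pending while cancellation is disabled, the protected assignment always completes, no matter how early pthread_cancel is called.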

4. Thread ID

Each thread within the process has a unique identifier called the thread ID. The thread ID will be returned to the caller of pthread_create, and a thread can obtain its own thread ID through the pthread_self() function.

The type of thread ID is pthread_t. Linux defines pthread_t as an unsigned long, but in other implementations, it may be a pointer or structure. When you need to compare whether two thread IDs are the same, the function pthread_equal is used.

int pthread_equal(pthread_t t1, pthread_t t2);

It returns a non-zero value if the two thread IDs are equal, and 0 otherwise.

As shown in the figure above, the SPID we see is not the thread ID (pthread_t) described above. The pthread_t thread ID is allocated and maintained by the thread library implementation. The SPID is allocated by the kernel, much like a process ID, and can be obtained with the system call gettid().
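A small sketch of comparing thread IDs portably: because pthread_t may be a structure on some systems, the comparison goes through pthread_equal rather than ==. The names is_main_thread and compare_ids are invented for this example.

```c
#include <pthread.h>

/* Runs in the new thread. arg points to the creator's thread ID.
 * Returns non-NULL only if the calling thread has that same ID. */
static void *is_main_thread(void *arg)
{
    const pthread_t *creator_id = arg;
    return pthread_equal(pthread_self(), *creator_id) ? arg : NULL;
}

/* Returns 1 if pthread_equal reports the caller's ID as equal to itself
 * and as different from the new thread's ID, -1 on creation failure. */
int compare_ids(void)
{
    pthread_t self = pthread_self();
    pthread_t tid;
    void *same_in_worker = NULL;

    if (pthread_create(&tid, NULL, is_main_thread, &self) != 0)
        return -1;
    pthread_join(tid, &same_in_worker);

    int same_in_caller = pthread_equal(self, pthread_self()) != 0;
    return same_in_caller && same_in_worker == NULL;
}
```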

5. Connecting (joining) terminated threads

int pthread_join(pthread_t thread, void **retval);

The function pthread_join() waits for the thread identified by thread to terminate. (If the thread has already terminated, pthread_join() returns immediately.) This operation is called joining. If retval is a non-null pointer, the thread's return value is copied into *retval. The return value is the value the thread passed to return or to pthread_exit().

If a thread has not been detached (pthread_detach()), you must join it with pthread_join(). If it is never joined, it becomes a zombie thread when it terminates. As with zombie processes, this wastes system resources, and if too many zombie threads accumulate, the application can no longer create new threads.

pthread_join() is similar to the waitpid() call for processes, but threads are peers: any thread in the process can call pthread_join() to join any other thread in the process.
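The following sketch shows one common way to hand a result back through pthread_join: the thread allocates the result on the heap and passes the pointer to pthread_exit, and the joining thread frees it. The names sum_to_n and join_and_collect are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Computes 1 + 2 + ... + n and returns the result in heap memory,
 * because the thread's own stack disappears when it terminates. */
static void *sum_to_n(void *arg)
{
    int n = *(int *)arg;
    long *result = malloc(sizeof *result);
    if (result == NULL)
        return NULL;
    *result = 0;
    for (int i = 1; i <= n; i++)
        *result += i;
    pthread_exit(result);  /* same effect here as: return result; */
}

/* Returns the sum computed by the thread, or -1 on any failure. */
long join_and_collect(int n)
{
    pthread_t tid;
    void *retval = NULL;

    if (pthread_create(&tid, NULL, sum_to_n, &n) != 0)
        return -1;
    if (pthread_join(tid, &retval) != 0 || retval == NULL)
        return -1;

    long sum = *(long *)retval;
    free(retval);  /* the joining thread owns the result now */
    return sum;
}
```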

6. Detaching a thread

int pthread_detach(pthread_t thread);

If you do not care about a thread's return status and only want the system to clean it up automatically when it terminates, call pthread_detach() to mark the thread as detached. Once a thread is detached, pthread_join() can no longer be used to obtain its status, and the thread cannot be made joinable again. When any thread calls exit(), or the main thread executes a return statement, all threads of the process are terminated, whether they are joinable or detached.
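Since a detached thread cannot be joined, the creator needs some other way to learn that the work is finished. The sketch below uses a mutex and a condition variable for that, which is one possible choice and not something these notes prescribe; background_task and run_detached are made-up names.

```c
#include <pthread.h>

static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cond = PTHREAD_COND_INITIALIZER;
static int done;

static void *background_task(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&done_lock);
    done = 1;                        /* publish completion */
    pthread_cond_signal(&done_cond);
    pthread_mutex_unlock(&done_lock);
    return NULL;  /* the system reclaims the thread; nobody joins it */
}

/* Returns 1 once the detached thread has reported completion,
 * -1 if the thread could not be created or detached. */
int run_detached(void)
{
    pthread_t tid;
    int finished;

    if (pthread_create(&tid, NULL, background_task, NULL) != 0)
        return -1;
    if (pthread_detach(tid) != 0)
        return -1;

    /* pthread_join is not allowed any more, so wait on the condition. */
    pthread_mutex_lock(&done_lock);
    while (!done)
        pthread_cond_wait(&done_cond, &done_lock);
    finished = done;
    pthread_mutex_unlock(&done_lock);
    return finished;
}
```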

7. Thread cleanup function

void pthread_cleanup_push(void (*rtn)(void*), void *arg);
void pthread_cleanup_pop(int execute);

On Linux, pthread_cleanup_push and pthread_cleanup_pop are implemented as macros. pthread_cleanup_push expands to code that begins with a left curly brace {, and pthread_cleanup_pop expands to code that ends with a right curly brace }. The two calls must therefore appear in matching pairs in the same scope so that the braces balance; otherwise a compilation error occurs.

The thread cleanup function is called in three situations:

The thread is cancelled by pthread_cancel before it executes pthread_cleanup_pop.
The thread terminates itself by calling pthread_exit before it executes pthread_cleanup_pop.
The thread executes pthread_cleanup_pop with a non-zero argument.

Note: If the thread returns with return before executing pthread_cleanup_pop, the cleanup function is not executed.
Refer to https://blog.csdn.net/q1007729991/article/details/60751394
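Two of the three situations above (exit through pthread_exit, and pthread_cleanup_pop with a non-zero argument) can be exercised with a short sketch. The names release_buffer, cleanup_worker and count_cleanup_calls are hypothetical, and the buffer only stands in for any resource that needs releasing.

```c
#include <pthread.h>
#include <stdlib.h>

static int cleanup_calls;  /* how many times the handler has run */

/* Cleanup handler: releases the buffer registered with it. */
static void release_buffer(void *arg)
{
    free(arg);
    cleanup_calls++;
}

static void *cleanup_worker(void *arg)
{
    int use_pthread_exit = *(int *)arg;
    void *buf = malloc(64);

    pthread_cleanup_push(release_buffer, buf);
    if (use_pthread_exit)
        pthread_exit(NULL);   /* handler runs: exit before the pop */
    pthread_cleanup_pop(1);   /* handler runs: non-zero argument */
    return NULL;
}

/* Runs the worker once per exit path and returns the number of times
 * the cleanup handler ran, or -1 if a thread could not be created. */
int count_cleanup_calls(void)
{
    cleanup_calls = 0;
    for (int mode = 0; mode <= 1; mode++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, cleanup_worker, &mode) != 0)
            return -1;
        pthread_join(tid, NULL);  /* join before mode changes */
    }
    return cleanup_calls;
}
```

Both paths release the buffer exactly once, so count_cleanup_calls() is expected to return 2.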

7. Mutex attributes

The mutex used for mutual exclusion between threads has its own attribute object, pthread_mutexattr_t. Only three aspects are discussed here:

Shared attributes
Robust properties
Recursive types of mutexes

The data type of the mutex attribute object is pthread_mutexattr_t. The following two functions initialize and destroy it.

int pthread_mutexattr_init(pthread_mutexattr_t *attr);
int pthread_mutexattr_destroy(pthread_mutexattr_t *attr);

1. Shared attributes

Besides mutexes, other thread synchronization objects such as read-write locks, spin locks, condition variables, and barriers also have a shared attribute.

The attribute has two possible values:

PTHREAD_PROCESS_PRIVATE: This is the default. The mutex can only be used by threads inside the process that created it.
PTHREAD_PROCESS_SHARED: The mutex can be used by threads of different processes.

Functions to set and get the shared attribute:

int pthread_mutexattr_setpshared(pthread_mutexattr_t *mattr, int pshared);
int pthread_mutexattr_getpshared(const pthread_mutexattr_t *mattr, int *pshared);
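A minimal sketch of reading and changing the shared attribute follows; pshared_round_trip is a made-up name. It only manipulates the attribute object. To actually share a mutex between processes, the mutex itself must also live in memory that both processes map, which this sketch does not set up.

```c
#include <pthread.h>

/* Returns 1 if the default value is PTHREAD_PROCESS_PRIVATE and the
 * attribute reads back as PTHREAD_PROCESS_SHARED after being set,
 * 0 otherwise, and -1 if the attribute object cannot be initialized. */
int pshared_round_trip(void)
{
    pthread_mutexattr_t attr;
    int pshared = -1;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;

    /* The getter writes the current value through its second argument. */
    pthread_mutexattr_getpshared(&attr, &pshared);
    int default_is_private = (pshared == PTHREAD_PROCESS_PRIVATE);

    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_getpshared(&attr, &pshared);

    pthread_mutexattr_destroy(&attr);
    return default_is_private && pshared == PTHREAD_PROCESS_SHARED;
}
```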

2. Robust properties

If a process terminates while holding a mutex, other processes or threads can never acquire the lock and will deadlock. To let the mutex be recovered when its owner terminates abnormally, specify the robust attribute.

int pthread_mutexattr_getrobust(const pthread_mutexattr_t *attr, int *restrict robust);
int pthread_mutexattr_setrobust(pthread_mutexattr_t *attr, int robust);

This attribute has two values:

PTHREAD_MUTEX_STALLED: This is the default. No special action is taken when the process holding the mutex terminates. If the lock was not released, other processes or threads can never acquire it.
PTHREAD_MUTEX_ROBUST: If the owner terminates without releasing the lock, the next process or thread to lock the mutex acquires it, and pthread_mutex_lock returns EOWNERDEAD.

With the robust attribute set, if a process exits without releasing the lock, another process can still acquire it. In that case pthread_mutex_lock returns EOWNERDEAD to tell the new owner that the previous owner died and that the mutex is now in an inconsistent state. The new owner must then mark the mutex consistent; otherwise, once the mutex is unlocked, it becomes permanently unusable.
In code, this looks like:

if (EOWNERDEAD == pthread_mutex_lock(&lock)) {
  pthread_mutex_consistent(&lock);
}
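For a fuller picture, here is a sketch of the whole recovery sequence inside a single process, assuming a Linux/glibc system, where a thread that terminates while holding a robust mutex is treated as a dead owner. The names robust_lock, die_holding_lock and recover_robust_mutex are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t robust_lock;

/* Locks the mutex and terminates without unlocking it. */
static void *die_holding_lock(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&robust_lock);
    return NULL;
}

/* Returns 1 if the lock reported EOWNERDEAD and was usable again after
 * being marked consistent, 0 otherwise, -1 on thread creation failure. */
int recover_robust_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_t tid;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&robust_lock, &attr);
    pthread_mutexattr_destroy(&attr);

    if (pthread_create(&tid, NULL, die_holding_lock, NULL) != 0)
        return -1;
    pthread_join(tid, NULL);  /* the owner is gone, the lock is still held */

    int rc = pthread_mutex_lock(&robust_lock);
    if (rc == EOWNERDEAD)
        pthread_mutex_consistent(&robust_lock);  /* we own the lock now */
    pthread_mutex_unlock(&robust_lock);

    /* After the repair the mutex behaves like an ordinary one again. */
    int usable = (pthread_mutex_lock(&robust_lock) == 0);
    if (usable)
        pthread_mutex_unlock(&robust_lock);
    pthread_mutex_destroy(&robust_lock);

    return rc == EOWNERDEAD && usable;
}
```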

3. The recursive type of mutex

There are usually four types of mutexes:

PTHREAD_MUTEX_NORMAL
PTHREAD_MUTEX_ERRORCHECK
PTHREAD_MUTEX_RECURSIVE
PTHREAD_MUTEX_DEFAULT

You can use the following functions to set and get the type attributes of the mutex:

int pthread_mutexattr_gettype(const pthread_mutexattr_t *attr, int *type);
int pthread_mutexattr_settype(pthread_mutexattr_t *attr, int type);

Normally, locking the same mutex twice in the same thread causes a deadlock. This does not happen if the mutex type attribute is set to the recursive type PTHREAD_MUTEX_RECURSIVE.

A recursive mutex maintains an internal counter. When the mutex is unlocked, the counter is 0, and a thread can acquire the lock only when the counter is 0. Only the thread that holds the lock can lock the mutex again. Each lock increments the counter by 1 and each unlock decrements it by 1.
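The counter behaviour described above can be sketched by locking a recursive mutex twice from the same thread. lock_twice_recursively is a made-up name, and the counter values in the comments describe the conceptual model from the text, not a field that can be inspected.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

/* Returns the number of lock/unlock calls that failed (0 is expected). */
int lock_twice_recursively(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;
    int failures = 0;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&lock, &attr);
    pthread_mutexattr_destroy(&attr);

    failures += pthread_mutex_lock(&lock) != 0;    /* counter 0 -> 1 */
    failures += pthread_mutex_lock(&lock) != 0;    /* 1 -> 2, no deadlock */
    failures += pthread_mutex_unlock(&lock) != 0;  /* 2 -> 1, still held */
    failures += pthread_mutex_unlock(&lock) != 0;  /* 1 -> 0, released */

    pthread_mutex_destroy(&lock);
    return failures;
}
```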

8. pthread_once

int pthread_once(pthread_once_t *initflag, void (*initfn)(void));

The initflag parameter points to a global object of type pthread_once_t that is initialized to PTHREAD_ONCE_INIT.
initfn is an initialization function. pthread_once can be called from multiple threads, but initfn is executed only once.
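As a sketch of the "called many times, executed once" behaviour, four threads below all call pthread_once with the same flag. The names table_once, init_table, use_table and count_init_calls are hypothetical.

```c
#include <pthread.h>

static pthread_once_t table_once = PTHREAD_ONCE_INIT;
static int init_calls;  /* how many times init_table actually ran */

static void init_table(void)
{
    init_calls++;
}

static void *use_table(void *arg)
{
    (void)arg;
    /* Every thread asks for initialization; only the first call runs it. */
    pthread_once(&table_once, init_table);
    return NULL;
}

/* Starts up to four threads and returns how often init_table ran. */
int count_init_calls(void)
{
    pthread_t tids[4];
    int started = 0;

    while (started < 4 &&
           pthread_create(&tids[started], NULL, use_table, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return init_calls;
}
```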

9. Thread-private variables

The system provides each thread with a private "container" that stores key-value pairs. You can look up a value by its key, and you can store a value in the container. pthreads fixes both the key type and the value type.

The key type is pthread_key_t. A key can only be initialized with the function pthread_key_create, which is defined as follows:

int pthread_key_create(pthread_key_t *key, void (*destructor)(void*));

The first parameter receives the new key. Once created, the key can be used by every thread.

The second parameter is a destructor. When a thread ends (by return or pthread_exit) and its value for the key is not NULL, the destructor is called automatically with that value as its argument.

Through the pthread_key_delete function, the specified key can be deleted, but the destructor will not be called:

int pthread_key_delete(pthread_key_t key);

The value type in the thread container must be void*. You can save the key-value pair in the thread's own container through the following function.

int pthread_setspecific(pthread_key_t key, const void *value);

When different threads call the function above, each one stores the key-value pair only in its own container. This is why the key-value pair is called thread-private data.

You can get the corresponding value based on the key through the following function:

void* pthread_getspecific(pthread_key_t key);

Note that the returned value still has type void*.

10. errno variable and multithreading

errno has traditionally been an integer variable that receives an error code when a call to a system function fails. The errno variable is thread-safe.

In fact, errno is not a true variable. It is a macro that expands to an expression which calls a function, and that function returns a pointer to the calling thread's errno value. The implementation of errno in glibc's bits/errno.h defines it as a macro of the form (*__errno_location ()).

We can also implement errno through thread private variables and pthread_once:

// myerrno.c
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <pthread.h>

// myerrno is really just a macro
#define myerrno (*_myerrno())

pthread_key_t key;
pthread_once_t init_done = PTHREAD_ONCE_INIT;

// Initialize the key; run exactly once through pthread_once
void thread_init(void) {
  puts("I'm thread_init");
  pthread_key_create(&key, free); // free is registered as the destructor
}

// Return the address of the calling thread's own myerrno
int *_myerrno(void) {
  int *p;
  pthread_once(&init_done, thread_init);
  // A NULL value for the key means this thread has not allocated memory yet
  p = (int*) pthread_getspecific(key);
  if (p == NULL) {
    p = (int*)malloc(sizeof(int));
    if (p == NULL) {
      abort(); // out of memory: there is no valid address to return
    }
    *p = 0;
    pthread_setspecific(key, (void*)p);
  }
  return p;
}

void* fun1(void *arg) {
  (void)arg;
  errno = 5;
  myerrno = 5; // this line expands to (*_myerrno()) = 5
  sleep(1);
  // the myerrno in the printf arguments expands to (*_myerrno())
  printf("fun1: errno = %d, myerrno = %d\n", errno, myerrno);
  return NULL;
}

void* fun2(void *arg) {
  (void)arg;
  errno = 10;
  myerrno = 10;  // this line expands to (*_myerrno()) = 10
  printf("fun2: errno = %d, myerrno = %d\n", errno, myerrno);
  return NULL;
}

int main(void) {
  pthread_t tid1, tid2;
  pthread_create(&tid1, NULL, fun1, NULL);
  pthread_create(&tid2, NULL, fun2, NULL);
  pthread_join(tid1, NULL);
  pthread_join(tid2, NULL);
  return 0;
}

11. Signals and multithreading

In a multithreaded program, each thread has its own signal mask (its set of blocked signals) and its own set of pending signals. A newly created thread inherits the signal mask of the thread that created it, but not the pending set: the new thread starts with an empty pending signal set.

In a multithreaded program, use pthread_sigmask rather than sigprocmask to change a thread's signal mask. It is defined as follows:

int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset);

This function is used in the same way as sigprocmask.

The how parameter

SIG_BLOCK: Add the signals in the set pointed to by set to the thread's signal mask.
SIG_UNBLOCK: The opposite of SIG_BLOCK. Remove the signals in set from the thread's signal mask.
SIG_SETMASK: Replace the thread's signal mask with set.

The set parameter

The set of signals you specify.

The oldset parameter

Receives the previous signal mask.

Return value

0 means success. On failure, pthread_sigmask returns an error number (unlike sigprocmask, which returns -1 and sets errno).

Function to get pending signals

int sigpending(sigset_t *set);

The kill function can only send a signal to a process as a whole. To send a signal to a specific thread, use pthread_kill. Its prototype is as follows:

int pthread_kill(pthread_t thread, int sig);

If a thread calls sigwait, it waits until one of the signals it specifies appears in the pending signal set. sigwait then removes that signal from the pending set and returns. The signals being waited for should be blocked before sigwait is called. If several threads call sigwait on the same signal, only one of them returns from sigwait.

int sigwait(const sigset_t *set, int *sig);

The parameter set specifies which signals to wait for. When sigwait returns, the signal has been removed from the pending set and its number is stored in the memory pointed to by sig.
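Putting pthread_sigmask, pthread_kill and sigwait together, the sketch below blocks SIGUSR1, starts a thread that waits for it, and sends the signal to that thread only. The choice of SIGUSR1 and the names wait_for_signal and deliver_to_thread are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdint.h>

/* Waits for one of the signals in *arg and returns its number. */
static void *wait_for_signal(void *arg)
{
    const sigset_t *set = arg;
    int sig = 0;

    if (sigwait(set, &sig) != 0)
        sig = -1;
    return (void *)(intptr_t)sig;
}

/* Returns the signal number the thread received, or -1 on failure. */
int deliver_to_thread(void)
{
    sigset_t set, old;
    pthread_t tid;
    void *result = NULL;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    /* Block first: the new thread inherits this mask, so the signal
     * stays pending for it until sigwait picks it up. */
    pthread_sigmask(SIG_BLOCK, &set, &old);

    if (pthread_create(&tid, NULL, wait_for_signal, &set) != 0) {
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    pthread_kill(tid, SIGUSR1);  /* directed at this one thread */
    pthread_join(tid, &result);

    pthread_sigmask(SIG_SETMASK, &old, NULL);  /* restore the old mask */
    return (int)(intptr_t)result;
}
```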

12. Multi-process and multi-thread

Using fork in a multithreaded program can cause some surprises:

The child process contains only one thread, a copy of the thread that called fork in the parent. In a multithreaded program this means the other threads "evaporate": they simply do not exist in the child.
Because those threads have evaporated, any locks they held are never released in the child, so the child deadlocks when it tries to acquire one of them.

Solutions:

1. Acquire the lock before fork()

Acquire the lock before calling fork(), then release it after fork returns. The code looks like this:

pthread_mutex_lock(&lock);
pid = fork();
pthread_mutex_unlock(&lock);

2. pthread_atfork function

int pthread_atfork(void (*prepare)(void), void (*parent)(void), void (*child)(void));

The three callbacks are the prepare, parent, and child functions. They are called at the following times:

prepare is called in the parent just before fork creates the child process.
parent and child are called after the child process has been created.
parent is called in the parent process, and child is called in the child process.

The following pseudocode illustrates when each one is called:

prepare();
pid = fork();

if (pid > 0) {
  parent();
}
else if (pid == 0) {
  child();
}
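A runnable sketch of the same idea with real handlers: prepare takes a lock, and both parent and child release it, so the child starts with the lock in a known, unlocked state. All names here (state_lock, before_fork, after_fork_parent, after_fork_child, child_can_lock) are hypothetical, and registering the handlers through pthread_once is just one way to make sure they are installed only once.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t atfork_once = PTHREAD_ONCE_INIT;

static void before_fork(void)       { pthread_mutex_lock(&state_lock); }
static void after_fork_parent(void) { pthread_mutex_unlock(&state_lock); }
static void after_fork_child(void)  { pthread_mutex_unlock(&state_lock); }

/* Handlers accumulate, so register them only once per process. */
static void register_handlers(void)
{
    pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Forks and returns 1 if the child was able to take state_lock,
 * 0 if it was not, and -1 if fork or waitpid failed. */
int child_can_lock(void)
{
    int status;

    pthread_once(&atfork_once, register_handlers);

    pid_t pid = fork();  /* before_fork runs first, then the fork itself */
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: after_fork_child has already released the lock. */
        int rc = pthread_mutex_trylock(&state_lock);
        _exit(rc == 0 ? 0 : 1);
    }
    /* Parent: after_fork_parent has already released the lock. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```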

Reference https://blog.csdn.net/q1007729991