[Operating Systems: Three Easy Pieces Introduction to Operating Systems] Concurrency
"Operating Systems: Three Easy Pieces" Chapter 26, Concurrency: Introduction
1. Thread: A single-threaded program has one execution point (a program counter holding the address of the next instruction to fetch), whereas a multi-threaded program has several, one per thread. Each thread has its own program counter and its own set of registers for computation. Switching between threads is similar to a context switch between processes: process state is saved in a process control block (PCB), and thread state is saved in a thread control block (TCB). Unlike a process switch, the address space stays the same.
4. Atomicity: An atomic update takes effect as a whole and cannot be interrupted partway through. A single instruction generally cannot perform a complete critical-section update, so we rely on a general set of synchronization primitives, together with help from the operating system, to access critical sections in a synchronized and controlled manner (for example, mutual exclusion primitives ensure that only one thread is in the critical section at a time).
The state of a single thread is very similar to that of a process. Each thread has a program counter (PC) that records where it fetches instructions from, and each thread has its own set of registers for computation. So if two threads run on one processor, a context switch must occur when switching from running one thread (T1) to running the other (T2).
Context switching between threads is similar to context switching between processes. For processes, we save the state to the process control block (PCB). Now we need one or more thread control blocks (TCBs) to save the state of each thread.
However, there is one major difference between a thread context switch and a process context switch: the address space remains the same (i.e., there is no need to switch the page table currently in use).
In a multi-threaded process, each thread runs independently and can of course call various routines to complete whatever work it is doing. Instead of having only one stack in the address space, each thread has a stack.
Example: thread creation
#include <stdio.h>
#include <assert.h>
#include <pthread.h>

void *mythread(void *arg) {
    printf("%s\n", (char *) arg);
    return NULL;
}

int main(int argc, char *argv[]) {
    pthread_t p1, p2;
    int rc;
    printf("main: begin\n");
    rc = pthread_create(&p1, NULL, mythread, "A"); assert(rc == 0);
    rc = pthread_create(&p2, NULL, mythread, "B"); assert(rc == 0);
    // join waits for the threads to finish
    rc = pthread_join(p1, NULL); assert(rc == 0);
    rc = pthread_join(p2, NULL); assert(rc == 0);
    printf("main: end\n");
    return 0;
}
Why It's Worse: Sharing Data
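Two threads increment the same shared counter, and the final result differs from run to run.
The reason is that operations on shared data are not guaranteed to be atomic: an increment is a load, an add, and a store, and a thread can be interrupted between any of those steps.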
The rest of the chapter is conceptual, so I won't repeat it here.
"Operating Systems: Three Easy Pieces" Chapter 27: Thread API
Introduction to the pthread library
Thread creation
#include <pthread.h>
int
pthread_create(
pthread_t * thread,
const pthread_attr_t * attr,
void * (*start_routine)(void*),
void * arg
);
- thread: a pointer to a structure of type pthread_t. We use this structure to interact with the thread, so we pass it to pthread_create() to initialize it. It acts as the thread's ID.
- attr: specifies any attributes this thread might have, such as the stack size or the thread's scheduling priority. Passing NULL selects the defaults.
- start_routine: a function pointer to the function the thread should run.
- arg: the argument passed to that function.
Thread completion
Call pthread_join() to block until a thread completes.
pthread_create(&p, NULL, mythread, (void *) 100);
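pthread_join(p, (void **) &m); // The first argument is of type pthread_t and specifies which thread to wait for.
// The second argument is a pointer to where the thread's return value should be stored.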
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include "common.h"
#include "common_threads.h"

void *mythread(void *arg) {
    printf("%s\n", (char *) arg);
    return NULL;
}

int main(int argc, char *argv[]) {
    if (argc != 1) {
        fprintf(stderr, "usage: main\n");
        exit(1);
    }
    pthread_t p1, p2;
    printf("main: begin\n");
    Pthread_create(&p1, NULL, mythread, "A");
    Pthread_create(&p2, NULL, mythread, "B");
    // join waits for the threads to finish
    Pthread_join(p1, NULL); // wait for thread p1; NULL means its return value is ignored
    Pthread_join(p2, NULL);
    printf("main: end\n");
    return 0;
}
Lock
int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);
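// For POSIX threads, there are two ways to initialize a lock
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; // option 1: static initializer
// or: int rc = pthread_mutex_init(&lock, NULL); // option 2: dynamic initialization at run time (commonly used)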
pthread_mutex_lock(&lock);
x = x + 1; // or whatever your critical section is
pthread_mutex_unlock(&lock);
The lock and unlock calls create a critical section.
If another thread holds the lock, the thread trying to acquire it will not return from the call (it blocks) until it has acquired the lock.
int pthread_mutex_trylock(pthread_mutex_t *mutex);
int pthread_mutex_timedlock(pthread_mutex_t *mutex,
struct timespec *abs_timeout);
These two calls are also used to acquire a lock. The trylock version does not block: it fails immediately if the lock is already held. The timedlock version returns after a timeout or after acquiring the lock, whichever happens first. Both versions should generally be avoided.
Condition Variables
A condition variable is not the same thing as a semaphore; a semaphore can be built from a condition variable plus a mutex (see this article).
Condition variables are useful when some kind of signaling must take place between threads, for example when one thread is waiting for another to do something before it can continue.
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
int pthread_cond_signal(pthread_cond_t *cond);
To use a condition variable, there must also be a lock associated with the condition. This lock should be held when calling either of the functions above.
The first function, pthread_cond_wait(), puts the calling thread to sleep, where it waits for some other thread to signal it, usually when something in the program has changed that the now-sleeping thread might care about. Typical usage looks like this:
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
pthread_mutex_lock(&lock);
while (ready == 0)
    pthread_cond_wait(&cond, &lock);
pthread_mutex_unlock(&lock);
The code that wakes the thread runs in some other thread, and the corresponding lock must be held when calling pthread_cond_signal(). Like this:
pthread_mutex_lock(&lock);
ready = 1;
pthread_cond_signal(&cond);
pthread_mutex_unlock(&lock);
pthread_cond_wait() takes the lock as its second parameter because it implicitly releases the lock when it puts the calling thread to sleep, so that the signaling thread can acquire the lock and wake the sleeper. Before pthread_cond_wait() returns after a wakeup, it re-acquires the lock.
In this example, the waiting thread re-checks the condition in a while loop: it decides whether to proceed by testing ready, not by the mere fact of having been woken. It is safer to treat a wakeup as a hint that something might have changed rather than as an absolute fact.
Compiling and running
To compile, the code must include the header file pthread.h. Linking requires the pthread library, so add the -pthread flag.
prompt> gcc -o main main.c -Wall -pthread
Summary
Supplement: Thread API Guide
When you're building multithreaded programs using the POSIX threading library (or, indeed, any threading library), there are a few small but important things to keep in mind:
- Keep it simple. Above all else, any code that locks or signals between threads should be as simple as possible. Complex thread interactions are prone to bugs.
- Keep thread interactions to a minimum. Every interaction should be carefully thought out and built with tried and true approaches (many of which are covered in later chapters).
- Initialize locks and condition variables. Code that fails to do so sometimes works fine and sometimes fails in very strange ways.
- Check return codes. Of course, any C and UNIX program should check every return code, and the same holds here. Failing to do so leads to odd and hard-to-understand behavior, making you scream or pull your hair out in pain.
- Be careful with how you pass arguments to, and return values from, threads. In particular, passing a reference to a variable allocated on the stack is probably a mistake.
- Each thread has its own stack. Related to the previous point, remember that each thread has its own stack. Variables allocated locally inside a function a thread is running are therefore essentially private to that thread, and other threads should not access them. To share data between threads, the values must live on the heap or in some other globally accessible location.
- Always use condition variables to signal between threads. Do not use a simple flag variable for synchronization.
- Use the manual pages. On Linux in particular, the pthread man pages are detailed and informative. Read them carefully!