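Thread Synchronization (Linux)

The "synchronization" in thread synchronization does not mean running in lockstep or at the same pace. It is about ordering: which thread runs first and which runs after. Because the scheduler can switch between threads at any point, synchronization matters most when several threads access a shared resource, and it is one of the harder parts of multithreaded programming.

On Linux, the main mechanisms for thread synchronization are mutexes, read-write locks, condition variables, and semaphores.

1. Mutexes

The mutex (mutual-exclusion lock) is the most common thread synchronization mechanism. In practice it is simply a variable of type pthread_mutex_t. The related functions are:

(1) Initialize a mutex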
int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *mutexattr);

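(2) Destroy a mutex
int pthread_mutex_destroy(pthread_mutex_t *mutex);

(3) Try to lock
int pthread_mutex_trylock(pthread_mutex_t *mutex);
If mutex is not locked, the calling thread locks it and the function returns 0.
If mutex is already locked, the call fails without blocking and returns an error number (EBUSY); strerror() can be used to print the message.

(4) Lock
int pthread_mutex_lock(pthread_mutex_t *mutex);
If mutex is not locked, the calling thread locks it.
If mutex is already locked, the calling thread blocks at this point until the lock is released (by another thread).

(5) Unlock
int pthread_mutex_unlock(pthread_mutex_t *mutex);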
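
To make the difference between blocking and non-blocking locking concrete, here is a minimal sketch of pthread_mutex_trylock() reporting a busy lock through strerror(). The names demo_mutex, try_enter() and leave() are invented for this illustration and are not part of the pthread API.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Statically initialized mutex of the default type (name is illustrative). */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: returns 0 if the lock was taken, otherwise the
   error number returned by pthread_mutex_trylock(). It never blocks. */
int try_enter(void)
{
    int err = pthread_mutex_trylock(&demo_mutex);
    if (err != 0)
        fprintf(stderr, "trylock failed: %s\n", strerror(err));
    return err;
}

/* Hypothetical helper: releases the lock taken by try_enter(). */
void leave(void)
{
    pthread_mutex_unlock(&demo_mutex);
}
```

A caller that gets a nonzero result can do other work and retry later instead of sleeping inside pthread_mutex_lock().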

Note: when a mutex is used for synchronization, every thread must lock it before accessing the shared resource. Example:

#include <pthread.h>

//Defining a mutex 
pthread_mutex_t mutex;

void* fooA(void* n)
{
    pthread_mutex_lock(&mutex);     //lock

    //Operating shared resources 
    //...

    pthread_mutex_unlock(&mutex);   //unlock
    return NULL;
}

void* fooB(void* n)
{
    pthread_mutex_lock(&mutex);     //lock

    //Operating shared resources 
    //...

    pthread_mutex_unlock(&mutex);   //unlock
    return NULL;
}


int main(void)
{
    pthread_t p[2];

    //Initialization mutex 
    pthread_mutex_init(&mutex, NULL);

    pthread_create(&p[0], NULL, fooA, NULL);
    pthread_create(&p[1], NULL, fooB, NULL);

    pthread_join(p[0], NULL);
    pthread_join(p[1], NULL);

    //Destroy mutex 
    pthread_mutex_destroy(&mutex);

    return 0;
}

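In principle threads run concurrently, but once a lock protects the shared resource, access to that resource is serialized, which clearly costs some performance.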
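
As a concrete case of this serialization, the sketch below has several threads increment one shared counter, taking the mutex around every increment. The names (counter, add_many, run_counter_demo) and the constants are assumptions made for this example only.

```c
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

static long counter = 0;    /* the shared resource */
static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Each worker adds INCREMENTS to the counter, one locked step at a time. */
static void *add_many(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; ++i) {
        pthread_mutex_lock(&counter_mutex);
        ++counter;                          /* critical section */
        pthread_mutex_unlock(&counter_mutex);
    }
    return NULL;
}

/* Runs NUM_THREADS workers to completion and returns the final counter. */
long run_counter_demo(void)
{
    pthread_t t[NUM_THREADS];

    counter = 0;
    for (int i = 0; i < NUM_THREADS; ++i)
        pthread_create(&t[i], NULL, add_many, NULL);
    for (int i = 0; i < NUM_THREADS; ++i)
        pthread_join(t[i], NULL);
    return counter;
}
```

With the lock in place the result is always NUM_THREADS * INCREMENTS. Without the lock/unlock pair the increments from different threads can overlap and some updates may be lost.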
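In addition, the classic bug when synchronizing with mutexes is deadlock. Once a deadlock occurs, the threads involved neither exit nor make progress. Typical causes are:
(1) A thread locking a mutex it already holds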

void* fooA(void* n)
{
    pthread_mutex_lock(&mutex);     //lock
    pthread_mutex_lock(&mutex);     //lock again: the thread deadlocks on itself

    //Operating shared resources 
    //...

    pthread_mutex_unlock(&mutex);   //unlock
    return NULL;
}

Similarly, if thread a locks the mutex and exits without unlocking it, thread b deadlocks as soon as it tries to take the same lock.
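
One way to catch the self-locking mistake during development is a mutex of type PTHREAD_MUTEX_ERRORCHECK, for which a second lock by the owning thread returns EDEADLK instead of hanging. The sketch below is only an illustration; the function name relock_errorcheck() is invented here.

```c
#define _XOPEN_SOURCE 700   /* exposes PTHREAD_MUTEX_ERRORCHECK under strict -std= modes */
#include <errno.h>
#include <pthread.h>

/* Locks an error-checking mutex twice from the same thread and
   returns the result of the second pthread_mutex_lock() call. */
int relock_errorcheck(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int second;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);
    second = pthread_mutex_lock(&m);    /* reports an error rather than blocking */

    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;
}
```

Error-checking mutexes do a little extra bookkeeping per call, so a common choice is to use them while debugging and the default type otherwise.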

(2) Threads waiting for each other to unlock
Expressed in code, the situation looks like this:

pthread_mutex_t mutexA;
pthread_mutex_t mutexB;

resources_t A;
resources_t B;

void* foo1(void* n)
{
    pthread_mutex_lock(&mutexA);    
    //Operating shared resources A
    //...

    pthread_mutex_lock(&mutexB);        //deadlock can occur here
    //Operating shared resources B
    //...

    pthread_mutex_unlock(&mutexB);
    pthread_mutex_unlock(&mutexA);

    return NULL;
}

void* foo2(void* n)
{
    pthread_mutex_lock(&mutexB);    
    //Operating shared resources B
    //...

    pthread_mutex_lock(&mutexA);        //deadlock can occur here
    //Operating shared resources A
    //...

    pthread_mutex_unlock(&mutexA);
    pthread_mutex_unlock(&mutexB);

    return NULL;
}

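Thread 1 locks shared resource A and holds lock A, while thread 2 locks shared resource B and holds lock B.
Thread 1 then needs resource B and blocks waiting for lock B, while thread 2 needs resource A and blocks waiting for lock A. Neither can ever proceed.

Ways to avoid this:
(1) Have every thread acquire the locks in the same fixed order.
(2) Release the lock you hold before trying to take another one.
(3) Take the second lock with pthread_mutex_trylock() and back off if it fails.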
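
The following sketch combines remedies (2) and (3): thread 2 still takes mutexB first, but it only tries mutexA and, on failure, gives mutexB back before retrying. The counters resA and resB stand in for the shared resources, and run_backoff_demo() and ROUNDS are names assumed for this illustration.

```c
#include <pthread.h>
#include <sched.h>

#define ROUNDS 10000

static pthread_mutex_t mutexA = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mutexB = PTHREAD_MUTEX_INITIALIZER;
static long resA = 0, resB = 0;     /* stand-ins for resources A and B */

/* Thread 1: always locks A first, then B. */
static void *foo1(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; ++i) {
        pthread_mutex_lock(&mutexA);
        pthread_mutex_lock(&mutexB);
        ++resA;
        ++resB;
        pthread_mutex_unlock(&mutexB);
        pthread_mutex_unlock(&mutexA);
    }
    return NULL;
}

/* Thread 2: locks B first, but never blocks on A while holding B. */
static void *foo2(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; ++i) {
        for (;;) {
            pthread_mutex_lock(&mutexB);
            if (pthread_mutex_trylock(&mutexA) == 0)
                break;                      /* now holds both B and A */
            pthread_mutex_unlock(&mutexB);  /* back off: give B up and retry */
            sched_yield();
        }
        ++resA;
        ++resB;
        pthread_mutex_unlock(&mutexA);
        pthread_mutex_unlock(&mutexB);
    }
    return NULL;
}

/* Runs both threads to completion and returns resA + resB. */
long run_backoff_demo(void)
{
    pthread_t t1, t2;

    resA = resB = 0;
    pthread_create(&t1, NULL, foo1, NULL);
    pthread_create(&t2, NULL, foo2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return resA + resB;
}
```

Where the code allows it, remedy (1) is usually simpler: if foo2 also locked mutexA before mutexB, no retry loop would be needed.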

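2. Read-Write Locks

A read-write lock is a single lock, namely a variable of type pthread_rwlock_t. To synchronize with it, take the read lock before reading the shared resource, take the write lock before writing it, and unlock when the read or write is finished.
Its properties are: (i) reading and writing never happen at the same time, (ii) reads are shared and writes are exclusive, (iii) writers are given priority. Three scenarios illustrate them:
(1) Thread A holds the read lock and two more threads arrive to read. Both acquire the read lock. (Reads are shared.)
(2) Thread A holds the write lock and two more threads arrive to read. Both fail to acquire the read lock and block. (Writes are exclusive.)
(3) Thread A holds the read lock, thread B arrives and blocks requesting the write lock, then thread C arrives and its read-lock request blocks as well. (Reads and writes do not overlap, and the writer takes priority over later readers.) Note that this ordering depends on the implementation: POSIX does not fix it for default scheduling, and glibc's default rwlock attribute prefers readers, so on such a system C may be granted the read lock.

With a mutex, both reads and writes of the shared resource must take the lock, so either may block and all accesses are serialized. With a read-write lock, reads run in parallel and only writes are serialized, so in programs where reads greatly outnumber writes a read-write lock can be considerably more efficient than a mutex.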
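
The "shared read, exclusive write" rule can be probed with the non-blocking calls. The sketch below is a minimal illustration that runs in a single thread so the outcome does not depend on scheduling; the function name probe_rwlock() and the results array are assumptions of this example.

```c
#define _XOPEN_SOURCE 700   /* exposes pthread_rwlock_t under strict -std= modes */
#include <errno.h>
#include <pthread.h>

/* Fills results[0..2] with the return values of three "try" calls. */
void probe_rwlock(int results[3])
{
    pthread_rwlock_t rw;

    pthread_rwlock_init(&rw, NULL);

    pthread_rwlock_rdlock(&rw);                     /* first read lock */
    results[0] = pthread_rwlock_tryrdlock(&rw);     /* second read lock: shared */
    results[1] = pthread_rwlock_trywrlock(&rw);     /* write refused while read-locked */
    if (results[0] == 0)
        pthread_rwlock_unlock(&rw);                 /* release the second read lock */
    pthread_rwlock_unlock(&rw);                     /* release the first read lock */

    pthread_rwlock_wrlock(&rw);                     /* now hold the write lock */
    results[2] = pthread_rwlock_tryrdlock(&rw);     /* read refused: write is exclusive */
    pthread_rwlock_unlock(&rw);

    pthread_rwlock_destroy(&rw);
}
```

In a real program the competing requests come from different threads, but the return values follow the same rule: 0 when the request is compatible with the current holders, EBUSY otherwise.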


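The read-write lock functions are:

(1) Initialize a read-write lock
int pthread_rwlock_init(pthread_rwlock_t *restrict rwlock, const pthread_rwlockattr_t *restrict attr);
restrict is a C99 qualifier. Taking pthread_rwlock_t *restrict rwlock as an example, it is a promise that, inside pthread_rwlock_init(), the user's read-write lock variable is accessed only through rwlock and not through any other pointer. The compiler may rely on this promise for optimization but does not check it; breaking it is undefined behavior rather than a compile error.

(2) Destroy a read-write lock
int pthread_rwlock_destroy(pthread_rwlock_t *rwlock);

(3) Acquire the read lock
int pthread_rwlock_rdlock(pthread_rwlock_t *rwlock);
Blocks if the lock is currently held for writing.

(4) Try to acquire the read lock
int pthread_rwlock_tryrdlock(pthread_rwlock_t *rwlock);
Returns 0 on success and an error number on failure.

(5) Acquire the write lock
int pthread_rwlock_wrlock(pthread_rwlock_t *rwlock);
Blocks if the lock is currently held for reading or writing.

(6) Try to acquire the write lock
int pthread_rwlock_trywrlock(pthread_rwlock_t *rwlock);

(7) Unlock
int pthread_rwlock_unlock(pthread_rwlock_t *rwlock);
This one function releases both read locks and write locks.

Example: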

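#include <pthread.h>

//the read-write lock shared by all threads
pthread_rwlock_t lock;

void* write_func(void* dat)
{
    while (1)
    {
        pthread_rwlock_wrlock(&lock);
        //write the shared resource
        //...
        pthread_rwlock_unlock(&lock);
    }

    return NULL;
}

void* read_func(void* dat)
{
    while (1)
    {
        pthread_rwlock_rdlock(&lock);
        //read the shared resource
        //...
        pthread_rwlock_unlock(&lock);
    }
    return NULL;
}


int main(void)
{
    pthread_t p[8];

    //initialize the read-write lock
    pthread_rwlock_init(&lock, NULL);

    //create 3 threads that write the shared resource
    for (int i = 0; i < 3; ++i)
        pthread_create(&p[i], NULL, write_func, NULL);

    //create 5 threads that read the shared resource
    for (int i = 3; i < 8; ++i)
    {
        pthread_create(&p[i], NULL, read_func, NULL);
    }

    //wait for the threads and reclaim their resources
    //(the thread loops never end, so these joins block for the life of the program)
    for (int i = 0; i < 8; ++i)
        pthread_join(p[i], NULL);

    //destroy the read-write lock
    pthread_rwlock_destroy(&lock);

    return 0;
}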

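3. Condition Variables

In the producer-consumer model, the producer and the consumer are each a thread and the shared resource is the product. Using only a mutex, the pseudocode looks like this: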

//consumer thread
while (1)
{
    pthread_mutex_lock(&mutex);
    if (head != NULL)
    {
        //consume a product
    }
    pthread_mutex_unlock(&mutex);
    sleep(1);
}

//producer thread
while (1)
{
    //create a node
    pthread_mutex_lock(&mutex);
    //insert the node into the list
    //...
    pthread_mutex_unlock(&mutex);
}

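This does implement the producer-consumer model, but the consumer thread keeps polling for products, which wastes CPU time. What we want is for the consumer to block and sleep while there is nothing to consume, and for the producer to wake it when a product becomes available. That is what condition variables are for.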

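//consumer
while (1)
{
    pthread_mutex_lock(&mutex);
    while (head == NULL)    //re-check after waking: a wakeup does not guarantee the condition holds
    {
        //block and sleep
        pthread_cond_wait(&cond, &mutex);   //unlocks mutex before sleeping,
                                            //and locks it again once woken

    }   
    //consume a product
    //...
    pthread_mutex_unlock(&mutex);

    sleep(1);
}

//producer
while (1)
{
    //create a node
    pthread_mutex_lock(&mutex);
    //insert the node into the list
    //...
    pthread_mutex_unlock(&mutex);

    //wake a thread that is waiting for the condition
    pthread_cond_signal(&cond);
}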

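A condition variable can only block threads. It is not a lock and cannot protect shared data, which is why, as in the pseudocode above, it is always used together with a mutex. Indeed, the second parameter of pthread_cond_wait(), the call that blocks the thread, is a mutex. The mutex protects the shared data; the condition variable makes the thread block.

The role of a condition variable is: when the condition does not hold, block the thread; when the condition holds, notify the blocked thread so it can resume work.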

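A condition variable has type pthread_cond_t. The main functions are:

(1) Initialize a condition variable
int pthread_cond_init(pthread_cond_t *cond, const pthread_condattr_t *cond_attr);

(2) Destroy a condition variable
int pthread_cond_destroy(pthread_cond_t *cond);

(3) Block waiting on a condition variable
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
As the pseudocode above shows, this function releases the mutex before blocking the calling thread and locks it again before returning once the thread is woken. Because a thread can be woken without the condition actually holding (a spurious wakeup), the call should sit inside a loop that re-checks the condition.

(4) Wake at least one thread blocked on the condition variable
int pthread_cond_signal(pthread_cond_t *cond);

(5) Wake all threads blocked on the condition variable
int pthread_cond_broadcast(pthread_cond_t *cond);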
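
The wait/notify pattern described by these functions can be reduced to a one-flag example. The sketch below is a minimal illustration, not the article's producer-consumer code; the names mtx, cv, ready, observed and run_cond_demo() are all assumptions of this example.

```c
#include <pthread.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv  = PTHREAD_COND_INITIALIZER;
static int ready = 0;       /* the condition, protected by mtx */
static int observed = 0;    /* what the waiter saw after waking */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mtx);
    while (!ready)                      /* re-check after every wakeup */
        pthread_cond_wait(&cv, &mtx);   /* mtx is released while asleep */
    observed = ready;                   /* mtx is held again here */
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Starts one waiter, makes the condition true, returns what it observed. */
int run_cond_demo(void)
{
    pthread_t t;

    ready = observed = 0;
    pthread_create(&t, NULL, waiter, NULL);

    pthread_mutex_lock(&mtx);
    ready = 1;                          /* change the condition under the lock */
    pthread_mutex_unlock(&mtx);
    pthread_cond_broadcast(&cv);        /* wake every thread waiting on cv */

    pthread_join(t, NULL);
    return observed;
}
```

Because the waiter tests ready while holding the mutex, it makes no difference whether the notification is sent before or after the waiter starts waiting: in the first case the loop body is simply skipped.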

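Complete producer-consumer example:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

typedef struct _node node_t;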
struct _node
{
    int data;
    node_t *next;
};

node_t* head = NULL;
pthread_mutex_t mutex;
pthread_cond_t cond;

void* producer(void* h)
{
    while (1)
    {
        node_t* n = (node_t*)malloc(sizeof(node_t));
        n->data = rand() % 1000;

        pthread_mutex_lock(&mutex);     //lock before touching the shared list
        n->next = head;
        head = n;
        printf("producer\t%lu\t%d\n", (unsigned long)pthread_self(), n->data);
        pthread_mutex_unlock(&mutex);   //unlock

        pthread_cond_signal(&cond);     //wake a thread waiting on cond
        sleep(rand() % 4);
    }

    return NULL;
}

void* customer(void* h)
{
    while (1)
    {
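        pthread_mutex_lock(&mutex);     //lock before touching the shared list
        while (head == NULL)            //loop: re-check the condition after every wakeup
        {
            pthread_cond_wait(&cond, &mutex);   //unlocks and sleeps;
                                                //locks again when woken
        }

        node_t* pdel = head;
        head = head->next;
        printf("customer\t%lu\t%d\n", (unsigned long)pthread_self(), pdel->data);
        free(pdel);
        pthread_mutex_unlock(&mutex);   //unlock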
    }

    return NULL;
}

int main(void)
{
    pthread_t p[2];

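    //initialize the mutex and the condition variable
    pthread_mutex_init(&mutex, NULL);
    pthread_cond_init(&cond, NULL);

    //create the producer and consumer threads
    pthread_create(&p[0], NULL, producer, NULL);
    pthread_create(&p[1], NULL, customer, NULL);

    //wait for the threads and reclaim their resources
    pthread_join(p[0], NULL);
    pthread_join(p[1], NULL);

    //destroy the mutex and the condition variable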
    pthread_mutex_destroy(&mutex);
    pthread_cond_destroy(&cond);

    return 0;
}

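4. Semaphores

Semaphores come in two kinds: unnamed and named. Unnamed semaphores are typically used to synchronize the threads of one process (they can also be shared between processes when placed in shared memory), while named semaphores are typically used between processes. Both act as a counter that limits how many processes or threads may access a limited shared resource. The difference from a mutex is that a mutex lets only one thread at a time access the shared resource, whereas a semaphore lets up to value threads access it simultaneously. With value equal to 1 a semaphore behaves much like a mutex, except that it has no owner, so any thread may post it. In that sense a semaphore can be seen as a generalized mutex.

Since this article is about thread synchronization, the rest of this section uses unnamed semaphores (simply "semaphores" below).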

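A semaphore has type sem_t and requires the header semaphore.h. The main functions are:

(1) Initialize a semaphore
int sem_init(sem_t *sem, int pshared, unsigned int value);  //pshared == 0: shared between the threads of one process; nonzero: shared between processes

(2) Destroy a semaphore
int sem_destroy(sem_t *sem);

(3) Lock (wait)
int sem_wait(sem_t *sem);
Equivalent to decrementing sem (the decrement is also called the P operation). If the semaphore's value is 0, the thread blocks until the value becomes positive.

(4) Try to lock
int sem_trywait(sem_t *sem);
If the semaphore's value is 0, the call fails immediately: it returns -1 and sets errno to EAGAIN.

(5) Try to lock with a timeout
int sem_timedwait(sem_t *sem, const struct timespec *abs_timeout);
abs_timeout is an absolute time; if it passes before the semaphore can be taken, the call returns -1 and sets errno to ETIMEDOUT.

(6) Unlock (post)
int sem_post(sem_t *sem);
Increments the semaphore (the increment is also called the V operation).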
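
To illustrate the "at most value threads at once" behavior, the sketch below starts more workers than there are slots and records the highest number of workers that were inside the guarded section at the same time. Everything named here (slots, worker, run_sem_demo, WORKERS, SLOTS) is invented for the example.

```c
#include <pthread.h>
#include <semaphore.h>

#define WORKERS 6
#define SLOTS   2

static sem_t slots;                      /* counts free slots */
static pthread_mutex_t stat_mutex = PTHREAD_MUTEX_INITIALIZER;
static int inside = 0;                   /* workers currently in the section */
static int max_inside = 0;               /* highest value of inside seen */

static void *worker(void *arg)
{
    (void)arg;
    sem_wait(&slots);                    /* P: take a slot, block if none is free */

    pthread_mutex_lock(&stat_mutex);
    if (++inside > max_inside)
        max_inside = inside;
    pthread_mutex_unlock(&stat_mutex);

    /* ... use the limited resource here ... */

    pthread_mutex_lock(&stat_mutex);
    --inside;
    pthread_mutex_unlock(&stat_mutex);

    sem_post(&slots);                    /* V: give the slot back */
    return NULL;
}

/* Runs WORKERS threads and returns the peak number inside at once. */
int run_sem_demo(void)
{
    pthread_t t[WORKERS];

    inside = max_inside = 0;
    sem_init(&slots, 0, SLOTS);          /* pshared = 0: threads of this process */
    for (int i = 0; i < WORKERS; ++i)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < WORKERS; ++i)
        pthread_join(t[i], NULL);
    sem_destroy(&slots);
    return max_inside;
}
```

The peak never exceeds SLOTS, however the threads are scheduled. The small mutex is needed only for the bookkeeping counters; the semaphore itself does the limiting.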

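Applying semaphores to the producer-consumer model: the producer and the consumer each have their own semaphore, with initial values 4 and 0 respectively (as in the code below). The producer thread can therefore acquire its semaphore and start producing immediately, while the consumer thread blocks when it tries to acquire its own. After producing a product, the producer performs a V operation on the consumer's semaphore so the consumer can consume; after consuming, the consumer performs a V operation on the producer's semaphore so the producer can continue, and the cycle repeats: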

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <semaphore.h>

typedef struct _node node_t;
struct _node
{
    int data;
    node_t *next;
};

node_t* head = NULL;

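//one semaphore each for the producer and the consumer
sem_t producer_sem;
sem_t customer_sem;

//protects the list: with producer_sem above 1, both threads can touch head at the same time
pthread_mutex_t list_mutex = PTHREAD_MUTEX_INITIALIZER;

void* producer(void* h)
{
    while (1)
    {
        //P operation on producer_sem
        sem_wait(&producer_sem);
        node_t* n = (node_t*)malloc(sizeof(node_t));
        n->data = rand() % 1000;

        pthread_mutex_lock(&list_mutex);
        n->next = head;
        head = n;
        printf("producer\t%lu\t%d\n", (unsigned long)pthread_self(), n->data);
        pthread_mutex_unlock(&list_mutex);

        //V operation on customer_sem so the consumer can consume
        sem_post(&customer_sem);

        sleep(rand() % 3);
    }

    return NULL;
}

void* customer(void* h)
{
    while (1)
    {
        //P operation on customer_sem
        sem_wait(&customer_sem);

        pthread_mutex_lock(&list_mutex);
        node_t* pdel = head;
        head = head->next;
        pthread_mutex_unlock(&list_mutex);

        printf("customer\t%lu\t%d\n", (unsigned long)pthread_self(), pdel->data);
        free(pdel);

        //V operation on producer_sem so the producer can produce
        sem_post(&producer_sem);
    }

    return NULL;
}

int main(void)
{
    pthread_t p[2];

    //initialize the semaphores
    sem_init(&producer_sem, 0, 4);
    sem_init(&customer_sem, 0, 0);

    //create the producer and consumer threads
    pthread_create(&p[0], NULL, producer, NULL);
    pthread_create(&p[1], NULL, customer, NULL);

    //wait for the threads and reclaim their resources
    pthread_join(p[0], NULL);
    pthread_join(p[1], NULL);

    //destroy the semaphores
    sem_destroy(&producer_sem);
    sem_destroy(&customer_sem);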

    return 0;
}