[Original] Analysis of the principle of Linux RCU (1)

Background

  • Read the fucking source code! --By Lu Xun
  • A picture is worth a thousand words. --By Gorky

Description:

  1. Kernel version: 4.14
  2. ARM64 processor, Cortex-A53, dual-core
  3. Tools used: Source Insight 3.5, Visio

1. Overview

RCU (Read-Copy-Update) is a synchronization mechanism in the Linux kernel.
RCU is often described as a replacement for read-write locks. Its defining characteristic is that readers do not synchronize directly with writers, so readers and writers can run concurrently. The goal of RCU is to minimize overhead on the reader side, which is why it is commonly used where read performance matters most.

  • Advantages:

    1. The reader side has very little overhead: readers acquire no locks and, on most architectures, execute no atomic instructions or memory barriers;
    2. No deadlock problem;
    3. No priority inversion problem;
    4. No danger of memory leaks;
    5. Very good real-time latency;
  • Disadvantages:

    1. The synchronization overhead on the writer side is relatively large, and writers must exclude one another;
    2. It is more complicated to use than other synchronization mechanisms;

Let's use a picture to describe the general operation:

  • Multiple readers can access the critical resource concurrently, using rcu_read_lock/rcu_read_unlock to mark their critical sections;
  • When updating the critical resource, the writer (updater) makes a copy and modifies the copy. It then switches the pointer from the old resource to the updated copy and, once all readers that might still hold the old pointer have left their critical sections, reclaims the old resource;
  • Only one writer is shown in the figure. When there are multiple writers, they must exclude one another;

The description above is simple, but the implementation of RCU is very complicated. This article gives a first impression of RCU and walks through an example using its interfaces; later articles will go deeper into the underlying implementation. Let's start!

2. RCU basics

2.1 Basic elements of RCU

The basic idea of RCU is to split an update into two parts: 1) Removal; 2) Reclamation.
Put simply, a critical resource is read by multiple readers. When the writer has finished modifying its copy, the first step is to remove the old data (by changing where the pointer points), and the second step is to reclaim the old data (for example with kfree).
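
The two steps can be illustrated with a small user-space sketch. This is not kernel code: the singly linked list and the names toy_push, toy_remove, and toy_reclaim are hypothetical, and the sketch is single-threaded, so it only shows the split between the steps, not the waiting in between. The point to notice is that Removal leaves the removed node's next pointer alone, much as list_del_rcu() does, so a reader already standing on that node can keep traversing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
    int key;
    struct node *next;
};

static struct node *list_head_ptr;

/* Insert a new node at the front of the list; NULL on allocation failure. */
static struct node *toy_push(int key)
{
    struct node *n = malloc(sizeof(*n));

    if (n == NULL)
        return NULL;
    n->key = key;
    n->next = list_head_ptr;
    list_head_ptr = n;
    return n;
}

/*
 * Step 1, Removal: unlink the node so that new readers cannot find it.
 * victim->next is left intact on purpose, so a reader that is already
 * standing on the victim can still walk to the rest of the list.
 */
static struct node *toy_remove(int key)
{
    struct node **pp = &list_head_ptr;
    struct node *victim;

    while (*pp != NULL && (*pp)->key != key)
        pp = &(*pp)->next;
    victim = *pp;
    if (victim != NULL)
        *pp = victim->next;
    return victim;
}

/* Step 2, Reclamation: only safe once no reader can still hold the node. */
static void toy_reclaim(struct node *victim)
{
    free(victim);
}
```

In real RCU, the gap between toy_remove() and toy_reclaim() is where the updater would wait for a grace period.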

Functionally, RCU therefore has three basic roles: Reader, Updater, and Reclaimer. They interact as follows:

  1. Reader

    • Use rcu_read_lock and rcu_read_unlock to delimit the reader's critical section. RCU-protected data must always be accessed inside the critical section;
    • Before accessing the protected data, use rcu_dereference to obtain the RCU-protected pointer;
    • With non-preemptible RCU, code between rcu_read_lock and rcu_read_unlock must not sleep;
  2. Updater

    • When multiple Updaters update the data, they must use a mutual exclusion mechanism to protect the update;
    • The Updater uses rcu_assign_pointer to switch the pointer from the old resource to the updated one;
    • The Updater uses synchronize_rcu or call_rcu to start the Reclaimer, which reclaims the old resource. synchronize_rcu waits synchronously before reclaiming, while call_rcu reclaims asynchronously;
  3. Reclaimer

    • The Reclaimer reclaims the old critical resource;
    • To ensure that no reader is still accessing the resource being reclaimed, the Reclaimer must wait for all such readers to leave their critical sections. This waiting time is called the grace period (Grace Period);

2.2 Three basic mechanisms of RCU

RCU relies on three mechanisms to provide the functions described above.

2.2.1 Publish-Subscribe Mechanism

What does the publish-subscribe mechanism look like? Here is a picture:

  • The Updater and the Reader relate to each other like a Publisher and a Subscriber;
  • After updating the content, the Updater calls an interface to publish it, and the Reader calls an interface to read the published content;

So what is needed to make this publish-subscribe mechanism work? Let's look at some pseudocode:

 /* Definition of the global structure */
 struct foo {
   int a;
   int b;
   int c;
 };
 struct foo *gp = NULL;

 /* . . . */
 /* =========Updater======== */
 p = kmalloc(sizeof(*p), GFP_KERNEL);
 p->a = 1;
 p->b = 2;
 p->c = 3;
 gp = p;   /* plain store: nothing orders it after the three stores above */

 /* =========Reader======== */
 p = gp;   /* plain load: nothing orders the field reads after it */
 if (p != NULL) {
   do_something_with(p->a, p->b, p->c);
 }

At first glance this looks fine: the Updater assigns and publishes, and the Reader reads and processes. However, because compilers and CPUs may reorder operations, the code does not necessarily execute in program order. The assignment to gp may become visible before the stores to p->a, p->b, and p->c, and on some architectures (DEC Alpha) the Reader may even fetch the fields passed to do_something_with() before it fetches the new value of p, so it can observe uninitialized data.

To solve this problem, Linux provides the rcu_assign_pointer and rcu_dereference macros to enforce the required ordering. The kernel also builds higher-level wrappers on top of these macros, such as list and hlist. The kernel therefore has three RCU-protected scenarios: 1) pointers; 2) list linked lists; 3) hlist hash linked lists.
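
The ordering requirement can be modelled in user space with C11 atomics. The following is only a sketch under that assumption: publish_foo and subscribe_foo_sum are hypothetical names, and the release store and acquire load stand in for the roles of rcu_assign_pointer and rcu_dereference. The kernel macros are implemented differently, and the reader side in particular is cheaper than a full acquire load on most architectures.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct foo {
    int a;
    int b;
    int c;
};

/* Shared pointer; NULL means nothing has been published yet. */
static _Atomic(struct foo *) gp = NULL;

/* Hypothetical stand-in for the updater side: initialise, then publish. */
static struct foo *publish_foo(int a, int b, int c)
{
    struct foo *p = malloc(sizeof(*p));

    if (p == NULL)
        return NULL;
    p->a = a;
    p->b = b;
    p->c = c;
    /* Release: the three stores above cannot be reordered after this one. */
    atomic_store_explicit(&gp, p, memory_order_release);
    return p;
}

/* Hypothetical stand-in for the reader side: returns a+b+c, or -1 if unpublished. */
static int subscribe_foo_sum(void)
{
    /* Acquire: the field reads below cannot be reordered before this load. */
    struct foo *p = atomic_load_explicit(&gp, memory_order_acquire);

    if (p == NULL)
        return -1;
    return p->a + p->b + p->c;
}
```

A reader that sees a non-NULL pointer is therefore guaranteed to see the fully initialised fields, which is exactly the guarantee the plain assignments in the pseudocode lack.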

For these three scenarios, the Publish-Subscribe interfaces are as follows:

2.2.2 Wait For Pre-Existing RCU Readers to Complete

The Reclaimer needs to reclaim the old critical resource, so the question is: when? RCU therefore has to provide a mechanism that guarantees all earlier RCU readers have finished. In other words, the old resource can only be reclaimed after those readers have left the critical sections delimited by rcu_read_lock/rcu_read_unlock.

  • The Readers and the Updater in the figure run concurrently;
  • After the Updater performs the Removal, it calls synchronize_rcu, which marks the end of the update and the start of the reclamation phase;
  • After synchronize_rcu is called, new readers may still arrive and read the critical resource (the updated content), but the grace period only waits for pre-existing readers, which are Reader-4 and Reader-5 in the figure. Once these pre-existing RCU readers have left their critical sections, the grace period ends and reclamation proceeds;
  • synchronize_rcu does not necessarily return immediately after the last pre-existing RCU reader leaves its critical section; there may be a scheduling delay;
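
The rule "wait only for pre-existing readers" can be made concrete with a toy, single-threaded model. Everything here is an assumption made for illustration: the per-reader counters, the toy_gp structure, and the toy_* function names are hypothetical and are not how the kernel tracks readers. Each reader has a counter that is odd while it is inside a critical section; starting a grace period snapshots the counters, and the grace period is over once every reader that was inside at snapshot time has moved on.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_READERS 5

/* Per-reader counter: odd while inside a read-side critical section. */
static unsigned long reader_seq[NR_READERS];

/* Nesting is not supported in this toy model. */
static void toy_read_lock(int id)   { reader_seq[id]++; }
static void toy_read_unlock(int id) { reader_seq[id]++; }

struct toy_gp {
    unsigned long snap[NR_READERS];
};

/* Start of the grace period: remember where every reader was. */
static void toy_gp_start(struct toy_gp *gp)
{
    for (int i = 0; i < NR_READERS; i++)
        gp->snap[i] = reader_seq[i];
}

/* True once every reader that was inside at snapshot time has left. */
static bool toy_gp_done(const struct toy_gp *gp)
{
    for (int i = 0; i < NR_READERS; i++) {
        bool was_reading = (gp->snap[i] & 1UL) != 0;

        if (was_reading && reader_seq[i] == gp->snap[i])
            return false;   /* pre-existing reader is still inside */
    }
    return true;
}
```

Readers that enter after toy_gp_start() have an even snapshot value and are never waited for, which mirrors how new readers in the figure do not extend the grace period.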

2.2.3 Maintain Multiple Versions of Recently Updated Objects

As Section 2.2.2 shows, after the Updater updates and before the Reclaimer reclaims, both the old and the new version of the critical resource exist. Only after synchronize_rcu returns does the Reclaimer reclaim the old version, leaving just the latest one. Clearly, with multiple Updaters there can be even more versions.

Here is a picture, using a pointer and a linked list as examples:

  • The call to synchronize_rcu is the critical point; until it returns, different versions of the critical resource are maintained;
  • After the Reclaimer reclaims the old version, only one version remains;
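
The coexistence of versions can be sketched in user space as well. This is a single-threaded illustration with hypothetical names (versioned_update, versions_reclaim, live_versions); calling versions_reclaim() stands in for "the grace period has ended", which real RCU would determine by waiting for readers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct foo {
    int a;
    struct foo *next_retired;   /* links old versions awaiting reclamation */
};

static struct foo *current_version;  /* what new readers will see */
static struct foo *retired;          /* old versions a reader may still hold */
static int live_versions;

/* Copy the current version, modify the copy, publish it, retire the old one. */
static int versioned_update(int a)
{
    struct foo *old = current_version;
    struct foo *copy = malloc(sizeof(*copy));

    if (copy == NULL)
        return -1;
    if (old != NULL)
        *copy = *old;
    copy->a = a;
    copy->next_retired = NULL;
    current_version = copy;
    live_versions++;

    if (old != NULL) {
        /* Deferred reclamation: the old version stays readable for now. */
        old->next_retired = retired;
        retired = old;
    }
    return 0;
}

/* Stand-in for the end of a grace period: free every retired version. */
static void versions_reclaim(void)
{
    while (retired != NULL) {
        struct foo *p = retired;

        retired = p->next_retired;
        free(p);
        live_versions--;
    }
}
```

Between an update and the matching reclamation, a reader that fetched the pointer earlier still sees its own, older version, while new readers see the latest one.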

3. RCU example analysis

It's time for some fucking sample code.

  • Overall code logic:
    1. Four kernel threads are created: two test RCU protection of a pointer, and two test RCU protection of a linked list;
    2. For reclamation, both mechanisms are used: synchronous reclamation with synchronize_rcu and asynchronous reclamation with call_rcu;
    3. To keep the code simple, basic error checking has been omitted;
    4. Multiple Updaters are not considered, so mutual exclusion between Updaters is also omitted;

#include <linux/module.h>
#include <linux/init.h>
#include <linux/slab.h>
#include <linux/kthread.h>
#include <linux/list.h>
#include <linux/rculist.h>
#include <linux/rcupdate.h>
#include <linux/delay.h>

struct foo {
	int a;
	int b;
	int c;
	struct rcu_head rcu;
	struct list_head list;
};

static struct foo *g_pfoo = NULL;

LIST_HEAD(g_rcu_list);

struct task_struct *rcu_reader_t;
struct task_struct *rcu_updater_t;
struct task_struct *rcu_reader_list_t;
struct task_struct *rcu_updater_list_t;

/* Reader for the RCU-protected pointer */
static int rcu_reader(void *data)
{
	struct foo *p = NULL;

	/* Run until rcu_test_exit() calls kthread_stop() */
	while (!kthread_should_stop()) {
		msleep(100);
		rcu_read_lock();
		p = rcu_dereference(g_pfoo);
		if (p) {
			pr_info("%s: a = %d, b = %d, c = %d\n",
					__func__, p->a, p->b, p->c);
		}
		rcu_read_unlock();
	}

	return 0;
}

/* Reclaim callback, invoked by call_rcu() after a grace period */
static void rcu_reclaimer(struct rcu_head *rh)
{
	struct foo *p = container_of(rh, struct foo, rcu);
	pr_info("%s: a = %d, b = %d, c = %d\n",
			__func__, p->a, p->b, p->c);
	kfree(p);
}

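/* Updater for the RCU-protected pointer */
static int rcu_updater(void *data)
{
	int value = 1;

	/* Run until rcu_test_exit() calls kthread_stop() */
	while (!kthread_should_stop()) {
		struct foo *old;
		struct foo *new = kzalloc(sizeof(*new), GFP_KERNEL);

		msleep(200);

		/* Single updater, so a plain read of g_pfoo is enough here */
		old = g_pfoo;

		/* Copy the old version, then modify the copy */
		*new = *old;
		new->a = value;
		new->b = value + 1;
		new->c = value + 2;
		/* Publish the new version to readers */
		rcu_assign_pointer(g_pfoo, new);

		pr_info("%s: a = %d, b = %d, c = %d\n",
				__func__, new->a, new->b, new->c);

		/* Asynchronous reclamation: free old after a grace period */
		call_rcu(&old->rcu, rcu_reclaimer);

		value++;
	}

	return 0;
}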

/* Reader for the RCU-protected list */
static int rcu_reader_list(void *data)
{
	struct foo *p = NULL;

	/* Run until rcu_test_exit() calls kthread_stop() */
	while (!kthread_should_stop()) {
		msleep(100);
		rcu_read_lock();
		list_for_each_entry_rcu(p, &g_rcu_list, list) {
			pr_info("%s: a = %d, b = %d, c = %d\n",
					__func__, p->a, p->b, p->c);
		}
		rcu_read_unlock();
	}

	return 0;
}

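/* Updater for the RCU-protected list */
static int rcu_updater_list(void *data)
{
	int value = 1000;

	/* Run until rcu_test_exit() calls kthread_stop() */
	while (!kthread_should_stop()) {
		struct foo *p;
		struct foo *q;

		msleep(100);

		/* Single updater, so the first entry cannot change under us */
		p = list_first_or_null_rcu(&g_rcu_list, struct foo, list);
		q = kzalloc(sizeof(*q), GFP_KERNEL);

		/* Copy the old entry, then modify the copy */
		*q = *p;
		q->a = value;
		q->b = value + 1;
		q->c = value + 2;

		/* Swap the entries; readers see either p or q, never a mix */
		list_replace_rcu(&p->list, &q->list);

		pr_info("%s: a = %d, b = %d, c = %d\n",
				__func__, q->a, q->b, q->c);

		/* Synchronous reclamation: wait for pre-existing readers */
		synchronize_rcu();
		kfree(p);

		value++;
	}

	return 0;
}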

/* Module initialization */
static int rcu_test_init(void)
{
	struct foo *p;

	/* Set up the shared data before any thread can look at it */
	g_pfoo = kzalloc(sizeof(*g_pfoo), GFP_KERNEL);

	p = kzalloc(sizeof(*p), GFP_KERNEL);
	list_add_rcu(&p->list, &g_rcu_list);

	rcu_reader_t = kthread_run(rcu_reader, NULL, "rcu_reader");
	rcu_updater_t = kthread_run(rcu_updater, NULL, "rcu_updater");
	rcu_reader_list_t = kthread_run(rcu_reader_list, NULL, "rcu_reader_list");
	rcu_updater_list_t = kthread_run(rcu_updater_list, NULL, "rcu_updater_list");

	return 0;
}

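/* Module cleanup */
static void rcu_test_exit(void)
{
	/* Stop all threads before freeing the data they use */
	kthread_stop(rcu_reader_t);
	kthread_stop(rcu_updater_t);
	kthread_stop(rcu_reader_list_t);
	kthread_stop(rcu_updater_list_t);

	/* Wait for pending call_rcu() callbacks before the module goes away */
	rcu_barrier();

	kfree(g_pfoo);
	kfree(list_first_or_null_rcu(&g_rcu_list, struct foo, list));
}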

module_init(rcu_test_init);
module_exit(rcu_test_exit);

MODULE_AUTHOR("Loyen");
MODULE_LICENSE("GPL");

As proof that this really works, the log output from a run on the development board is posted below:

4. API introduction

4.1 Core API

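The following interfaces are as core as it gets.

a.      rcu_read_lock()  // marks the start of the reader's critical section
b.      rcu_read_unlock()  // marks the end of the reader's critical section
c.      synchronize_rcu() / call_rcu() // reclaim resources after the grace period ends
d.      rcu_assign_pointer()  // used by the Updater to assign to an RCU-protected pointer
e.      rcu_dereference()  // used by the Reader to fetch an RCU-protected pointer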

4.2 Other related APIs

Other related APIs are built on top of the core API. They are listed below without further detail:

RCU list traversal::

        list_entry_rcu
        list_entry_lockless
        list_first_entry_rcu
        list_next_rcu
        list_for_each_entry_rcu
        list_for_each_entry_continue_rcu
        list_for_each_entry_from_rcu
        list_first_or_null_rcu
        list_next_or_null_rcu
        hlist_first_rcu
        hlist_next_rcu
        hlist_pprev_rcu
        hlist_for_each_entry_rcu
        hlist_for_each_entry_rcu_bh
        hlist_for_each_entry_from_rcu
        hlist_for_each_entry_continue_rcu
        hlist_for_each_entry_continue_rcu_bh
        hlist_nulls_first_rcu
        hlist_nulls_for_each_entry_rcu
        hlist_bl_first_rcu
        hlist_bl_for_each_entry_rcu

RCU pointer/list update::

        rcu_assign_pointer
        list_add_rcu
        list_add_tail_rcu
        list_del_rcu
        list_replace_rcu
        hlist_add_behind_rcu
        hlist_add_before_rcu
        hlist_add_head_rcu
        hlist_add_tail_rcu
        hlist_del_rcu
        hlist_del_init_rcu
        hlist_replace_rcu
        list_splice_init_rcu
        list_splice_tail_init_rcu
        hlist_nulls_del_init_rcu
        hlist_nulls_del_rcu
        hlist_nulls_add_head_rcu
        hlist_bl_add_head_rcu
        hlist_bl_del_init_rcu
        hlist_bl_del_rcu
        hlist_bl_set_first_rcu

RCU::

        Critical sections       Grace period            Barrier

        rcu_read_lock           synchronize_net         rcu_barrier
        rcu_read_unlock         synchronize_rcu
        rcu_dereference         synchronize_rcu_expedited
        rcu_read_lock_held      call_rcu
        rcu_dereference_check   kfree_rcu
        rcu_dereference_protected

bh::

        Critical sections       Grace period            Barrier

        rcu_read_lock_bh        call_rcu                rcu_barrier
        rcu_read_unlock_bh      synchronize_rcu
        [local_bh_disable]      synchronize_rcu_expedited
        [and friends]
        rcu_dereference_bh
        rcu_dereference_bh_check
        rcu_dereference_bh_protected
        rcu_read_lock_bh_held

sched::

        Critical sections       Grace period            Barrier

        rcu_read_lock_sched     call_rcu                rcu_barrier
        rcu_read_unlock_sched   synchronize_rcu
        [preempt_disable]       synchronize_rcu_expedited
        [and friends]
        rcu_read_lock_sched_notrace
        rcu_read_unlock_sched_notrace
        rcu_dereference_sched
        rcu_dereference_sched_check
        rcu_dereference_sched_protected
        rcu_read_lock_sched_held


SRCU::

        Critical sections       Grace period            Barrier

        srcu_read_lock          call_srcu               srcu_barrier
        srcu_read_unlock        synchronize_srcu
        srcu_dereference        synchronize_srcu_expedited
        srcu_dereference_check
        srcu_read_lock_held

SRCU: Initialization/cleanup::

        DEFINE_SRCU
        DEFINE_STATIC_SRCU
        init_srcu_struct
        cleanup_srcu_struct

All: lockdep-checked RCU-protected pointer access::

        rcu_access_pointer
        rcu_dereference_raw
        RCU_LOCKDEP_WARN
        rcu_sleep_check
        RCU_NONIDLE

Okay, listing all these APIs is a bit overwhelming.

The veil over RCU has now been lifted a little. Digging further gets harder, because the implementation behind RCU really is complicated. So the question is whether to strip it all the way down and look inside. Stay tuned.

References

Documentation/RCU
What is RCU, Fundamentally?
What is RCU? Part 2: Usage
RCU part 3: the RCU API
Introduction to RCU

Origin www.cnblogs.com/LoyenWang/p/12681494.html