An Introduction to Linux Kernel Notifier Chains

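When reading kernel source, you run into notifier chains everywhere. Technically this is not a complicated or obscure part of the kernel: at its core it is insertion, deletion, and traversal on a singly linked list. But it was written by Alan Cox himself, the top figure of the network stack, which says a lot about how important this basic facility is, and there is plenty to learn from it. The core of the notifier chain implementation lives in just two files: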

Header file: include/linux/notifier.h
Source file: kernel/notifier.c

Together the header and the source file come to fewer than 1000 lines, and on the whole they are easy to follow.

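As mentioned, a notifier chain is at heart a singly linked list. The kernel provides the mechanism mainly for communication between subsystems, driven by events and priorities. To put it plainly, consider this scenario: what typically happens in the network device subsystem? A NIC's IP address changes, its link state changes, and so on. If another subsystem, say routing, cares about IP address changes on a NIC, how does it find out? Most people's first instinct here is the publish-subscribe model, and that is exactly right: the notifier chain mechanism can be viewed as a form of publish-subscribe. Every subsystem has some important events of its own, such as the NIC events just mentioned or USB state events. Such a subsystem provides its own event queue, and the entries on that queue are callback functions supplied by other subsystems. When an event occurs, the subsystem walks its queue and invokes the callbacks registered there, which is how the "notification" is delivered. That is fairly abstract, so consider a concrete picture: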

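Take subsystem A: other subsystems have registered three callbacks on its notification queue. When an event occurs in A, it must walk its own event queue, headA, and run the callbacks on it one after another (that is not quite accurate, since not every callback necessarily runs; this is explained later). The same holds for subsystem B.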
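To make the subscribe-and-notify idea concrete, here is a minimal user-space sketch in plain C. It is not kernel code: nic_subscribe, nic_publish, routing_on_nic_event, and the event codes are hypothetical names chosen to mirror the NIC/routing scenario, and the sketch has no locking, priorities, or return codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event codes published by a "NIC" subsystem. */
enum nic_event { NIC_ADDR_CHANGED = 1, NIC_LINK_DOWN = 2 };

/* One subscriber entry: a callback plus a link to the next entry. */
struct subscriber {
	void (*callback)(enum nic_event ev, void *data);
	struct subscriber *next;
};

/* The publisher's event queue (the role headA plays in the text). */
static struct subscriber *nic_subscribers;

/* Subscribe: push the entry onto the front of the publisher's list. */
static void nic_subscribe(struct subscriber *s)
{
	assert(s != NULL && s->callback != NULL);
	s->next = nic_subscribers;
	nic_subscribers = s;
}

/* Publish: walk the list, invoke every callback, return how many ran. */
static int nic_publish(enum nic_event ev, void *data)
{
	int called = 0;

	for (struct subscriber *s = nic_subscribers; s != NULL; s = s->next) {
		s->callback(ev, data);
		called++;
	}
	return called;
}

/* A hypothetical "routing" subscriber that counts address changes. */
static int routing_addr_changes;

static void routing_on_nic_event(enum nic_event ev, void *data)
{
	(void)data;
	if (ev == NIC_ADDR_CHANGED)
		routing_addr_changes++;
}

static struct subscriber routing_subscriber = {
	.callback = routing_on_nic_event,
};
```

After nic_subscribe(&routing_subscriber), each nic_publish() call reaches the routing callback, which reacts only to the events it cares about. The kernel mechanism has the same shape, with priorities, return codes, and locking layered on top.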
Each element on a kernel notifier chain is a notifier block, declared as follows:

// include/linux/notifier.h
typedef	int (*notifier_fn_t)(struct notifier_block *nb,unsigned long action, void *data);
struct notifier_block {
	notifier_fn_t notifier_call;
	struct notifier_block __rcu *next;
	int priority;
};

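notifier_call: pointer to the callback, the function to run when the event occurs;
next: pointer to the next notifier block on the chain;
priority: the priority with which this block's callback (the one notifier_call points to) runs when an event occurs; the larger the number, the higher the priority and the earlier it runs. The structure is not complicated, and is in fact about as simple as it could be; a notifier chain is just these blocks linked together.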
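The Linux kernel provides four kinds of notifier chain: atomic, blocking, raw, and SRCU. They differ mainly in the context in which the callbacks run and in the locking that protects the chain. Let's look at each of the four in turn:
1. Atomic notifier chains:
The head of an atomic notifier chain is defined as follows: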

struct atomic_notifier_head {
	spinlock_t lock;
	struct notifier_block __rcu *head;
};

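As we can see, an atomic notifier chain uses a spinlock, which protects registration and unregistration. The callbacks on the chain (the functions run when an event occurs) run in interrupt/atomic context, so they are not allowed to block.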

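2. Blocking notifier chains:
The head of a blocking notifier chain is defined as follows. Access to the chain is guarded by a read-write semaphore (rwsem); the callbacks run in process context and are allowed to block.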

struct blocking_notifier_head {
	struct rw_semaphore rwsem;
	struct notifier_block __rcu *head;
};

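3. Raw notifier chains:
As the name suggests, there is no protection at all: callbacks, registration, and unregistration are unrestricted, and all locking and protection must be provided by the caller. The definition is: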

struct raw_notifier_head {
	struct notifier_block __rcu *head;
};
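What "the caller provides the locking" means in practice can be sketched in user space. The following is an illustration only, not kernel code: the demo_* names are hypothetical, a pthread mutex stands in for whatever lock the owning subsystem already has, and the unprotected demo_raw_* helpers play the role of the raw_notifier_* calls.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct demo_block {
	int (*notifier_call)(struct demo_block *nb, unsigned long action, void *data);
	struct demo_block *next;
};

/* Raw-style head: just the list pointer, no lock inside. */
struct demo_raw_head {
	struct demo_block *head;
};

/* Unprotected primitives: safe only if the caller serializes access. */
static void demo_raw_register(struct demo_raw_head *nh, struct demo_block *n)
{
	n->next = nh->head;
	nh->head = n;
}

static int demo_raw_call(struct demo_raw_head *nh, unsigned long action, void *data)
{
	int calls = 0;

	for (struct demo_block *nb = nh->head; nb != NULL; nb = nb->next) {
		nb->notifier_call(nb, action, data);
		calls++;
	}
	return calls;
}

/* The owner of the chain supplies the lock and takes it around every access. */
static struct demo_raw_head demo_chain;
static pthread_mutex_t demo_chain_lock = PTHREAD_MUTEX_INITIALIZER;

static void demo_register(struct demo_block *n)
{
	assert(n != NULL && n->notifier_call != NULL);
	pthread_mutex_lock(&demo_chain_lock);
	demo_raw_register(&demo_chain, n);
	pthread_mutex_unlock(&demo_chain_lock);
}

static int demo_notify(unsigned long action, void *data)
{
	int calls;

	pthread_mutex_lock(&demo_chain_lock);
	calls = demo_raw_call(&demo_chain, action, data);
	pthread_mutex_unlock(&demo_chain_lock);
	return calls;
}

/* Example callback: bumps the int counter passed through data. */
static int demo_count_cb(struct demo_block *nb, unsigned long action, void *data)
{
	(void)nb;
	(void)action;
	(*(int *)data)++;
	return 0;
}
```

Because callbacks run with the lock held in this sketch, a callback that tried to register another block would deadlock; choosing a lock and a calling convention that fit the subsystem is exactly the responsibility a raw chain leaves to its owner.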

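4. SRCU notifier chains:

A variant of the blocking notifier chain, with the same restrictions on use. A mutex serializes registration and unregistration, while traversal of the chain is protected by SRCU (sleepable RCU) rather than an rwsem. The definition is: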

struct srcu_notifier_head {
	struct mutex mutex;
	struct srcu_struct srcu;
	struct notifier_block __rcu *head;
};

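The four kinds of notifier chain are already described in great detail in the comment at the top of notifier.h, so I won't waste ink on them here; the English comments in the source are entirely sufficient.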

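What we really care about is how to use these four kinds of chain. When defining your own notifier chain, you must be clear about which kind you need: atomic, blocking, or raw. The kernel macros that define and initialize each kind of chain are: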

#define ATOMIC_NOTIFIER_INIT(name) {				\
	.lock = __SPIN_LOCK_UNLOCKED(name.lock),	\
	.head = NULL }
#define BLOCKING_NOTIFIER_INIT(name) {				\
	.rwsem = __RWSEM_INITIALIZER((name).rwsem),	\
	.head = NULL }
#define RAW_NOTIFIER_INIT(name)	{				\
	.head = NULL }
/* srcu_notifier_heads cannot be initialized statically */

/* Define and initialize an atomic notifier chain called name */
#define ATOMIC_NOTIFIER_HEAD(name)				\
	struct atomic_notifier_head name =			\
		ATOMIC_NOTIFIER_INIT(name)
/* Define and initialize a blocking notifier chain called name */
#define BLOCKING_NOTIFIER_HEAD(name)				\
	struct blocking_notifier_head name =			\
		BLOCKING_NOTIFIER_INIT(name)
/* Define and initialize a raw notifier chain called name */
#define RAW_NOTIFIER_HEAD(name)					\
	struct raw_notifier_head name =				\
		RAW_NOTIFIER_INIT(name)

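In the kernel version shown here, the SRCU notifier chain has no such static initialization macro; its head has to be defined by hand and initialized at run time with srcu_init_notifier_head().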

ATOMIC_NOTIFIER_HEAD(mynotifierlist) is equivalent to the following code, which is what the macro expands to:

struct atomic_notifier_head mynotifierlist = {
    .lock = __SPIN_LOCK_UNLOCKED(mynotifierlist.lock),
    .head = NULL
};

The other macros work the same way. If we already have a notifier chain object, Linux also provides a set of APIs for initializing it:

#define ATOMIC_INIT_NOTIFIER_HEAD(name) do {	\
		spin_lock_init(&(name)->lock);	\
		(name)->head = NULL;		\
	} while (0)
	
#define BLOCKING_INIT_NOTIFIER_HEAD(name) do {	\
		init_rwsem(&(name)->rwsem);	\
		(name)->head = NULL;		\
	} while (0)
	
#define RAW_INIT_NOTIFIER_HEAD(name) do {	\
		(name)->head = NULL;		\
	} while (0)

This set of interfaces mostly shows up in code of the following form:

static struct atomic_notifier_head dock_notifier_list;

/* ... later, inside an initialization function: */
ATOMIC_INIT_NOTIFIER_HEAD(&dock_notifier_list);
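The difference between the two macro families is easier to see side by side. The user-space sketch below uses simplified stand-ins (the my_* types and MY_* macros are hypothetical, modelled on the raw chain, which has no lock to initialize): the static form covers a file-scope head, while the runtime form is what you need when the head lives inside an object allocated later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct my_block;	/* stand-in for struct notifier_block */

struct my_head {	/* stand-in for struct raw_notifier_head */
	struct my_block *head;
};

/* Static form: define and initialize in a single declaration. */
#define MY_NOTIFIER_INIT(name)	{ .head = NULL }
#define MY_NOTIFIER_HEAD(name)	struct my_head name = MY_NOTIFIER_INIT(name)

/* Runtime form: initialize a head that already exists. */
#define MY_INIT_NOTIFIER_HEAD(name) do {	\
		(name)->head = NULL;		\
	} while (0)

/* Case 1: a file-scope chain head, valid before any code runs. */
static MY_NOTIFIER_HEAD(global_chain);

/* Case 2: a chain head embedded in a dynamically allocated object. */
struct my_device {
	int id;
	struct my_head chain;
};

static struct my_device *my_device_create(int id)
{
	struct my_device *dev = malloc(sizeof(*dev));

	if (!dev)
		return NULL;
	dev->id = id;
	/* malloc() does not zero memory, so the head must be set up here. */
	MY_INIT_NOTIFIER_HEAD(&dev->chain);
	return dev;
}

/* Small helper for checking the result. */
static int my_chain_is_empty(const struct my_head *nh)
{
	assert(nh != NULL);
	return nh->head == NULL;
}
```

For the atomic and blocking chains the runtime form matters even more, because the embedded spinlock or rwsem also has to be initialized before first use.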

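OK, having a notifier chain is only the first step. We also need basic interfaces for registering a notifier block on a chain, unregistering it, and walking the chain to run the callback in each block. Put simply, these are insertion, deletion, and traversal on a singly linked list.
The most basic interfaces the kernel provides for notifier chains are: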

static int notifier_chain_register(struct notifier_block **nl,  struct notifier_block *n);
static int notifier_chain_unregister(struct notifier_block **nl, struct notifier_block *n);
static int __kprobes notifier_call_chain(struct notifier_block **nl, unsigned long val, void *v, int nr_to_call, int *nr_calls);

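These three basic interfaces implement registration, unregistration, and traversal of the notifier blocks on a chain. As you might expect, the atomic, blocking, raw, and SRCU notifier chains each wrap these basic operations once more, and that is indeed the case.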

The functions that register a notifier block on a chain are:

extern int atomic_notifier_chain_register(struct atomic_notifier_head *nh,struct notifier_block *nb);
extern int blocking_notifier_chain_register(struct blocking_notifier_head *nh,struct notifier_block *nb);	
extern int raw_notifier_chain_register(struct raw_notifier_head *nh,struct notifier_block *nb);
extern int srcu_notifier_chain_register(struct srcu_notifier_head *nh,struct notifier_block *nb);

The functions that unregister a notifier block from a chain are:

extern int atomic_notifier_chain_unregister(struct atomic_notifier_head *nh,struct notifier_block *nb);
extern int blocking_notifier_chain_unregister(struct blocking_notifier_head *nh,struct notifier_block *nb);
extern int raw_notifier_chain_unregister(struct raw_notifier_head *nh,struct notifier_block *nb);
extern int srcu_notifier_chain_unregister(struct srcu_notifier_head *nh,struct notifier_block *nb);

The functions that walk a chain and invoke the callback of each notifier block are:

extern int atomic_notifier_call_chain(struct atomic_notifier_head *nh,unsigned long val, void *v);
extern int blocking_notifier_call_chain(struct blocking_notifier_head *nh,unsigned long val, void *v);
extern int raw_notifier_call_chain(struct raw_notifier_head *nh,unsigned long val, void *v);
extern int srcu_notifier_call_chain(struct srcu_notifier_head *nh,unsigned long val, void *v);

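These basic APIs for the four kinds of chain in turn form the foundation on which other kernel subsystems define and operate their own notifier chains. For example, Netlink defines an atomic notifier chain and wraps the atomic notifier API in one more layer to give it its own interface: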

/*net/netlink/af_netlink.c*/
...
static ATOMIC_NOTIFIER_HEAD(netlink_chain);
...
int netlink_register_notifier(struct notifier_block *nb){
    return atomic_notifier_chain_register(&netlink_chain, nb);
}
...
int netlink_unregister_notifier(struct notifier_block *nb){
    return atomic_notifier_chain_unregister(&netlink_chain, nb);
}
...

Network events also have an atomic notifier chain:

/*net/core/netevent.c*/
/*
 *    Network event notifiers
 *
 *    Authors:
 * Tom Tucker <[email protected]>
 * Steve Wise <[email protected]>
 *
 *    This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version
 * 2 of the License, or (at your option) any later version.
 *
 *    Fixes:
 */

#include <linux/rtnetlink.h>
#include <linux/notifier.h>
#include <net/netevent.h>

static ATOMIC_NOTIFIER_HEAD(netevent_notif_chain);

/**
 *    register_netevent_notifier - register a netevent notifier block
 *    @nb: notifier
 *
 *    Register a notifier to be called when a netevent occurs.
 *    The notifier passed is linked into the kernel structures and must
 *    not be reused until it has been unregistered. A negative errno code
 *    is returned on a failure.
 */
int register_netevent_notifier(struct notifier_block *nb)
{
    int err;

    err = atomic_notifier_chain_register(&netevent_notif_chain, nb);
    return err;
}

/**
 *    netevent_unregister_notifier - unregister a netevent notifier block
 *    @nb: notifier
 *
 *    Unregister a notifier previously registered by
 *    register_neigh_notifier(). The notifier is unlinked into the
 *    kernel structures and may then be reused. A negative errno code
 *    is returned on a failure.
 */

int unregister_netevent_notifier(struct notifier_block *nb)
{
    return atomic_notifier_chain_unregister(&netevent_notif_chain, nb);
}

/**
 *    call_netevent_notifiers - call all netevent notifier blocks
 * @val: value passed unmodified to notifier function
 * @v: pointer passed unmodified to notifier function
 *
 *    Call all neighbour notifier blocks. Parameters and return value
 *    are as for notifier_call_chain().
 */

int call_netevent_notifiers(unsigned long val, void *v)
{
    return atomic_notifier_call_chain(&netevent_notif_chain, val, v);
}
EXPORT_SYMBOL_GPL(register_netevent_notifier);
EXPORT_SYMBOL_GPL(unregister_netevent_notifier);
EXPORT_SYMBOL_GPL(call_netevent_notifiers);

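The SRCU notifier chain, the variant of the blocking chain, has stricter conditions of use and more restrictions, so it is not used very often unless you know exactly where this kind of chain is appropriate. In the 2.6.32 kernel, only cpufreq.c uses it.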

Driver code sometimes uses a blocking notifier, so let's walk through how one is used in detail.
First, define a struct blocking_notifier_head object:

struct blocking_notifier_head {
	struct rw_semaphore rwsem;
	struct notifier_block __rcu *head;
};

BLOCKING_NOTIFIER_HEAD(hello_notifier_head);

The macro is defined as:

#define BLOCKING_NOTIFIER_HEAD(name)				\
	struct blocking_notifier_head name =			\
		BLOCKING_NOTIFIER_INIT(name)

Expanding the macro gives:

struct blocking_notifier_head hello_notifier_head = {
	.rwsem = __RWSEM_INITIALIZER((hello_notifier_head).rwsem),
	.head = NULL
};

Next, define a struct notifier_block variable, which will then be registered on the chain:

static int hello_print_info(struct notifier_block *this, unsigned long event, void *ptr)
{
	printk("hello_print_info execute!\n");
	return NOTIFY_DONE;
}

static struct notifier_block hello_notifier = {
	.notifier_call = hello_print_info,
};

Register it on the chain:

blocking_notifier_chain_register(&hello_notifier_head,&hello_notifier);

Now let's analyze how the notifier block gets registered on the chain. The registration function is defined as follows:

int blocking_notifier_chain_register(struct blocking_notifier_head *nh, struct notifier_block *n)
{
	int ret;

	/*
	 * This path may be taken during early boot, when task switching is
	 * not yet working and interrupts must remain disabled. In that
	 * situation we cannot call down_write().
	 */
	if (unlikely(system_state == SYSTEM_BOOTING))
		return notifier_chain_register(&nh->head, n);

	/* Normal case */
	down_write(&nh->rwsem);		/* take the rwsem for writing */
	ret = notifier_chain_register(&nh->head, n);
	up_write(&nh->rwsem);		/* release the rwsem */
	return ret;
}

Now look at the definition of notifier_chain_register():

static int notifier_chain_register(struct notifier_block **nl, struct notifier_block *n)
{
	/* Walk the blocks that are already on the chain, if any */
	while ((*nl) != NULL) {
		/*
		 * Stop as soon as the new block's priority is greater than
		 * that of the current block; otherwise keep walking until
		 * such a block is found or *nl is NULL.
		 */
		if (n->priority > (*nl)->priority)
			break;
		nl = &((*nl)->next);
	}
	/* Link the new block in front of *nl */
	n->next = *nl;
	rcu_assign_pointer(*nl, n);
	return 0;
}
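The ordering this loop produces can be checked with a small user-space copy of the same walk. This is a sketch under stated assumptions: demo_block, demo_chain_register, and demo_chain_names are hypothetical names, the RCU publication step is replaced by a plain pointer store, and the name field exists only so the resulting order can be inspected.

```c
#include <assert.h>
#include <stddef.h>

struct demo_block {
	const char *name;	/* for inspection only; not in the kernel struct */
	int priority;
	struct demo_block *next;
};

/* Same walk as notifier_chain_register(), minus the RCU publication. */
static int demo_chain_register(struct demo_block **nl, struct demo_block *n)
{
	assert(nl != NULL && n != NULL);
	while (*nl != NULL) {
		/* Stop in front of the first block with a strictly lower priority. */
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}

/* Copy the names of the chained blocks, head first, into out[]. */
static size_t demo_chain_names(struct demo_block *head, const char **out, size_t max)
{
	size_t n = 0;

	for (; head != NULL && n < max; head = head->next)
		out[n++] = head->name;
	return n;
}
```

Registering A (priority 5), B (2), C (9), and then D (5) in that order yields the chain C, A, D, B: the list is kept in descending priority order, and because the comparison is strictly greater-than, a new block lands behind existing blocks of equal priority.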

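After registration, the head member of the chain points to the notifier_block with the largest priority value, followed by the next largest, and so on down to the smallest; blocks with equal priority stay in registration order. With the blocks registered, let's see how the callbacks on the chain get invoked when the chain is notified of an event.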

int blocking_notifier_call_chain(struct blocking_notifier_head *nh, unsigned long val, void *v)
{
	/* Delegate to __blocking_notifier_call_chain(): no call limit, no call counter */
	return __blocking_notifier_call_chain(nh, val, v, -1, NULL);
}

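The parameters are:
nh: pointer to the head of the blocking notifier chain;
val: value passed unmodified to the notifier functions;
v: pointer passed unmodified to the notifier functions.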

int __blocking_notifier_call_chain(struct blocking_notifier_head *nh, unsigned long val, void *v, int nr_to_call, int *nr_calls)
{
	int ret = NOTIFY_DONE;

	/* Skip the locking entirely when the chain is empty */
	if (rcu_access_pointer(nh->head)) {
		/* Take the rwsem for reading */
		down_read(&nh->rwsem);
		/* Walk the chain with notifier_call_chain() */
		ret = notifier_call_chain(&nh->head, val, v, nr_to_call, nr_calls);
		/* Release the rwsem */
		up_read(&nh->rwsem);
	}
	return ret;
}

notifier_call_chain() is defined as follows:

static int notifier_call_chain(struct notifier_block **nl, unsigned long val, void *v, int nr_to_call, int *nr_calls)
{
	int ret = NOTIFY_DONE;
	struct notifier_block *nb, *next_nb;

	/* Fetch the first notifier_block on the chain */
	nb = rcu_dereference_raw(*nl);
	/* Walk the chain while nb is not NULL and nr_to_call is not 0 */
	while (nb && nr_to_call) {
		next_nb = rcu_dereference_raw(nb->next);

#ifdef CONFIG_DEBUG_NOTIFIERS	/* compiled in only when CONFIG_DEBUG_NOTIFIERS is defined */
		if (unlikely(!func_ptr_is_kernel_text(nb->notifier_call))) {
			WARN(1, "Invalid notifier called!");
			nb = next_nb;
			continue;
		}
#endif
		/* Invoke this block's callback */
		ret = nb->notifier_call(nb, val, v);

		/* nr_calls is NULL by default, in which case this is skipped */
		if (nr_calls)
			(*nr_calls)++;

		/*
		 * If the callback's return value has the NOTIFY_STOP_MASK bit
		 * set, stop here and do not visit the remaining blocks.
		 */
		if (ret & NOTIFY_STOP_MASK)
			break;
		/* Otherwise move on to the next block */
		nb = next_nb;
		nr_to_call--;
	}
	return ret;
}
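The two ways this loop can end early, the NOTIFY_STOP_MASK bit and the nr_to_call budget, can be exercised in user space. The sketch below copies the control flow of notifier_call_chain() without the RCU accessors or the debug check; the NOTIFY_* values are the ones from include/linux/notifier.h, while the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Values as defined in include/linux/notifier.h. */
#define NOTIFY_DONE		0x0000
#define NOTIFY_OK		0x0001
#define NOTIFY_STOP_MASK	0x8000
#define NOTIFY_STOP		(NOTIFY_OK | NOTIFY_STOP_MASK)

struct demo_block;
typedef int (*demo_fn_t)(struct demo_block *nb, unsigned long action, void *data);

struct demo_block {
	demo_fn_t notifier_call;
	struct demo_block *next;
	int priority;
};

/* Same control flow as notifier_call_chain(), with plain pointer reads. */
static int demo_call_chain(struct demo_block **nl, unsigned long val, void *v,
			   int nr_to_call, int *nr_calls)
{
	int ret = NOTIFY_DONE;
	struct demo_block *nb, *next_nb;

	assert(nl != NULL);
	nb = *nl;
	while (nb && nr_to_call) {
		next_nb = nb->next;	/* read before the call, as the kernel does */
		ret = nb->notifier_call(nb, val, v);
		if (nr_calls)
			(*nr_calls)++;
		if (ret & NOTIFY_STOP_MASK)
			break;
		nb = next_nb;
		nr_to_call--;
	}
	return ret;
}

/* A callback that lets the walk continue. */
static int demo_ok(struct demo_block *nb, unsigned long action, void *data)
{
	(void)nb; (void)action; (void)data;
	return NOTIFY_OK;
}

/* A callback that ends the walk. */
static int demo_stop(struct demo_block *nb, unsigned long action, void *data)
{
	(void)nb; (void)action; (void)data;
	return NOTIFY_STOP;
}
```

On a chain of three blocks whose second callback is demo_stop, a walk with nr_to_call set to -1 runs two callbacks and returns NOTIFY_STOP; with nr_to_call set to 1 it runs only the first and returns NOTIFY_OK. Passing -1 means "no limit" simply because the counter is decremented away from zero and never reaches it on a chain of realistic length.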

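The functions are commented above, so I won't explain them further. Now for a concrete usage example, again with a blocking notifier chain. The code is as follows:

The module that defines the chain (the init module) is: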

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/notifier.h>

/* Define the notifier chain and export it so that other modules can use it */
BLOCKING_NOTIFIER_HEAD(hello_notifier_chain);
EXPORT_SYMBOL(hello_notifier_chain);

static int A_notifier_func(struct notifier_block *nb, unsigned long action, void *data)
{
	printk("A_notifier_func execute; priority is %d!\n", nb->priority);
	return NOTIFY_OK;
}

/* Notifier block A */
static struct notifier_block A_notifier = {
	.notifier_call = A_notifier_func,
	.priority = 0,
};

static int B_notifier_func(struct notifier_block *nb, unsigned long action, void *data)
{
	printk("B_notifier_func execute; priority is %d!\n", nb->priority);
	return NOTIFY_OK;
}

/* Notifier block B */
static struct notifier_block B_notifier = {
	.notifier_call = B_notifier_func,
	.priority = 0,
};

/* Module entry point */
static int __init notifier_register_init(void)
{
	/* Register notifier block A on the chain */
	blocking_notifier_chain_register(&hello_notifier_chain, &A_notifier);
	/* Register notifier block B on the chain */
	blocking_notifier_chain_register(&hello_notifier_chain, &B_notifier);
	return 0;
}

/* Module exit point */
static void __exit notifier_register_exit(void)
{
	blocking_notifier_chain_unregister(&hello_notifier_chain, &A_notifier);
	blocking_notifier_chain_unregister(&hello_notifier_chain, &B_notifier);
}

module_init(notifier_register_init);
module_exit(notifier_register_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("notifier test by Haitao Cai");

The module that uses the chain is:

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/notifier.h>
/* Declare the notifier chain exported by the other module */
extern struct blocking_notifier_head hello_notifier_chain;

static int C_notifier_func(struct notifier_block *nb, unsigned long action, void *data)
{
	printk("C_notifier_func execute; priority is %d!\n", nb->priority);
	return NOTIFY_OK;
}

/* Notifier block C */
static struct notifier_block C_notifier = {
	.notifier_call = C_notifier_func,
	.priority = 0,
};

/* Module entry point */
static int __init notifier_usage_init(void)
{
	/* Register notifier block C */
	blocking_notifier_chain_register(&hello_notifier_chain, &C_notifier);
	/* Walk the chain and invoke the callback of every block on it */
	blocking_notifier_call_chain(&hello_notifier_chain, 0, NULL);
	return 0;
}

/* Module exit point */
static void __exit notifier_usage_exit(void)
{
	/* Unregister notifier block C */
	blocking_notifier_chain_unregister(&hello_notifier_chain, &C_notifier);
}

module_init(notifier_usage_init);
module_exit(notifier_usage_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("notifier usage by Haitao Cai");

The Makefile is:

KERN_DIR = /work/jz2440/linux-4.16.16/

all:
	make -C $(KERN_DIR) M=`pwd` modules 

clean:
	make -C $(KERN_DIR) M=`pwd` clean
	rm -rf modules.order

obj-m	+= notifer_init.o notifer_usage.o

Upload the source files and the Makefile to the build server, compile, and copy the modules to the development board over NFS. Then run:

insmod notifer_init.ko
insmod notifer_usage.ko

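The kernel log shows the callbacks running in the order A, B, C.

All three notifier blocks, A, B, and C, have priority 0, so their callbacks are invoked in the order in which the blocks were registered.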

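Now let's change the priorities of the three notifier blocks and see how the order of callback invocation relates to priority.

Set the priorities as follows:

A: 5

B: 2

C: 9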

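Rebuild the modules, upload them to the board, and insert them again. This time the log shows the callbacks running in the order C, A, B.

The output confirms that the larger a notifier block's priority, the earlier its callback is called.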

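Now make notifier block A's callback return NOTIFY_STOP_MASK and see what happens.

The log shows that notifier block C's callback is not called: the walk stops at A, so every block that sits behind A on the chain is skipped. (This is the result when C is behind A, as with the original equal priorities; with the priorities 5, 2, and 9 set above, C would already have run before A, and B would be the block that is skipped.)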

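Summary: notifier chains in the Linux kernel are relatively simple, but there are a few points to keep in mind when using them:

1. There are four kinds of notifier chain; choose the one that fits your situation.

2. The larger a notifier block's priority, the earlier its callback is called when the chain is processed.

3. When a block's callback returns a value with NOTIFY_STOP_MASK set, the callbacks of the blocks after it on the chain (those with lower priority) are not called.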
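On point 3, callbacks rarely return the bare NOTIFY_STOP_MASK; notifier.h builds the usual return codes on top of it and provides two inline helpers for carrying an errno through the chain. The user-space sketch below reproduces those definitions as I understand them from the header so they can be tried outside the kernel; demo_decode is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NOTIFY_DONE		0x0000	/* not interested in this event */
#define NOTIFY_OK		0x0001	/* handled, keep walking */
#define NOTIFY_STOP_MASK	0x8000	/* set to stop the walk */
#define NOTIFY_BAD		(NOTIFY_STOP_MASK | 0x0002)	/* veto the action */
#define NOTIFY_STOP		(NOTIFY_OK | NOTIFY_STOP_MASK)	/* clean stop */

/* Encode a negative errno into a return value that also stops the walk. */
static inline int notifier_from_errno(int err)
{
	if (err)
		return NOTIFY_STOP_MASK | (NOTIFY_OK - err);
	return NOTIFY_OK;
}

/* Recover the errno (0 if none) from a notifier return value. */
static inline int notifier_to_errno(int ret)
{
	ret &= ~NOTIFY_STOP_MASK;
	return ret > NOTIFY_OK ? NOTIFY_OK - ret : 0;
}

/* Hypothetical helper: report whether ret stopped the walk, and its errno. */
static int demo_decode(int ret, int *err)
{
	assert(err != NULL);
	*err = notifier_to_errno(ret);
	return (ret & NOTIFY_STOP_MASK) != 0;
}
```

A callback that fails with, say, -ENOMEM can return notifier_from_errno(-ENOMEM); the code that invoked the chain then sees a value with NOTIFY_STOP_MASK set and can turn it back into -ENOMEM with notifier_to_errno().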
