How Atomic Operations Are Implemented on ARM

http://blog.csdn.net/u013686019/article/details/78235624

This article focuses on two things: the syntax of C inline assembly and the ldrex/strex instructions.

1. The atomic_t type definition

typedef struct {
	int counter;
} atomic_t;

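The atomic integer is wrapped in a struct so that the atomic functions accept only atomic_t arguments. This ensures that atomic operations are used only with this special type, and it also prevents data of this type from being passed to non-atomic functions.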

2. Defining and initializing an atomic_t variable

atomic_t v = ATOMIC_INIT(0);

#define ATOMIC_INIT(i)	{ (i) }

3. Basic operations

atomic_inc(&v); // atomically increment the variable by 1
atomic_dec(&v); // atomically decrement the variable by 1
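
To try these calls outside the kernel, the following is a minimal user-space sketch, not the kernel implementation. It assumes GCC or Clang and substitutes the compiler's __atomic builtins for the ARM assembly discussed in this article. atomic_read() exists in the kernel but is not shown in this article, and demo_counter() is a hypothetical helper added only for illustration.

```c
/* User-space stand-in for the kernel type; same layout as in section 1. */
typedef struct {
	int counter;
} atomic_t;

#define ATOMIC_INIT(i)	{ (i) }

/* Assumption: sequentially consistent builtins replace the ldrex/strex loop. */
static inline void atomic_inc(atomic_t *v)
{
	__atomic_add_fetch(&v->counter, 1, __ATOMIC_SEQ_CST);
}

static inline void atomic_dec(atomic_t *v)
{
	__atomic_sub_fetch(&v->counter, 1, __ATOMIC_SEQ_CST);
}

/* Reads the current value; a plain atomic load is enough here. */
static inline int atomic_read(const atomic_t *v)
{
	return __atomic_load_n(&v->counter, __ATOMIC_RELAXED);
}

/* Hypothetical helper: two increments and one decrement starting from 0. */
static int demo_counter(void)
{
	atomic_t v = ATOMIC_INIT(0);

	atomic_inc(&v);
	atomic_inc(&v);
	atomic_dec(&v);
	return atomic_read(&v);
}
```

Note that the functions take a pointer, so a variable declared as atomic_t v is passed as &v.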

4. Implementation of atomic_inc()

static inline void atomic_inc(atomic_t *v)
{
	atomic_add_return(1, v);
}

#define atomic_inc_return(v)	atomic_add_return(1, (v))
static inline int atomic_add_return(int i, atomic_t *v)
{
	unsigned long tmp;
	int result;
	smp_mb();

	__asm__ __volatile__("@ atomic_add_return\n"
		"1:	ldrex	%0, [%3]\n"
		"	add	%0, %0, %4\n"
		"	strex	%1, %0, [%3]\n"
		"	teq	%1, #0\n"
		"	bne	1b"
		: "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
		: "r" (&v->counter), "Ir" (i)
		: "cc");

	smp_mb();
	return result;
}

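4.1 Inline assembly in C functions

4.1.1 Inline assembly syntax

__asm__ is the inline assembly keyword; it tells the compiler that the statement that follows is assembly code.
__volatile__ tells the compiler not to optimize the assembly statement that follows, for example by deleting it or moving it.
The general format is: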

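__asm__ (
	"asm code 1\n" // instruction list
	"asm code 2\n"
	"asm code n"
	: Output Operands // C variables that receive values produced by the instructions
	: Input  Operands // C expressions that the instructions read
	: clobbers // tells the compiler what else the instructions modify
);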
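
As a concrete instance of this format, the sketch below fills in the template with a single add instruction. It is an illustration under stated assumptions, not code from the kernel: the function name add_asm is made up, the ARM branch assumes ARM or Thumb-2 state, and on any other target the function falls back to plain C followed by an empty assembly template so that the file still compiles. The constraint strings it uses are among the symbols listed right after this example.

```c
/* Hypothetical helper: returns a + b, using inline assembly where possible. */
static inline int add_asm(int a, int b)
{
	int sum;

#if defined(__arm__)
	/* Assumption: ARM or Thumb-2 state.
	 * %0 = sum (output, write-only register)
	 * %1 = a   (input, register)
	 * %2 = b   (input, immediate if it fits, otherwise register) */
	__asm__ ("add	%0, %1, %2"
		: "=r" (sum)
		: "r" (a), "Ir" (b));
#else
	/* Portable fallback: do the add in C, then pass the value through an
	 * empty template. "+r" marks sum as read and written by the asm. */
	sum = a + b;
	__asm__ ("" : "+r" (sum));
#endif
	return sum;
}
```

No clobber list is needed here because this add does not touch the flags or any register beyond its operands.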

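Common symbols in the operand lists:
"+": the operand is both read and written
"=": the operand is write-only
"&": earlyclobber, used on output operands. The output may be written before all inputs have been consumed, so it must not share a register with any input. It only appears combined as "+&" or "=&"
"r": the operand is held in any available general-purpose register
"m": the operand is a memory operand
"p": the operand is a valid memory address
"I": an immediate whose valid range is machine-dependent. In ARM state it is an integer that is valid as an immediate operand of a data-processing instruction (an 8-bit value rotated by an even number of bits); the 0~31 range often quoted applies to x86
"i": the operand is an immediate integer
"Q": a memory address which uses a single base register with no offset
"o": the operand is a memory operand whose address is offsettable, i.e. adding a small constant offset to it still yields a valid address
"V": the operand is a memory operand whose address is not offsettable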
4.1.2 Analysis of the assembly in atomic_add_return

__asm__ __volatile__("@ atomic_add_return\n"
	"1:	ldrex	%0, [%3]\n"
	"	add	%0, %0, %4\n"
	"	strex	%1, %0, [%3]\n"
	"	teq	%1, #0\n"
	"	bne	1b"
	: "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
	: "r" (&v->counter), "Ir" (i)
	: "cc");

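%0 <-- result
%1 <-- tmp
%2 <-- v->counter (the "+Qo" memory operand; no instruction references it, it tells the compiler that the assembly reads and writes this memory)
%3 <-- the address of v->counter
%4 <-- i
Note: by the time the instructions run, the compiler has already placed these operands in registers (i may instead be encoded as an immediate, as the "Ir" constraint allows).
(1) ldrex %0, [%3]
Load-Exclusive: loads the word at the address of v->counter into result and updates the exclusive monitor(s).
In C terms:
result = v->counter
(2) add %0, %0, %4
result = result + i
(3) strex %1, %0, [%3]
Store-Exclusive: tries to store result to the address of v->counter and writes the status to tmp (0 on success, 1 on failure).
In C terms (only when the store succeeds):
v->counter = result
(4) teq %1, #0
Tests whether the strex succeeded, i.e. whether tmp equals 0.
(5) bne 1b
If the strex failed, branch backward to the nearest label 1 ("1b" means label 1, searching backward) and retry from the ldrex.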

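(6) "cc"

"cc" is a special clobber that tells the compiler the assembly code modifies the flags register. Here the teq instruction updates the condition flags.
On some machines GCC represents the condition codes as a specific hardware register; "cc" is the name of that register.
On machines without such a register, "cc" is ignored and has no effect.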
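
The load, add, store, test and retry steps of the loop can be mirrored in portable C. The following is a sketch for building intuition, not the kernel's code: it assumes GCC or Clang, the name atomic_add_return_sketch is made up, and a compare-and-exchange builtin stands in for the ldrex/strex pair. One difference is worth keeping in mind: compare-and-exchange only checks that the value is unchanged, whereas strex fails whenever the exclusive monitor no longer permits the store.

```c
/* Same layout as the kernel's atomic_t shown in section 1. */
typedef struct {
	int counter;
} atomic_t;

/* Hypothetical user-space analogue of atomic_add_return(). */
static inline int atomic_add_return_sketch(int i, atomic_t *v)
{
	int old;
	int result;

	do {
		/* Step (1): read the current value (stands in for ldrex). */
		old = __atomic_load_n(&v->counter, __ATOMIC_RELAXED);

		/* Step (2): compute the new value in a local variable. */
		result = old + i;

		/* Steps (3) to (5): publish result only if counter still
		 * equals old; otherwise loop again (stands in for
		 * strex + teq + bne 1b). */
	} while (!__atomic_compare_exchange_n(&v->counter, &old, result,
					      0 /* strong */,
					      __ATOMIC_SEQ_CST,
					      __ATOMIC_RELAXED));

	return result;	/* the value after the addition */
}
```

Like the original, the function returns the updated value rather than the previous one.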

Appendix:

1. How to Use Inline Assembly Language in C Code

2. A brief introduction to ldrex and strex

LDREX and STREX
The LDREX and STREX instructions split the operation of atomically updating memory into
two separate steps. Together, they provide atomic updates in conjunction with exclusive
monitors that track exclusive memory accesses.
Load-Exclusive and Store-Exclusive must only access memory regions marked as
Normal.

LDREX
The LDREX instruction loads a word from memory, initializing the state of the exclusive
monitor(s) to track the synchronization operation. For example, LDREX R1, [R0]
performs a Load-Exclusive from the address in R0, places the value into R1 and updates
the exclusive monitor(s).

STREX
The STREX instruction performs a conditional store of a word to memory. If the exclusive
monitor(s) permit the store, the operation updates the memory location and returns the
value 0 in the destination register, indicating that the operation succeeded. If the
exclusive monitor(s) do not permit the store, the operation does not update the memory
location and returns the value 1 in the destination register. This makes it possible to
implement conditional execution paths based on the success or failure of the memory
operation. For example, STREX R2, R1, [R0] performs a Store-Exclusive operation to the
address in R0, conditionally storing the value from R1 and indicating success or failure
in R2.
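
The success and failure cases described above can be illustrated with a toy, single-threaded software model. This is only a teaching sketch: real exclusive monitors are hardware state and are considerably more detailed. All names here (toy_monitor, toy_ldrex, toy_strex, toy_other_store) are hypothetical, and the model assumes one monitor that tracks at most one address.

```c
#include <stddef.h>

/* Toy model: one monitor that remembers at most one tagged address. */
struct toy_monitor {
	const int *tagged;	/* address tagged by the last load-exclusive, or NULL */
};

/* Models LDREX: load the word and tag its address in the monitor. */
static int toy_ldrex(struct toy_monitor *m, const int *addr)
{
	m->tagged = addr;
	return *addr;
}

/* Models a write by another observer: it clears a matching tag. */
static void toy_other_store(struct toy_monitor *m, int *addr, int value)
{
	*addr = value;
	if (m->tagged == addr)
		m->tagged = NULL;
}

/* Models STREX: returns 0 and stores if the monitor permits it,
 * returns 1 and leaves memory untouched otherwise. */
static int toy_strex(struct toy_monitor *m, int *addr, int value)
{
	if (m->tagged != addr)
		return 1;
	*addr = value;
	m->tagged = NULL;
	return 0;
}
```

In this model, a toy_ldrex followed directly by toy_strex on the same address succeeds with status 0, while a toy_other_store between the two makes toy_strex return 1, which is the case the retry loop in atomic_add_return handles.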

For details, see ARM Synchronization Primitives.