Detailed explanation of gcc inline assembly

Sometimes we want to use inline assembly in C/C++ code because there is no corresponding function or syntax available in C. For example, when I recently wrote an FIR program on ARM, I needed to saturate the final result, but gcc did not provide a function like ssat, so I had to embed assembly instructions in the C code.


1. Getting Started

The main difficulty with embedding assembly in C is associating C variables with instruction operands. gcc provides a syntax for exactly this. Below is a simple example.

asm("fsinx %1, %0" : "=f"(result) : "f"(angle));

We don't need to care what the fsinx instruction does here; it is enough to know that it takes two floating-point registers as operands. gcc is a C compiler and has no way of knowing what kinds of operands the fsinx assembly instruction needs, so the programmer has to tell it. That is the job of the "=f" and "f" strings after the instruction, which say that both operands are floating-point registers. Such a string is called an operand rule (constraint). A constraint preceded by "=" marks an output operand; otherwise the operand is an input. The parentheses after each constraint hold the C variable associated with the register. With this information gcc knows how to turn the embedded assembly statement into actual assembly instructions:

  • fsinx: assembly instruction name

  • %1, %0: assembly instruction operands

  • "=f" (result): operand %0 is a floating-point register associated with the variable result (for an output operand, "association" means that gcc stores the contents of register %0 into the variable result after executing the assembly instruction)

  • "f" (angle): operand %1 is a floating-point register associated with the variable angle (for an input operand, "association" means that gcc loads the value of the variable angle into register %1 before executing the assembly instruction)

So this embedded assembly is translated into at least three assembly instructions (non-optimized):

  1. Load the value of the angle variable into register %1

  2. fsinx assembly instruction, source register %1, destination register %0

  3. Store the value of register %0 in the variable result

Of course, the above statement may not apply at high optimization levels; for example, the source operand may already be in a floating-point register.
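The fsinx example targets a floating-point unit that most readers will not have at hand, so here is a minimal sketch of the same one-input, one-output pattern for x86, assuming gcc or clang with the default AT&T syntax. The helper name highest_set_bit and the choice of the bsrl instruction are illustrative assumptions, not part of the original example; on other architectures the sketch falls back to plain C.

```c
/* Index of the highest set bit of x (x must be nonzero).
 * Hypothetical helper: like the fsinx example, it has one input
 * register (%1) and one output register (%0). */
static int highest_set_bit(unsigned int x)
{
#if defined(__i386__) || defined(__x86_64__)
    int index;
    /* "r"(x):      gcc puts x in a register and substitutes it for %1.
     * "=r"(index): gcc picks a register for %0 and copies it into
     *              index after the instruction.
     * __asm__ is the spelling of asm that also compiles under strict
     * -std=c99 style options. */
    __asm__("bsrl %1, %0" : "=r"(index) : "r"(x));
    return index;
#else
    /* Portable fallback so the sketch builds on non-x86 targets. */
    int index = 0;
    while ((x >>= 1) != 0)
        index++;
    return index;
#endif
}
```

Without optimization, gcc typically surrounds the bsrl with a load of x and a store to index, matching the three steps listed above.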

Here we also see the meaning of the "=" sign before the constraint: gcc needs to know whether this operand is loaded from a variable to a register before executing embedded assembly, or stored from a register to a variable after execution.

Common constraints are the following (see the gcc manual for more details):

  • m memory operand

  • r register operand

  • i immediate operand (integer)

  • f floating point register operand

  • F immediate operand (floating point)

The basic format of embedded assembly can also be seen from this example:

asm("assembly instruction" : "=output constraint"(associated variable) : "input constraint"(associated variable));

The output operand must be an lvalue; this is obvious.
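To make the constraint letters concrete, here is a small sketch for x86 (gcc or clang, AT&T syntax assumed) that uses "i", "m" and "r". The helper names load_immediate and store_to_memory are hypothetical, and non-x86 targets take a plain C fallback.

```c
/* "i": the operand is an integer immediate, so %1 is printed as $42. */
static int load_immediate(void)
{
#if defined(__i386__) || defined(__x86_64__)
    int value;
    __asm__("movl %1, %0" : "=r"(value) : "i"(42));
    return value;
#else
    return 42; /* portable fallback */
#endif
}

/* "=m": the output is a memory operand, so %0 is printed as a memory
 * reference to slot and the instruction stores straight to memory. */
static int store_to_memory(int value)
{
#if defined(__i386__) || defined(__x86_64__)
    int slot;
    __asm__("movl %1, %0" : "=m"(slot) : "r"(value));
    return slot;
#else
    return value; /* portable fallback */
#endif
}
```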


2. Multiple operands, or no output operands

What if an instruction has multiple input or output operands? For example, many ARM instructions take three operands. In that case, separate the constraints with commas:

asm("add %0, %1, %2" : "=r"(sum) : "r"(a), "r"(b));

Each operand rule corresponds to operands %0, %1, %2 in order.

For the case of no output operand, there is no output rule after the assembly instruction, so two consecutive colons appear, followed by the input rule.
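Both cases in this section can be sketched as follows, assuming gcc or clang. The helper names add_three_operand and store_int are hypothetical. On x86 the sketch substitutes lea for the three-operand add, because x86 has no three-operand add instruction; the ARM branch uses the add from the text. The "memory" entry in the second helper is a clobber (see section 5) and volatile is covered in section 6; both are additions beyond what this section describes.

```c
/* Three operands, numbered %0, %1, %2 in the order the constraints
 * are written. long is used so each operand fills a whole register. */
static long add_three_operand(long a, long b)
{
    long sum;
#if defined(__i386__) || defined(__x86_64__)
    /* lea computes a + b into a third register in one instruction. */
    __asm__("lea (%1,%2), %0" : "=r"(sum) : "r"(a), "r"(b));
#elif defined(__arm__) || defined(__aarch64__)
    __asm__("add %0, %1, %2" : "=r"(sum) : "r"(a), "r"(b));
#else
    sum = a + b; /* portable fallback */
#endif
    return sum;
}

/* No output operands: the output section is empty, so two colons
 * appear in a row before the inputs. */
static void store_int(int *dst, int value)
{
#if defined(__i386__) || defined(__x86_64__)
    /* "memory" tells gcc that the instruction writes to memory. */
    __asm__ volatile("movl %1, (%0)"
                     : /* no outputs */
                     : "r"(dst), "r"(value)
                     : "memory");
#else
    *dst = value; /* portable fallback */
#endif
}
```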

 

3. Input-output (or read-write) operands

Sometimes an operand is both input and output, such as this instruction under x86:

add %eax, %ebx

Note that the instruction is written in AT&T format rather than Intel format, so the source operand comes first and the destination last. Register ebx is both an input operand and an output operand. For such operands, put a "+" character before the constraint:

asm("add %1, %0" : "+r"(a) : "r"(b));

This corresponds to the C statement a = a + b.

Note that such operands cannot use the "=" symbol: when gcc sees "=", it treats the operand as output-only, so when it converts the embedded assembly into real assembly it does not first load the value of variable a into register %0.
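A minimal sketch of the "+r" form wrapped in a function, assuming x86 with gcc or clang; the name add_in_place is hypothetical and other targets use a plain C fallback.

```c
/* Computes a = a + b with a read-write operand. */
static int add_in_place(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    /* "+r"(a): gcc loads a into %0 before the instruction and copies
     * %0 back into a afterwards. */
    __asm__("addl %1, %0" : "+r"(a) : "r"(b));
#else
    a = a + b; /* portable fallback */
#endif
    return a;
}
```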

Another way is to logically split the read-write operand into two operands:

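asm("add %2, %0" : "=r"(a) : "0"(a), "r"(b));

Here the "logical" input operand 1 is given the digit constraint "0", which means that it occupies the same "location" as operand 0 (the same register). The advantage of this approach is that the two "logical" operands can be associated with two different C variables: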

asm("add %2, %0" : "=r"(c) : "0"(a), "r"(b));

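This corresponds to the C statement c = a + b.

A digit constraint can be used only on an input operand, and it must refer to an output operand. In the example above, the digit constraint "0" appears in the input section and refers to output operand 0; the digit constraint itself takes up operand number 1.

Note that using the same C variable for two operands does not guarantee that they occupy the same "location". For example, the following does not work:

(Incorrect) asm("add %2, %0" : "=r"(a) : "r"(a), "r"(b));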
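The correct digit-constraint form can be sketched as a function, assuming x86 with gcc or clang; the name add_matching is hypothetical and other targets use a plain C fallback.

```c
/* Computes c = a + b using a digit (matching) constraint. */
static int add_matching(int a, int b)
{
    int c;
#if defined(__i386__) || defined(__x86_64__)
    /* Operand 0: output c.
     * Operand 1: input a, forced by "0" into the register of operand 0.
     * Operand 2: input b. */
    __asm__("addl %2, %0" : "=r"(c) : "0"(a), "r"(b));
#else
    c = a + b; /* portable fallback */
#endif
    return c;
}
```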

 

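4. Specifying registers

Sometimes an instruction has to use specific registers. The typical example is a system call, where the system call number and the arguments must be placed in particular registers. To achieve this, use the extended syntax when declaring the variables:

register int a asm("%eax") = 1;              // statement 1

register int b asm("%ebx") = 2;              // statement 2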

asm("add %1, %0" : "+r"(a) : "r"(b));         // statement 3

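Note that a is guaranteed to be in eax and b in ebx only while the assembly instruction executes; at any other time the locations of a and b are unknown.

Also take care that statement 2 does not overwrite eax when it executes. Suppose statement 2 is changed to this:

register int b asm("%ebx") = func();

The function calling convention puts the return value of func() in eax, which destroys the value that statement 1 assigned to a. In that case, first store the return value of func in a temporary variable: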

int t = func();

register int a asm("%eax") = 1;              // statement 1

register int b asm("%ebx") = t;              // statement 2

asm("add %1, %0" : "+r"(a) : "r"(b));         // statement 3
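Put together as a compilable function, the safe ordering might look like the sketch below, assuming x86 with gcc or clang. next_value is a hypothetical stand-in for func(), and add_in_fixed_registers is likewise an invented name; other targets use a plain C fallback.

```c
/* Hypothetical stand-in for func() in the text. */
static int next_value(void)
{
    return 2;
}

static int add_in_fixed_registers(void)
{
#if defined(__i386__) || defined(__x86_64__)
    /* Call the function first, so its return value (delivered in eax)
     * cannot disturb a register variable that is already set up. */
    int t = next_value();
    register int a __asm__("eax") = 1;
    register int b __asm__("ebx") = t;
    /* With the declarations above, %0 expands to %eax and %1 to %ebx. */
    __asm__("addl %1, %0" : "+r"(a) : "r"(b));
    return a;
#else
    return 1 + next_value(); /* portable fallback */
#endif
}
```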

 

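5. Implicitly modified registers

Some assembly instructions implicitly modify registers that do not appear among the instruction's operands. To let gcc know about this, list the implicitly modified (clobbered) registers after the input constraints. Here is an example for the VAX:

asm volatile("movc3 %0,%1,%2"

                : /* no outputs */

                :"g"(from),"g"(to),"g"(count)

                :"r0","r1","r2","r3","r4","r5");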


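(movc3 is a character block move (Move characters) instruction.)

Note that the registers used by the input/output constraints must not overlap with the registers in the clobber list. In the example above, the constraint "g" must not include r0-r5. Registers occupied by variables declared with the specified-register syntax must not overlap with the clobber list either. This should be easy to understand: the clobber list tells gcc that there are additional registers to take care of, so naturally it cannot intersect with the input/output registers.

In addition, if you name a register explicitly in the instruction, that register must also be listed in the clobber list (a bit roundabout, admittedly). As mentioned above, gcc itself does not understand assembly instructions, so a register that you name explicitly in the instruction is implicit as far as gcc is concerned and therefore has to appear in the clobber list. Also, an explicit register in the instruction needs an extra % in front of it, for example %%eax.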
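The VAX example is hard to try out today, so here is a minimal x86 sketch of the same rules, assuming gcc or clang. The function name double_via_eax is hypothetical. The "cc" entry, which marks the condition flags as modified, is an addition that the text does not cover.

```c
/* Doubles x, deliberately routing the value through eax. */
static int double_via_eax(int x)
{
#if defined(__i386__) || defined(__x86_64__)
    int result;
    /* eax is named directly in the template, so it is written %%eax
     * and listed as clobbered; gcc then keeps %0 and %1 out of eax. */
    __asm__("movl %1, %%eax\n\t"
            "addl %%eax, %%eax\n\t"
            "movl %%eax, %0"
            : "=r"(result)
            : "r"(x)
            : "eax", "cc");
    return result;
#else
    return x + x; /* portable fallback */
#endif
}
```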

 

6. volatile

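asm volatile tells gcc that your assembly instruction has side effects and must never be optimized away, as in the example above.

If your instruction only performs a computation, volatile is not needed and gcc is free to optimize it. In every other case, simply adding volatile to each asm may be a good approach.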
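As a sketch of where volatile matters, consider an instruction that both returns a value and writes memory. gcc already treats an asm with no output operands as volatile, so the interesting case is one with an output that the caller may ignore. The example assumes x86 with gcc or clang; exchange_int is a hypothetical name, the "memory" clobber is an addition not covered in the text, and other targets use a plain C fallback.

```c
/* Stores value into *slot and returns the previous contents. */
static int exchange_int(int *slot, int value)
{
#if defined(__i386__) || defined(__x86_64__)
    /* Without volatile, gcc could delete the whole asm whenever the
     * caller ignores the returned old value, losing the store too. */
    __asm__ volatile("xchgl %0, (%1)"
                     : "+r"(value)
                     : "r"(slot)
                     : "memory");
    return value;
#else
    int old = *slot; /* portable fallback */
    *slot = value;
    return old;
#endif
}
```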

This article is reposted from:

https://www.cnblogs.com/byeyear/p/4675049.html

