Floating Point Registers and Instructions

Registers

The AVX floating-point architecture stores data in 16 YMM registers, named %ymm0 through %ymm15.

YMM (bits 255-0)   XMM (bits 127-0)   Use
%ymm0              %xmm0              1st FP argument, return value
%ymm1              %xmm1              2nd FP argument
%ymm2              %xmm2              3rd FP argument
%ymm3              %xmm3              4th FP argument
%ymm4              %xmm4              5th FP argument
%ymm5              %xmm5              6th FP argument
%ymm6              %xmm6              7th FP argument
%ymm7              %xmm7              8th FP argument
%ymm8              %xmm8              Caller saved
%ymm9              %xmm9              Caller saved
%ymm10             %xmm10             Caller saved
%ymm11             %xmm11             Caller saved
%ymm12             %xmm12             Caller saved
%ymm13             %xmm13             Caller saved
%ymm14             %xmm14             Caller saved
%ymm15             %xmm15             Caller saved

Media registers. These registers hold floating-point data. Each YMM register holds 32 bytes; its low-order 16 bytes can be accessed as the corresponding XMM register.
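To make the argument convention in the table concrete, here is a minimal C sketch. The function name mix_args is hypothetical. Under the x86-64 System V convention, integer and floating-point arguments are assigned to their register classes independently, so the register assignments in the comments are what one would expect on Linux; actual compiler output may differ.

```c
/* Hypothetical example: integer and floating-point arguments are
   numbered separately when they are assigned to registers. */
double mix_args(int x, double y, long z)
{
    /* Expected on x86-64 Linux: x in %edi, y in %xmm0, z in %rsi.
       y is the 1st FP argument even though it is the 2nd parameter. */
    return (double) x + y + (double) z;   /* result returned in %xmm0 */
}
```

Compiling with something like gcc -O1 -S and reading the generated assembly is one way to check the assignment on a given system.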

Floating point transfer and conversion operations

Instruction   Source   Destination   Description
vmovss        M32      X             Move single precision
vmovss        X        M32           Move single precision
vmovsd        M64      X             Move double precision
vmovsd        X        M64           Move double precision
vmovaps       X        X             Move aligned, packed single precision
vmovapd       X        X             Move aligned, packed double precision

Floating-point movement instructions. These operations transfer values between memory and registers, and between pairs of registers (X: XMM register (e.g., %xmm3); M32: 32-bit memory range; M64: 64-bit memory range).
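As an illustration of where these moves show up, consider the small C function below. The name float_mov and the instructions mentioned in the comments are an assumption about what a compiler such as GCC might emit, not a guaranteed listing.

```c
/* Read *src into a local, store v1 to *dst, and return the old *src value. */
float float_mov(float v1, float *src, float *dst)
{
    float v2 = *src;   /* memory -> register, e.g. vmovss (%rdi), %xmm0   */
    *dst = v1;         /* register -> memory, e.g. vmovss %xmm1, (%rsi)   */
    return v2;         /* the return value is left in %xmm0               */
}
```

Because v1 arrives in %xmm0 and the return value must also end up in %xmm0, the compiler typically needs a register-to-register copy (such as vmovaps) to keep v1 alive while *src is loaded.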

Instruction   Source   Destination   Description
vcvttss2si    X/M32    R32           Convert single precision to integer, with truncation
vcvttsd2si    X/M64    R32           Convert double precision to integer, with truncation
vcvttss2siq   X/M32    R64           Convert single precision to quad-word integer, with truncation
vcvttsd2siq   X/M64    R64           Convert double precision to quad-word integer, with truncation

Two-operand floating-point conversion instructions. These operations convert floating-point values to integers (X: XMM register (e.g., %xmm3); R32: 32-bit general-purpose register (e.g., %eax); R64: 64-bit general-purpose register (e.g., %rax); M32: 32-bit memory range; M64: 64-bit memory range).
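In C, these conversions correspond to casts from a floating-point type to an integer type, which truncate toward zero. The sketch below uses hypothetical function names; the instruction named in each comment is the one a compiler would plausibly choose on x86-64 Linux, where long is 64 bits.

```c
/* Casts from floating point to integer discard the fractional part. */
int float_to_int(float f)
{
    return (int) f;     /* plausibly vcvttss2si %xmm0, %eax  */
}

long double_to_long(double d)
{
    return (long) d;    /* plausibly vcvttsd2siq %xmm0, %rax */
}
```

Truncation means 3.9 becomes 3 and -3.9 becomes -3, rather than rounding to the nearest integer.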

Instruction   Source 1   Source 2   Destination   Description
vcvtsi2ss     M32/R32    X          X             Convert integer to single precision
vcvtsi2sd     M32/R32    X          X             Convert integer to double precision
vcvtsi2ssq    M64/R64    X          X             Convert quad-word integer to single precision
vcvtsi2sdq    M64/R64    X          X             Convert quad-word integer to double precision

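Three-operand floating-point conversion instructions. These operations convert from the data type of the first source to the data type of the destination. The second source value has no effect on the low-order bytes of the result (X: XMM register (e.g., %xmm3); R32: 32-bit general-purpose register; R64: 64-bit general-purpose register; M32: 32-bit memory range; M64: 64-bit memory range).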
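The integer-to-floating-point direction corresponds to the opposite casts in C. This is a minimal sketch with hypothetical function names; the instructions in the comments are a plausible choice, not a guaranteed one. Note that the conversion can round when the integer has more significant bits than the target format can hold.

```c
/* Casts from integer to floating point; the result may be rounded. */
double int_to_double(int i)
{
    return (double) i;   /* plausibly vcvtsi2sd %edi, %xmm0, %xmm0  */
}

float long_to_float(long l)
{
    return (float) l;    /* plausibly vcvtsi2ssq %rdi, %xmm0, %xmm0 */
}
```

For example, 16777217 (2^24 + 1) cannot be represented exactly in single precision, so long_to_float rounds it to 16777216.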

The way GCC converts between single and double precision deserves a separate note (it is not explained in detail here).

Conversion from single to double precision:
vunpcklps %xmm0, %xmm0, %xmm0    Replicate first vector element
vcvtps2pd %xmm0, %xmm0           Convert two vector elements to double
Conversion from double to single precision:
vmovddup %xmm0, %xmm0            Replicate first vector element
vcvtpd2psx %xmm0, %xmm0          Convert two vector elements to single
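At the C level, the two sequences above correspond to plain casts between float and double. The sketch below uses hypothetical function names; whether a compiler emits the packed sequences shown above or a scalar conversion instruction depends on the compiler and its version.

```c
/* Single -> double: always exact, since every float is also a double. */
double widen(float f)
{
    return (double) f;
}

/* Double -> single: may round, since float has fewer significand bits. */
float narrow(double d)
{
    return (float) d;
}
```

Widening never loses information, but narrowing can: 0.1 as a double and 0.1 as a float are different values, so a round trip through narrow does not generally return the original double.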

Arithmetic operations

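Scalar AVX2 floating-point instructions. Each instruction has one (S1) or two (S1, S2) source operands and one destination operand. The first source operand S1 can be either an XMM register or a memory location. The second source operand and the destination operand must both be XMM registers. Each operation has one instruction for single precision and one for double precision. The result is stored in the destination register.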

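Single    Double    Effect               Description
vaddss    vaddsd    D ← S2 + S1          Floating-point add
vsubss    vsubsd    D ← S2 - S1          Floating-point subtract
vmulss    vmulsd    D ← S2 × S1          Floating-point multiply
vdivss    vdivsd    D ← S2 / S1          Floating-point divide
vmaxss    vmaxsd    D ← max(S2, S1)      Floating-point maximum
vminss    vminsd    D ← min(S2, S1)      Floating-point minimum
sqrtss    sqrtsd    D ← sqrt(S1)         Floating-point square root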
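A short C function shows how these arithmetic instructions combine with the conversions described earlier. The function name funct and the instruction names in the comments are illustrative assumptions about typical compiler output; the C semantics (operands widened to double before the arithmetic) are fixed by the language.

```c
/* Mixed-type expression: x and i are converted to double first. */
double funct(double a, float x, double b, int i)
{
    /* a * x : x is widened to double, then multiplied  (e.g. vmulsd)
       b / i : i is converted to double, then divided   (e.g. vcvtsi2sd, vdivsd)
       The final subtraction would use vsubsd. */
    return a * x - b / i;
}
```

For instance, funct(2.0, 3.0f, 8.0, 4) computes 2*3 - 8/4 and returns 4.0.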

Bit-level operations

Single    Double    Effect         Description
vxorps    vxorpd    D ← S2 ^ S1    Bitwise EXCLUSIVE-OR
vandps    vandpd    D ← S2 & S1    Bitwise AND

Bit-level operations on packed data (these instructions perform a Boolean operation on all 128 bits of an XMM register).
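Compilers commonly use these bitwise instructions to manipulate the sign bit of a floating-point value. The sketch below imitates that in portable C by copying the bits of a double into a 64-bit integer; the function names are hypothetical, and the code assumes IEEE 754 doubles with the sign in the most significant bit.

```c
#include <stdint.h>
#include <string.h>

/* Clear the sign bit with AND, the same effect as vandpd with a mask. */
double abs_by_mask(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);            /* reinterpret the bits */
    bits &= UINT64_C(0x7FFFFFFFFFFFFFFF);      /* keep everything but the sign */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Flip the sign bit with XOR, the same effect as vxorpd with a mask. */
double negate_by_mask(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits ^= UINT64_C(0x8000000000000000);      /* toggle only the sign */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

XORing a register with itself is also a common way to produce the value 0.0, since the all-zero bit pattern represents positive zero.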

Comparison operations

Instruction       Based on    Description
ucomiss S1, S2    S2 - S1     Compare single precision
ucomisd S1, S2    S2 - S1     Compare double precision

Argument S2 must be in an XMM register, while S1 can be either in an XMM register or in memory.

The condition codes are set as follows:

Ordering S2:S1     CF   ZF   PF (parity flag)
Unordered (NaN)    1    1    1
S2 < S1            1    0    0
S2 = S1            0    1    0
S2 > S1            0    0    0
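The four outcomes in the table can be reproduced in C. The following is a minimal sketch with hypothetical names (cmp_result, compare_doubles); it classifies a pair of doubles the same way the flags do, treating any comparison that involves NaN as unordered.

```c
#include <math.h>

typedef enum { CMP_UNORDERED, CMP_LESS, CMP_EQUAL, CMP_GREATER } cmp_result;

/* Classify s2 relative to s1, mirroring the four rows of the table. */
cmp_result compare_doubles(double s2, double s1)
{
    if (isnan(s2) || isnan(s1))
        return CMP_UNORDERED;   /* CF = ZF = PF = 1 */
    if (s2 < s1)
        return CMP_LESS;        /* CF = 1           */
    if (s2 == s1)
        return CMP_EQUAL;       /* ZF = 1           */
    return CMP_GREATER;         /* all three clear  */
}
```

In generated code, the unordered case is usually detected with a jump on the parity flag (jp) right after the comparison, because PF is set only when one of the operands is NaN.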
