Testing Floating-Point Arithmetic on ARM

An overview of floating-point support in ARMv8-A/R and ARMv7-A/R can be found here: ARM Floating Point

What I care about most is how well SIMD (which on ARM roughly means NEON) supports floating-point arithmetic.
In AArch64 state, ARMv8 supports IEEE 754 fairly well:

Floating-point support in AArch64 state SIMD is IEEE 754-2008 compliant with:
Configurable rounding modes
Configurable Default NaN behavior
Configurable Flush-to-zero behavior
Floating-point computation using AArch32 Advanced SIMD instructions remains unchanged from Armv7.
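
The three "configurable" items above are controlled through fields of the AArch64 FPCR register. The following is a minimal sketch, not taken from the original post: the names fpcr_settings, fpcr_decode and fpcr_read are hypothetical, and the bit positions (RMode in bits 23:22, FZ in bit 24, DN in bit 25) are my reading of the Armv8-A architecture and should be checked against the Arm reference manual.

```c
#include <stdint.h>

/* FPCR field positions (assumed from the Armv8-A architecture). */
#define FPCR_RMODE_SHIFT 22u
#define FPCR_RMODE_MASK  0x3u
#define FPCR_FZ_BIT      (UINT64_C(1) << 24)  /* flush-to-zero */
#define FPCR_DN_BIT      (UINT64_C(1) << 25)  /* default NaN   */

/* Hypothetical container for the decoded settings. */
typedef struct {
    unsigned rounding_mode;  /* 0 nearest, 1 toward +inf, 2 toward -inf, 3 toward zero */
    int flush_to_zero;       /* 1 if denormals are flushed to zero */
    int default_nan;         /* 1 if only the default NaN is produced */
} fpcr_settings;

/* Decode a raw FPCR value into its three configurable behaviors. */
static fpcr_settings fpcr_decode(uint64_t fpcr)
{
    fpcr_settings s;
    s.rounding_mode = (unsigned)((fpcr >> FPCR_RMODE_SHIFT) & FPCR_RMODE_MASK);
    s.flush_to_zero = (fpcr & FPCR_FZ_BIT) != 0;
    s.default_nan   = (fpcr & FPCR_DN_BIT) != 0;
    return s;
}

#if defined(__aarch64__)
/* Read the live FPCR; only compiled when targeting AArch64. */
static uint64_t fpcr_read(void)
{
    uint64_t value;
    __asm__ volatile("mrs %0, fpcr" : "=r"(value));
    return value;
}
#endif
```

On an AArch64 target, fpcr_decode(fpcr_read()) would report the current settings; on other hosts only the decoder is available.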

ARMv7 is somewhat weaker:

The Armv7-A/R Advanced SIMD extension (NEON) offers single-precision floating-point support and performs IEEE 754 floating-point arithmetic with the following restrictions:
Denormalized numbers are flushed to zero
Only default NaNs are supported
The Round to Nearest rounding mode is used
Untrapped floating-point exception handling is used for all floating-point exceptions

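So when using NEON to accelerate floating-point arithmetic on an ARMv7 platform, check very carefully that the accuracy is sufficient! Take the SLEEF Vectorized Math Library as an example: statements like the following appear throughout SLEEF's AArch32 reference: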

This function may less accurate than the scalar function since AArch32 NEON is not IEEE 754-compliant.

This is a good moment to review how IEEE floats work (figure source: CSAPP):
[Figure: IEEE float]
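
As a companion to that review, here is a minimal sketch that splits a 32-bit float into its sign, exponent and fraction fields and checks for the denormalized case (exponent field all zeros, fraction non-zero). The names f32_fields, f32_decode and f32_is_denormal are hypothetical, and the code assumes that float is the IEEE 754 binary32 format.

```c
#include <stdint.h>
#include <string.h>

/* binary32 layout: 1 sign bit, 8 exponent bits, 23 fraction bits. */
typedef struct {
    uint32_t sign;      /* 0 or 1 */
    uint32_t exponent;  /* biased exponent field, 0..255 */
    uint32_t fraction;  /* 23-bit fraction field */
} f32_fields;

/* Split a float into its raw bit fields (assumes float is binary32). */
static f32_fields f32_decode(float value)
{
    uint32_t bits;
    f32_fields f;
    memcpy(&bits, &value, sizeof bits);  /* copy the bits without aliasing issues */
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}

/* Denormalized: exponent field is all zeros and the fraction is non-zero. */
static int f32_is_denormal(float value)
{
    f32_fields f = f32_decode(value);
    return f.exponent == 0 && f.fraction != 0;
}
```

For instance, f32_is_denormal(1.0e-39f) is true, while 1.0f decodes to a biased exponent of 127 and is a normalized number.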
Let's test the item "Denormalized numbers are flushed to zero". The C and NEON versions of the code are as follows:

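#include <stdio.h>
#include <arm_neon.h>

#define ITER_NUM 130

/* Scalar version: plain C floating-point arithmetic. */
void test_float(void)
{
    float x = 0.5f;
    float y = 1.5f;
    int i;

    /* First loop: halve y ITER_NUM times so that it becomes denormalized. */
    for (i = 0; i < ITER_NUM; i++)
        y *= x;

    /* Second loop: only scales y back up so the printed value is readable. */
    for (i = 0; i < 10; i++)
        y *= 10000;
    printf("y %d\r\n", (int)y);
}

/* NEON version: the same halving loop done with Advanced SIMD multiplies. */
void test_float_neon(void)
{
    float y[2];
    float32x2_t fvecx = vdup_n_f32(0.5f);
    float32x2_t fvecy = vdup_n_f32(1.5f);
    int i;

    /* First loop: halve both lanes ITER_NUM times. */
    for (i = 0; i < ITER_NUM; i++)
        fvecy = vmul_f32(fvecx, fvecy);

    /* Store the vector, then scale lane 0 back up for display. */
    vst1_f32(y, fvecy);
    for (i = 0; i < 10; i++)
        y[0] *= 10000;

    printf("y %d\r\n", (int)y[0]);
}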

Under IEEE 754, the smallest positive normalized 32-bit float is

#define FLT_MIN         1.175494351e-38F        /* min positive value */

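With ITER_NUM defined as 130, y is about 1.102e-39 (1.5 × 2^-130) after the first loop, which is clearly below the smallest normalized number, so y is now a denormalized number. On an ARMv7 platform, test_float_neon() should therefore end up printing 0, while on an ARMv8 platform in AArch64 state its output should match the C version test_float() and be non-zero.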
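
The arithmetic can be checked on any host with a small software emulation. This is only a sketch under stated assumptions: flush_denormal, halve_repeatedly and scale_for_display are hypothetical helpers, and flushing every product whose magnitude is below FLT_MIN is my approximation of the NEON flush-to-zero behavior, not a model of the real hardware.

```c
#include <float.h>

#define ITER_NUM 130

/* Hypothetical helper: software flush-to-zero. Anything smaller in
   magnitude than FLT_MIN becomes 0 (the sign of zero is not preserved
   in this simplified sketch). */
static float flush_denormal(float v)
{
    return (v > -FLT_MIN && v < FLT_MIN) ? 0.0f : v;
}

/* Repeat the first loop of the test. When ftz is non-zero, every
   product is flushed, imitating an ARMv7 NEON multiply. */
static float halve_repeatedly(int ftz)
{
    volatile float y = 1.5f;  /* volatile discourages constant folding */
    int i;
    for (i = 0; i < ITER_NUM; i++) {
        float product = y * 0.5f;
        y = ftz ? flush_denormal(product) : product;
    }
    return y;
}

/* Repeat the second loop: scale by 10000 ten times and truncate to int. */
static int scale_for_display(float y)
{
    int i;
    for (i = 0; i < 10; i++)
        y *= 10000;
    return (int)y;
}
```

Without flushing, halve_repeatedly(0) stays positive but below FLT_MIN, and scale_for_display() turns it into 11. With flushing, the value collapses to 0 as soon as a product drops below FLT_MIN and stays there.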

I emulated an ARMv7 platform with QEMU's mcimx6ul-evk machine and an ARMv8 platform with virt -cpu cortex-a57, and the results were indeed as expected.

BTW, to get the same "Denormalized numbers are flushed to zero" behavior in Visual Studio (VS), you can write:

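#include <xmmintrin.h>
#include <pmmintrin.h>
#include <stdio.h>

void test_float(void);  /* the scalar test defined above */

int main(void)
{
    /* Flush denormalized results to zero. */
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    /* Treat denormalized inputs as zero. */
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);

    test_float();
    return 0;
}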

Origin blog.csdn.net/u013213111/article/details/111768821