Introduction to some common SSE2 intrinsics

Straight to the point: some time ago I studied OpenCV's FAST algorithm. Its implementation uses many SSE2 intrinsics, and at first they confused me a great deal. Below I introduce the intrinsics I picked up along the way. I hope this is helpful to you!

__m128i is a 128-bit integer vector type. To initialize it you can call, for example, _mm_set1_epi8 or _mm_set1_epi16, both of which return an __m128i. The former fills the 128 bits with 16 copies of one 8-bit integer; the latter fills them with 8 copies of one 16-bit integer.
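The broadcast behaviour described above can be sketched as follows. This is a minimal illustration, not code from OpenCV: the helper names broadcast_u8 and broadcast_i16 are made up for this example, and _mm_storeu_si128 (an SSE2 intrinsic not otherwise covered in this article) is used only to copy the lanes back into an ordinary array so they can be inspected.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: copy one byte into all 16 lanes, then write the lanes to out. */
static void broadcast_u8(uint8_t value, uint8_t out[16])
{
    __m128i v = _mm_set1_epi8((char)value);   /* 16 x 8-bit lanes, all equal */
    _mm_storeu_si128((__m128i *)out, v);      /* unaligned store of all 128 bits */
}

/* Hypothetical helper: copy one 16-bit value into all 8 lanes, then write the lanes to out. */
static void broadcast_i16(int16_t value, int16_t out[8])
{
    __m128i v = _mm_set1_epi16(value);        /* 8 x 16-bit lanes, all equal */
    _mm_storeu_si128((__m128i *)out, v);
}
```

After calling broadcast_u8(200, buf), every one of the 16 bytes in buf holds 200; after broadcast_i16(-1234, w), every one of the 8 elements of w holds -1234.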

_mm_loadu_si128(p) means: load a 128-bit value from the address p, which does not need to be 16-byte aligned.
_mm_max_epu8(a,b) means: compare the corresponding unsigned 8-bit integers in a and b and take the larger value, for each of the 16 positions. That is: r0=max(a0,b0),...,r15=max(a15,b15)
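_mm_min_epi8(a,b) means: the same pattern as above, except that it takes the smaller value and treats the operands as signed 8-bit integers. That is: r0=min(a0,b0),...,r15=min(a15,b15). Note that this signed 8-bit variant was only added in SSE4.1; SSE2 itself provides _mm_min_epu8 (unsigned 8-bit) and _mm_min_epi16 (signed 16-bit).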
_mm_setzero_si128() means: return a 128-bit value with all bits set to 0.
_mm_subs_epu8(a,b) means: subtract the corresponding unsigned 8-bit integers in b from those in a with saturation, so a negative result becomes 0. That is: r0=UnsignedSaturate(a0-b0),...,r15=UnsignedSaturate(a15-b15)
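_mm_adds_epi8(a,b) means: add the corresponding signed 8-bit integers in a and b with saturation, so the result is clamped to the range -128..127. That is: r0=SignedSaturate(a0+b0),...,r15=SignedSaturate(a15+b15)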
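To make the load, max, and saturating arithmetic descriptions above concrete, here is a small sketch. The helper names (max_u8x16, subs_u8x16, adds_i8x16) are hypothetical and chosen only for this illustration, and _mm_storeu_si128 is again used just to get the result back into a plain array.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: r[i] = max(a[i], b[i]) for 16 unsigned bytes. */
static void max_u8x16(const uint8_t *a, const uint8_t *b, uint8_t *r)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);  /* unaligned 128-bit load */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)r, _mm_max_epu8(va, vb));
}

/* Hypothetical helper: r[i] = a[i] - b[i], clamped to 0 instead of wrapping around. */
static void subs_u8x16(const uint8_t *a, const uint8_t *b, uint8_t *r)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)r, _mm_subs_epu8(va, vb));
}

/* Hypothetical helper: r[i] = a[i] + b[i], clamped to the range [-128, 127]. */
static void adds_i8x16(const int8_t *a, const int8_t *b, int8_t *r)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)r, _mm_adds_epi8(va, vb));
}
```

For instance, with a[i]=10 and b[i]=75, subs_u8x16 yields 0 rather than the wrapped value 191, and adds_i8x16 on 100+100 yields 127 rather than overflowing.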

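_mm_unpackhi_epi64(a,b) means: interleave the high 64 bits of a and b and discard the low 64 bits of both. That is, the low 64 bits of the result are the high 64 bits of a, and the high 64 bits of the result are the high 64 bits of b.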
_mm_srli_si128(a, imm) means: logically shift the whole 128-bit value a to the right by imm bytes (not bits), filling the high bytes with 0. imm must be a compile-time constant.
_mm_cvtsi128_si32(a) means: return the low 32 bits of a as a 32-bit integer, that is, r=a0, where a0 is the lowest 32-bit element of a.
_mm_xor_si128(a,b) means: bitwise XOR of a and b, that is, r=a^b.
_mm_or_si128(a,b) means: bitwise OR of a and b, that is, r=a|b.
_mm_and_si128(a,b) means: bitwise AND of a and b, that is, r=a&b.
_mm_cmpgt_epi8(a,b) means: compare each signed 8-bit integer of a with the 8-bit integer at the corresponding position of b; where a is greater the result byte is 0xff, otherwise it is 0x0.
That is, r0=(a0>b0)?0xff:0x0 r1=(a1>b1)?0xff:0x0...r15=(a15>b15)?0xff:0x0
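The shuffle, shift, logic, and compare intrinsics above are usually combined rather than used alone. The sketch below shows three common patterns under stated assumptions: the function names are hypothetical, the code is written for illustration and is not copied from OpenCV, and _mm_storeu_si128 is used only to write results to a plain array. The first function folds a vector in half repeatedly to find the largest of 16 bytes; the second uses the XOR-with-0x80 trick to get an unsigned comparison out of the signed _mm_cmpgt_epi8; the third uses the comparison mask to select between two vectors.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: largest of 16 unsigned bytes, found by folding the vector in half. */
static uint8_t horizontal_max_u8(const uint8_t *p)
{
    __m128i v = _mm_loadu_si128((const __m128i *)p);
    v = _mm_max_epu8(v, _mm_unpackhi_epi64(v, v)); /* bytes 0..7 = max(v[i], v[i+8]) */
    v = _mm_max_epu8(v, _mm_srli_si128(v, 4));     /* bytes 0..3 now cover bytes 0..7 */
    v = _mm_max_epu8(v, _mm_srli_si128(v, 2));     /* bytes 0..1 now cover bytes 0..3 */
    v = _mm_max_epu8(v, _mm_srli_si128(v, 1));     /* byte 0 holds the overall maximum */
    return (uint8_t)_mm_cvtsi128_si32(v);          /* keep only the lowest byte */
}

/* Hypothetical helper: mask[i] = 0xff where a[i] > b[i] as UNSIGNED bytes, else 0x00. */
static void cmpgt_u8x16(const uint8_t *a, const uint8_t *b, uint8_t *mask)
{
    const __m128i bias = _mm_set1_epi8(-128);      /* 0x80 in every byte */
    /* Flipping the top bit maps unsigned order onto signed order. */
    __m128i va = _mm_xor_si128(_mm_loadu_si128((const __m128i *)a), bias);
    __m128i vb = _mm_xor_si128(_mm_loadu_si128((const __m128i *)b), bias);
    _mm_storeu_si128((__m128i *)mask, _mm_cmpgt_epi8(va, vb));
}

/* Hypothetical helper: signed per-byte max built from compare, AND, and XOR only. */
static void max_i8x16(const int8_t *a, const int8_t *b, int8_t *r)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i mask = _mm_cmpgt_epi8(va, vb);         /* 0xff where a[i] > b[i] */
    /* b ^ ((a ^ b) & mask) equals a where mask is 0xff and b where mask is 0x00. */
    __m128i sel = _mm_xor_si128(vb, _mm_and_si128(_mm_xor_si128(va, vb), mask));
    _mm_storeu_si128((__m128i *)r, sel);
}
```

The mask-and-select form in max_i8x16 is one way to get a signed 8-bit maximum on hardware that only supports SSE2, since the dedicated signed 8-bit min/max intrinsics arrived with SSE4.1.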
