On-device GPU OpenCL cast operator

In TFLite, you will often find that the cast operator is not supported on the GPU. The main reason is probably that OpenCL's support for casts is limited.

OpenCL C does not allow a packed (vector) data type to be cast directly to another vector type with an implicit conversion or a C-style cast. The approach used here casts element by element. For example, this does not compile:

int4 src = in[0];
half4 dst = src;  // error: no implicit conversion between vector types

but this works:

int4 src = in[0];
half4 dst;
dst.s0=src.s0;
dst.s1=src.s1;
dst.s2=src.s2;
dst.s3=src.s3;

Even element by element, many data types cannot be converted directly. For example, long or bool cannot be cast directly to half and must go through another data type first. Some OpenCL compilers accept a single common data type such as float as the intermediate for every conversion, but others require more than one intermediate type.

My solution is to introduce an intermediate data type for each of the src and dst data types. First, create a map from each data type to its intermediate type: integer-like types such as bool, char, and long use int as the intermediate type, and the floating-point types half, float, and double use float.
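The mapping could look roughly like the sketch below. The function name `MidType`, the use of type-name strings as keys, and the `short` and `int` entries (filling in the "etc.") are assumptions for illustration, not the original implementation.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical lookup: maps an OpenCL scalar type name to the intermediate
// type used when converting to or from it. Integer-like types go through
// int, floating-point types go through float. Names that are not in the
// table are returned unchanged.
std::string MidType(const std::string& dtype) {
  static const std::unordered_map<std::string, std::string> kMidType = {
      {"bool", "int"},   {"char", "int"},    {"short", "int"},
      {"int", "int"},    {"long", "int"},    {"half", "float"},
      {"float", "float"}, {"double", "float"},
  };
  auto it = kMidType.find(dtype);
  return it == kMidType.end() ? dtype : it->second;
}
```

For instance, `MidType("long")` gives `"int"` and `MidType("half")` gives `"float"`, so a long to half cast would pass through int and then float.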

Then create a vector<dtype> holding four elements: src_dtype, src_mid_dtype, dst_mid_dtype, and dst_dtype. Remove adjacent identical elements from this vector to avoid redundant conversions.
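A minimal sketch of this step, assuming data types are represented as strings and that the two intermediate types have already been looked up. The name `ConversionChain` is hypothetical; `std::unique` is used because it removes only adjacent duplicates, which is what the step describes.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: builds the ordered list of types a value passes
// through, then collapses runs of adjacent identical types so that no
// conversion from a type to itself is generated.
std::vector<std::string> ConversionChain(const std::string& src,
                                         const std::string& src_mid,
                                         const std::string& dst_mid,
                                         const std::string& dst) {
  std::vector<std::string> chain = {src, src_mid, dst_mid, dst};
  // std::unique shifts the kept elements forward and returns the new
  // logical end; erase drops the leftover tail.
  chain.erase(std::unique(chain.begin(), chain.end()), chain.end());
  return chain;
}
```

With this sketch, long to half keeps all four steps (long, int, float, half), while int to float collapses from (int, int, float, float) to (int, float). Because only adjacent duplicates are removed, half to half would still yield (half, float, half), so a caller would probably want to handle identical src and dst types before building the chain.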

Based on this vector, the code generator automatically emits the intermediate variable declarations and assignments as strings.
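The string-based generation might be sketched as follows. `GenerateCast`, the `tmpN` variable naming, and the fixed 4-wide vector width are assumptions made for illustration; the sketch also assumes every type after the first in the chain has a 4-wide vector form in OpenCL C.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical generator: emits OpenCL C source text that converts the
// 4-wide vector variable `src_var` through each type in `chain`, one lane
// at a time. chain[0] is the type of `src_var`; the last entry is the type
// of `dst_var`. Intermediate results are stored in tmp1, tmp2, ...
std::string GenerateCast(const std::vector<std::string>& chain,
                         const std::string& src_var,
                         const std::string& dst_var) {
  static const char* const kLanes[] = {"s0", "s1", "s2", "s3"};
  std::string code;
  std::string prev = src_var;
  for (std::size_t i = 1; i < chain.size(); ++i) {
    const bool last = (i + 1 == chain.size());
    const std::string cur = last ? dst_var : "tmp" + std::to_string(i);
    // Declare the next variable in the chain, e.g. "float4 tmp1;".
    code += chain[i] + "4 " + cur + ";\n";
    // Assign lane by lane, relying on scalar conversion for each element.
    for (const char* lane : kLanes) {
      code += cur + "." + lane + " = " + prev + "." + lane + ";\n";
    }
    prev = cur;
  }
  return code;
}
```

Calling `GenerateCast({"int", "half"}, "src", "dst")` produces a `half4 dst;` declaration followed by four per-lane assignments, matching the hand-written example earlier. A chain with a single element produces no code, in which case the caller can assign the source to the destination directly.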

Origin blog.csdn.net/u013701860/article/details/128417525