【Game Engine Architecture 6】
1、Data-Parallel Computations
A GPU is a specialized coprocessor designed to accelerate computations that involve a high degree of data parallelism. It does so by combining SIMD parallelism (vectorized ALUs) with MIMD parallelism (by employing a form of preemptive multithreading). NVIDIA coined the term single instruction multiple thread (SIMT) to describe this SIMD/MIMD hybrid design. Although the term comes from NVIDIA, all GPUs employ the general principles of SIMT parallelism in their designs.
In order for a computational task to be well-suited to execution on a GPU, the computations performed on any one element of the dataset must be independent of the results of computations on other elements.
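To make the independence requirement concrete, here is a minimal sketch contrasting two loops. The function names ScaleArray and RunningSum are hypothetical and chosen only for illustration; they do not come from the original text.

```cpp
#include <cstddef>

// Data-parallel: r[i] depends only on a[i] and the constant k, so every
// element could be computed at the same time and in any order.
void ScaleArray(std::size_t count, float r[], const float a[], float k)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        r[i] = a[i] * k;
    }
}

// Not data-parallel as written: each result depends on the running total
// left behind by the previous iteration, so the iterations must run in order.
void RunningSum(std::size_t count, float r[], const float a[])
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < count; ++i)
    {
        sum += a[i];   // loop-carried dependency on the previous iteration
        r[i] = sum;
    }
}
```

A loop shaped like ScaleArray maps directly onto a GPU. A loop shaped like RunningSum would first have to be restructured into a different algorithm before it could benefit.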
void DotArrays_ref(unsigned count,
                   float r[],
                   const float a[],
                   const float b[])
{
    for (unsigned i = 0; i < count; ++i)
    {
        // treat each block of four floats as a
        // single four-element vector
        const unsigned j = i * 4;
        r[i] = a[j+0]*b[j+0]   // ax*bx
             + a[j+1]*b[j+1]   // ay*by
             + a[j+2]*b[j+2]   // az*bz
             + a[j+3]*b[j+3];  // aw*bw
    }
}
In the code above, the computation performed by each iteration of the loop is independent of the computations performed by the other iterations.
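One informal way to check that the iterations really are independent is to run them in a different order and confirm that the output does not change. The sketch below is an illustration only: DotArrays_reversed is a hypothetical name, and the function performs the same four-element dot products while walking the index from the last element down to the first.

```cpp
// Same per-element computation as DotArrays_ref, but the elements are
// visited in reverse order. Because no iteration reads a value written
// by another iteration, the contents of r[] end up identical.
void DotArrays_reversed(unsigned count,
                        float r[],
                        const float a[],
                        const float b[])
{
    for (unsigned i = count; i-- > 0; )
    {
        const unsigned j = i * 4;
        r[i] = a[j+0]*b[j+0]
             + a[j+1]*b[j+1]
             + a[j+2]*b[j+2]
             + a[j+3]*b[j+3];
    }
}
```

If the order of the iterations does not matter, nothing prevents them from running at the same time, which is the property a GPU exploits.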
2、Compute Kernels
GPGPU compute kernels are typically written in a special shading language which can be compiled into machine code that’s understandable by the GPU.
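The shape of a compute kernel can be sketched on the CPU by pulling the loop body out into its own function that handles exactly one element. This is plain C++ used as a stand-in, not an actual GPU shading language, and the names DotKernel and DispatchDotKernel are hypothetical. On a real GPU the kernel would be written in a GPU language, and the runtime would supply the index and launch the invocations in parallel rather than through a loop.

```cpp
// "Kernel": computes one output element, identified by index i.
// It touches only r[i] and the four floats of a[] and b[] that belong to i.
void DotKernel(unsigned i, float r[], const float a[], const float b[])
{
    const unsigned j = i * 4;
    r[i] = a[j+0]*b[j+0]
         + a[j+1]*b[j+1]
         + a[j+2]*b[j+2]
         + a[j+3]*b[j+3];
}

// CPU stand-in for a GPU dispatch: invokes the kernel once per element.
// A GPU would run these invocations concurrently instead of one by one.
void DispatchDotKernel(unsigned count, float r[], const float a[], const float b[])
{
    for (unsigned i = 0; i < count; ++i)
    {
        DotKernel(i, r, a, b);
    }
}
```

Writing the work as a per-element function is what makes it easy to hand over to many threads: each invocation needs only its own index and read access to the input arrays.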
3、
4、
5、
6、