CUDA PROGRAM STRUCTURE
A typical CUDA program structure consists of five main steps:
1. Allocate GPU memory.
2. Copy data from CPU memory to GPU memory.
3. Invoke the CUDA kernel to perform program-specific computation.
4. Copy data back from GPU memory to CPU memory.
5. Free GPU memory.
In the simple program hello.cu, you only see the third step: Invoke the kernel. For
the remainder of this book, examples will demonstrate each step in the CUDA program
structure.
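The five steps can be sketched in a single small program. This is a minimal illustration, not a listing from the book: the kernel scaleByTwo, the helper scaleOnGpu, and the array size are hypothetical names and values chosen for the example. Only the runtime calls cudaMalloc, cudaMemcpy, cudaFree, cudaGetLastError, and cudaDeviceReset are part of the CUDA runtime API.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: doubles each element of an array.
__global__ void scaleByTwo(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= 2.0f;
    }
}

// Hypothetical helper that walks through the five steps.
// Returns 0 on success and 1 if any CUDA call fails.
int scaleOnGpu(float *h_data, int n)
{
    size_t nBytes = (size_t)n * sizeof(float);
    float *d_data = NULL;

    // Step 1: allocate GPU memory.
    if (cudaMalloc((void **)&d_data, nBytes) != cudaSuccess) {
        return 1;
    }

    // Step 2: copy data from CPU memory to GPU memory.
    if (cudaMemcpy(d_data, h_data, nBytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(d_data);
        return 1;
    }

    // Step 3: invoke the kernel with enough threads to cover n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scaleByTwo<<<blocks, threads>>>(d_data, n);
    if (cudaGetLastError() != cudaSuccess) {
        cudaFree(d_data);
        return 1;
    }

    // Step 4: copy data back from GPU memory to CPU memory.
    // On the default stream this copy waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(h_data, d_data, nBytes, cudaMemcpyDeviceToHost);

    // Step 5: free GPU memory.
    cudaFree(d_data);

    return (err == cudaSuccess) ? 0 : 1;
}

int main(void)
{
    const int n = 1024;
    float *h_data = (float *)malloc(n * sizeof(float));
    if (h_data == NULL) {
        return 1;
    }
    for (int i = 0; i < n; i++) {
        h_data[i] = (float)i;
    }

    int status = scaleOnGpu(h_data, n);
    if (status == 0) {
        printf("h_data[10] = %.1f\n", h_data[10]);
    } else {
        fprintf(stderr, "A CUDA call failed.\n");
    }

    free(h_data);
    cudaDeviceReset();
    return status;
}
```

Assuming the file is saved as steps.cu (a hypothetical name), it can be compiled with nvcc steps.cu -o steps and run on a machine with a CUDA-capable GPU. Checking the return value of each runtime call is a design choice of this sketch. It keeps a failed allocation or copy from being mistaken for a wrong result.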
#include <stdio.h>
#include "cuda_runtime.h"

__global__ void helloFromGPU(void)
{
    printf("Hello World from GPU!\n");
}

int main(void)
{
    // hello from cpu
    printf("Hello World from CPU!\n");

    helloFromGPU<<<10, 1>>>();
    cudaDeviceReset();
    return 0;
}