CUDA programming (3): Hello world

CUDA programming

CUDA stands for Compute Unified Device Architecture. Nvidia launched it in 2007. The original intention was to give the GPU an easy-to-use programming interface, so that developers would not need to learn complex shading languages or graphics processing primitives.

CUDA provides two layers of API for developers to use:

  • CUDA driver API: a low-level API that is more difficult to use but provides more control over GPU devices.
  • CUDA runtime API: a set of high-level APIs built on top of the CUDA driver API, which is easier to use.
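
As a rough illustration of the two layers, the sketch below asks both APIs the same question: how many CUDA devices are visible. The helper names runtime_device_count and driver_device_count are made up for this example; cudaGetDeviceCount, cuInit and cuDeviceGetCount are the runtime and driver calls they wrap. It assumes the program is linked against the driver library, for example with nvcc count.cu -o count -lcuda (the file name is hypothetical).

```cuda
#include <stdio.h>
#include <cuda.h>          // driver API (cu* functions)
#include <cuda_runtime.h>  // runtime API (cuda* functions)

// Runtime API: no explicit initialization call is needed.
// Returns the device count, or -1 on error.
int runtime_device_count(void) {
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess) return -1;
    return n;
}

// Driver API: the driver must be initialized explicitly with cuInit().
// Returns the device count, or -1 on error.
int driver_device_count(void) {
    int n = 0;
    if (cuInit(0) != CUDA_SUCCESS) return -1;
    if (cuDeviceGetCount(&n) != CUDA_SUCCESS) return -1;
    return n;
}

int main(void) {
    printf("runtime API sees %d device(s)\n", runtime_device_count());
    printf("driver API sees %d device(s)\n", driver_device_count());
    return 0;
}
```

The driver version needs one extra step even for this trivial query, which hints at why the runtime API is the usual starting point.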

Hello world

To learn any programming language, you usually start with the Hello world program. Here is the Hello world program code for CUDA programming:

#include <stdio.h>

// Kernel function: runs on the GPU, once for every launched thread.
__global__ void hello_world(void) {
    printf("GPU:Hello World!\n");
}

int main(void) {
    // CPU:Hello World!
    printf("CPU:Hello World!\n");

    // GPU:Hello World!
    hello_world<<<1, 10>>>();

    // Error handling: check whether the kernel launch failed.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("CUDA Error: %s\n", cudaGetErrorString(err));
        // Possibly: exit(-1) if program cannot continue....
    }

    // cudaDeviceReset() explicitly releases and clears all resources
    // associated with the current device in the current process.
    cudaDeviceReset();

    return 0;
}

Usually, when the CPU launches a kernel function, it specifies the number of thread blocks that execute the kernel and the number of threads in each block. The body of the kernel is therefore executed in parallel by (number of thread blocks × number of threads per block) threads. hello_world<<<1, 10>>>(); launches 1 block of 10 threads, each of which runs hello_world, so GPU:Hello World! is printed 10 times. This is SIMT, that is, single instruction, multiple threads: multiple threads execute the same instruction.
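
To make the blocks × threads arithmetic concrete, here is a minimal sketch in which every thread writes its own global index into an array and the host counts how many slots were filled. The kernel record_ids and the helper count_threads are hypothetical names introduced for this illustration; they are not part of the original program. blockIdx, blockDim and threadIdx are the built-in variables CUDA provides inside a kernel.

```cuda
#include <stdio.h>
#include <stdlib.h>

// Each thread computes its global index and stores it in its own slot.
__global__ void record_ids(int *out) {
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    out[gid] = gid;
}

// Hypothetical helper: launches blocks * threads_per_block threads and
// returns how many slots hold the expected index, or -1 on a CUDA error.
int count_threads(int blocks, int threads_per_block) {
    int n = blocks * threads_per_block;
    int *h_out = (int *)malloc(n * sizeof(int));
    int *d_out = NULL;
    int count = -1;

    if (h_out != NULL && cudaMalloc(&d_out, n * sizeof(int)) == cudaSuccess) {
        // Fill every byte with 0xFF so untouched slots read back as -1.
        cudaMemset(d_out, 0xFF, n * sizeof(int));
        record_ids<<<blocks, threads_per_block>>>(d_out);
        // cudaMemcpy waits for the kernel to finish before copying.
        if (cudaGetLastError() == cudaSuccess &&
            cudaMemcpy(h_out, d_out, n * sizeof(int),
                       cudaMemcpyDeviceToHost) == cudaSuccess) {
            count = 0;
            for (int i = 0; i < n; ++i) {
                if (h_out[i] == i) ++count;
            }
        }
    }
    cudaFree(d_out);  // cudaFree(NULL) is a harmless no-op
    free(h_out);
    return count;
}

int main(void) {
    printf("<<<1, 10>>> ran %d threads\n", count_threads(1, 10));
    printf("<<<2, 4>>> ran %d threads\n", count_threads(2, 4));
    return 0;
}
```

On a working setup the two counts should equal 1 × 10 and 2 × 4 respectively; a result of 0 or -1 suggests the kernel never ran.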

On Linux, use the nvidia-smi command to check whether an NVIDIA accelerator card is present.

Use nvcc -V to check whether the nvcc compiler is installed correctly.

If both commands produce normal output, the hardware and software environment is configured. Compile and run the program with the following commands:

# Compile
nvcc -arch sm_50 hello_world.cu -o hello_world
# Run
./hello_world

The output is one CPU:Hello World! line followed by ten GPU:Hello World! lines.

Note: Check your graphics card's compute capability and change sm_50 to match. If the value used is higher than the card supports, compilation may not report an error, but the following error appears at run time:

CUDA Error: no kernel image is available for execution on the device
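
One way to find the right value is to ask the runtime for the device's compute capability. The sketch below is only an illustration: arch_flag is a hypothetical helper, and it assumes the card of interest is device 0. cudaGetDeviceProperties and the major and minor fields of cudaDeviceProp are part of the CUDA runtime API. Because this program launches no kernel, it should run whichever -arch value it was compiled with.

```cuda
#include <stdio.h>

// Hypothetical helper: writes a string such as "sm_50" for the given
// device into buf. Returns 0 on success, -1 if the query fails.
int arch_flag(int device, char *buf, size_t len) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return -1;
    snprintf(buf, len, "sm_%d%d", prop.major, prop.minor);
    return 0;
}

int main(void) {
    char flag[16];
    // Assumption: the card of interest is device 0.
    if (arch_flag(0, flag, sizeof flag) == 0) {
        printf("nvcc -arch %s hello_world.cu -o hello_world\n", flag);
    } else {
        printf("could not query device 0\n");
    }
    return 0;
}
```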

Origin blog.csdn.net/weixin_43603658/article/details/129912224