1. Write the first program: Hello CUDA
A program written for the CPU in the usual way:
#include <stdio.h>

void hello_from_cpu()
{
    printf("Hello World from the CPU!\n");
}

int main(void)
{
    hello_from_cpu();
    return 0;
}
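The same program written using CUDA. The __global__ qualifier marks hello_from_gpu as a kernel that runs on the GPU, <<<1, 1>>> launches it with one block of one thread, and cudaDeviceSynchronize() makes the host wait for the kernel to finish so that its output appears before the program exits: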
#include <stdio.h>

__global__ void hello_from_gpu()
{
    printf("Hello World from the GPU!\n");
}

int main(void)
{
    hello_from_gpu<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
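The launch above uses a single thread and ignores return codes. As a minimal sketch of how the same program might be extended, the variant below launches several threads and checks for errors. The kernel name hello_from_thread, the CUDA_CHECK macro, and the 2 x 4 launch configuration are illustrative assumptions, not part of the original program; only the CUDA runtime calls (cudaGetLastError, cudaDeviceSynchronize, cudaGetErrorString) are real API.

```cuda
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper macro (not provided by CUDA): print a message and
// exit if a CUDA runtime call returns anything other than cudaSuccess.
#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",            \
                    __FILE__, __LINE__, cudaGetErrorString(err_));  \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

// Each thread prints its own block and thread index.
__global__ void hello_from_thread()
{
    printf("Hello World from block %d, thread %d!\n", blockIdx.x, threadIdx.x);
}

int main(void)
{
    // Assumed launch configuration: 2 blocks of 4 threads each.
    hello_from_thread<<<2, 4>>>();

    // A kernel launch returns no status itself; query launch errors explicitly.
    CUDA_CHECK(cudaGetLastError());

    // Wait for the kernel to finish and surface any execution error.
    CUDA_CHECK(cudaDeviceSynchronize());
    return 0;
}
```

If the launch succeeds, this should print eight lines, one per thread. The order in which the blocks print is not guaranteed.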
2. Compile
Once the program is written, compile it. There are two ways to do this:
1. nvcc
nvcc -arch=compute_72 -code=sm_72 hello_cuda.cu -o hello_cuda -run
2. Makefile
Makefile contents (each recipe line must begin with a tab character):
TEST_SOURCE = hello_cuda.cu
TARGETBIN := ./hello_cuda
CC = /usr/local/cuda/bin/nvcc

$(TARGETBIN): $(TEST_SOURCE)
	$(CC) $(TEST_SOURCE) -o $(TARGETBIN)

.PHONY: clean
clean:
	-rm -rf $(TARGETBIN)
	-rm -rf *.o
On the command line, run:
make
This generates the executable file hello_cuda. To run it, enter:
./hello_cuda
Then use nvprof to view its performance:
nvprof ./hello_cuda