Compile and run rodinia_3.1

Background

rodinia_3.1 is a benchmark suite for GPU-CPU experiments. It includes kmeans, hotspot, and other workloads implemented in OpenMP, OpenCL, and CUDA, but compiling and running it is quite laborious. The following is a record of my entire process from download to run, for reference.

Steps

Download and unzip

Download the compressed package http://www.cs.virginia.edu/~kw5na/lava/Rodinia/Packages/Current/rodinia_3.1.tar.bz2 , then upload it to the Ubuntu server and extract it

root@sundata:/data/szc# tar -jxvf rodinia_3.1.tar.bz2

Make sure you have installed CUDA, OpenCL for CUDA (for installation steps, refer to the article on installing OpenCL and OpenACC), gcc 7, and g++ 7

Modify and copy files

We need to modify some files to configure paths and to match the sm and compute architecture of my GPU

1. Enter the rodinia directory and modify the common/make.config file

root@sundata:/data/szc/rodinia_3.1# vim common/make.config

Set CUDA_DIR, SDK_DIR, and OPENCL_DIR to your own CUDA directory, the C directory under the OpenCL SDK directory, and the OpenCL SDK directory, respectively

# CUDA toolkit installation path
CUDA_DIR = /usr/local/cuda-10.0

# CUDA toolkit libraries
CUDA_LIB_DIR := $(CUDA_DIR)/lib
ifeq ($(shell uname -m), x86_64)
     ifeq ($(shell if test -d $(CUDA_DIR)/lib64; then echo T; else echo F; fi), T)
         CUDA_LIB_DIR := $(CUDA_DIR)/lib64
     endif
endif

# CUDA SDK installation path
#SDK_DIR = $(HOME)/NVIDIA_GPU_Computing_SDK/C
SDK_DIR = /root/NVIDIA_GPU_Computing_SDK/C
#SDK_DIR =/if10/kw5na/NVIDIA_CUDA_Computing_SDK4/C


# OPENCL

# NVIDIA_DIR

OPENCL_DIR =/root/NVIDIA_GPU_Computing_SDK
OPENCL_INC = $(OPENCL_DIR)/OpenCL/common/inc
OPENCL_LIB = $(OPENCL_DIR)/OpenCL/common/lib -lOpenCL

# AMD_DIR

# OPENCL_DIR = /if10/kw5na/Packages/AMD-APP-SDK-v2.8-RC-lnx64
# OPENCL_INC = $(OPENCL_DIR)/include/
# OPENCL_LIB = $(OPENCL_DIR)/lib/x86_64/ -lOpenCL
#ifeq ($(shell uname -m), x86_64)
#     ifeq ($(shell if test -d $(OPENCL_DIR)/lib/x86_64/; then echo T; else echo F; fi), T)
#         OPENCL_LIB = $(OPENCL_DIR)/lib/x86_64/
#     endif
#endif
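
The path edits above can also be scripted instead of made by hand in vim. The following is a minimal sketch, assuming each variable appears as an uncommented "NAME = value" line at the start of a line, as in the listing above. set_make_var is a hypothetical helper name of my own, not something shipped with Rodinia.

```shell
#!/bin/sh
# Sketch only: set_make_var is a hypothetical helper, not part of Rodinia.
# It rewrites one uncommented "NAME = value" line in a make-style file.
# Commented lines ("#NAME = ...") and other variables are left untouched.
# Assumption: the value contains no "|" or "&" characters.
set_make_var() {
    _file=$1
    _name=$2
    _value=$3
    _tmp=$(mktemp) || return 1
    if sed "s|^${_name}[[:space:]]*=.*|${_name} = ${_value}|" "$_file" > "$_tmp"; then
        # Write back through cat so the original file keeps its permissions.
        cat "$_tmp" > "$_file"
    fi
    rm -f "$_tmp"
}
```

From the rodinia_3.1 root it would be called, for example, as set_make_var common/make.config CUDA_DIR /usr/local/cuda-10.0 (the path is the one used in this article; substitute your own).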

2. Then modify the common/common.mk file and change SM_VERSIONS to sm_60

.........
.SUFFIXES : .cu .cu_dbg_o .c_dbg_o .cpp_dbg_o .cu_rel_o .c_rel_o .cpp_rel_o .cubin

# Add new SM Versions here as devices with new Compute Capability are released
# SM_VERSIONS := sm_10 sm_11 sm_12 sm_13
SM_VERSIONS := sm_60

CUDA_INSTALL_PATH ?= /usr/local/cuda-10.0
.........

3. Modify cuda/dwt2d/Makefile, cuda/cfd/Makefile, cuda/particlefilter/Makefile, cuda/lavaMD/makefile, and cuda/hybridsort/Makefile, changing every occurrence of sm_20 and compute_20 from 20 to 60. (The number indicates the GPU's compute capability, which can be looked up at https://developer.nvidia.com/cuda-gpus#compute . Although my GeForce GTX 1070 Ti has compute capability 6.1, building with sm_60 also compiles successfully.)
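
Editing five makefiles by hand is error-prone, so the substitution in step 3 can be sketched as a small shell function. This is only a sketch: bump_arch is a hypothetical helper name, and it blindly replaces every sm_20 and compute_20 string in the files it is given, so it is worth reviewing the result (for example with grep) before building.

```shell
#!/bin/sh
# Sketch only: bump_arch is a hypothetical helper, not part of Rodinia.
# Usage: bump_arch NEW_VERSION FILE...
# Replaces sm_20 -> sm_NEW and compute_20 -> compute_NEW in each file.
bump_arch() {
    _new=$1
    shift
    for _mk in "$@"; do
        if [ ! -f "$_mk" ]; then
            echo "skip (not found): $_mk" >&2
            continue
        fi
        _tmp=$(mktemp) || return 1
        if sed -e "s/sm_20/sm_${_new}/g" \
               -e "s/compute_20/compute_${_new}/g" "$_mk" > "$_tmp"; then
            # Write back through cat so the file keeps its permissions.
            cat "$_tmp" > "$_mk"
        fi
        rm -f "$_tmp"
    done
}
```

From the rodinia_3.1 root, the five files above would be handled with bump_arch 60 cuda/dwt2d/Makefile cuda/cfd/Makefile cuda/particlefilter/Makefile cuda/lavaMD/makefile cuda/hybridsort/Makefile.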

4. Modify the cuda/mummergpu/src/suffix-tree.cpp file and add the line #include <unistd.h>
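
Step 4 can be scripted as well. The sketch below prepends the include to the top of the file, which is my assumption about placement (the only requirement is that it appears before the code that needs it). add_include is a hypothetical helper name; it does nothing if the include is already present, so it is safe to run twice.

```shell
#!/bin/sh
# Sketch only: add_include is a hypothetical helper, not part of Rodinia.
# Usage: add_include SOURCE_FILE HEADER   (e.g. HEADER = unistd.h)
add_include() {
    _src=$1
    _header=$2
    # Skip the edit if the exact include line already exists.
    if grep -qF "#include <${_header}>" "$_src"; then
        return 0
    fi
    _tmp=$(mktemp) || return 1
    # New include first, then the original file contents.
    if { printf '#include <%s>\n' "$_header"; cat "$_src"; } > "$_tmp"; then
        cat "$_tmp" > "$_src"
    fi
    rm -f "$_tmp"
}
```

For this step the call would be add_include cuda/mummergpu/src/suffix-tree.cpp unistd.h, run from the rodinia_3.1 root.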

5. Replace the contents of the openmp/mummergpu/src directory with the files from https://github.com/rmtheis/mummergpu/tree/master/mummergpu-2.0/src (copy them over directly)

6. Modify the opencl/cfd/euler3d.cpp file and comment out the if (file==NULL) branch at lines 276~278 (with g++ 7, which defaults to C++14, comparing a std::ifstream with NULL no longer compiles)

        ........
        //float* normals;
        {
            std::ifstream file(data_file_name);
            // if(file==NULL){
            //     throw(string("can not find/open file!"));
            // }
            file >> nel;
        .........
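
The same edit can be sketched with sed. comment_lines is a hypothetical helper name, and the line range 276 to 278 is taken from my copy of the file, so it is worth printing those lines first (sed -n '276,278p' opencl/cfd/euler3d.cpp) to confirm they really are the if (file==NULL) branch. The helper is not idempotent: running it twice adds a second // prefix.

```shell
#!/bin/sh
# Sketch only: comment_lines is a hypothetical helper, not part of Rodinia.
# Usage: comment_lines FILE FIRST_LINE LAST_LINE
# Prefixes every line in the inclusive range with a C++ line comment.
comment_lines() {
    _src=$1
    _first=$2
    _last=$3
    _tmp=$(mktemp) || return 1
    if sed "${_first},${_last}s|^|// |" "$_src" > "$_tmp"; then
        # Write back through cat so the file keeps its permissions.
        cat "$_tmp" > "$_src"
    fi
    rm -f "$_tmp"
}
```

For step 6 the call would be comment_lines opencl/cfd/euler3d.cpp 276 278.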

7. Copy the common headers from the CUDA samples into both SDK include directories

root@sundata:/data/szc/rodinia_3.1# cp -r /usr/local/cuda-10.0/samples/common/inc/* /root/NVIDIA_GPU_Computing_SDK/OpenCL/common/inc/

root@sundata:/data/szc/rodinia_3.1# cp -r /usr/local/cuda-10.0/samples/common/inc/* /root/NVIDIA_GPU_Computing_SDK/C/common/inc/

Compile

Compile submodule

Some object files have to be compiled manually before running the top-level make

1. Under the cuda/kmeans directory

root@sundata:/data/szc/rodinia_3.1/cuda/kmeans# gcc -g -fopenmp -O2  -c kmeans.c

2. Under the cuda/leukocyte/meschach_lib directory

root@sundata:/data/szc/rodinia_3.1/cuda/leukocyte/meschach_lib# cc -c -O -DHAVE_CONFIG_H *.c

3. Under the cuda/leukocyte/CUDA directory

root@sundata:/data/szc/rodinia_3.1/cuda/leukocyte/CUDA# gcc   -g -O3 -Wall -I../meschach_lib *.c -c

4. Under the openmp/leukocyte/OpenMP directory

root@sundata:/data/szc/rodinia_3.1/openmp/leukocyte/OpenMP# cc -c -O -DHAVE_CONFIG_H *.c -I../meschach_lib

5. Under the opencl/leukocyte/OpenCL directory

root@sundata:/data/szc/rodinia_3.1/opencl/leukocyte/OpenCL# cc -c -O -DHAVE_CONFIG_H *.c -I ../../../openmp/leukocyte/meschach_lib -I /usr/local/cuda-10.0/include
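
The five manual compile steps above can be collected into one script so they are easy to repeat after a clean checkout. This is a sketch under stated assumptions: RODINIA_ROOT is a hypothetical environment variable that defaults to my install path, run_in is a hypothetical helper, and the compiler commands are copied from the steps above unchanged.

```shell
#!/bin/sh
# Sketch only: RODINIA_ROOT and run_in are names introduced here for
# illustration; the default path is the one used in this article.
ROOT=${RODINIA_ROOT:-/data/szc/rodinia_3.1}

# run_in DIR CMD...: run CMD inside DIR in a subshell so the caller's
# working directory is unchanged, and report which directory failed.
run_in() {
    _dir=$1
    shift
    ( cd "$_dir" && "$@" ) || { echo "FAILED in $_dir" >&2; return 1; }
}

# The commands are wrapped in sh -c so that *.c expands inside each
# target directory rather than in the directory the script starts from.
if [ -d "$ROOT" ]; then
    run_in "$ROOT/cuda/kmeans" \
        sh -c 'gcc -g -fopenmp -O2 -c kmeans.c' &&
    run_in "$ROOT/cuda/leukocyte/meschach_lib" \
        sh -c 'cc -c -O -DHAVE_CONFIG_H *.c' &&
    run_in "$ROOT/cuda/leukocyte/CUDA" \
        sh -c 'gcc -g -O3 -Wall -I../meschach_lib *.c -c' &&
    run_in "$ROOT/openmp/leukocyte/OpenMP" \
        sh -c 'cc -c -O -DHAVE_CONFIG_H *.c -I../meschach_lib' &&
    run_in "$ROOT/opencl/leukocyte/OpenCL" \
        sh -c 'cc -c -O -DHAVE_CONFIG_H *.c -I ../../../openmp/leukocyte/meschach_lib -I /usr/local/cuda-10.0/include'
fi
```

The && chain stops at the first directory that fails, which makes it easier to see where a missing header or compiler error first appears.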

Compile the project

6. Finally, go back to the project root directory and make

root@sundata:/data/szc/rodinia_3.1# make
......
make[1]: Leaving directory '/data/szc/rodinia_3.1/opencl/dwt2d'

root@sundata:/data/szc/rodinia_3.1#

As long as no error is reported at the end, the build succeeded. If errors remain, modify the makefile or source file, or compile manually, depending on the situation.

Error handling

1) If the error is similar to the following

nvcc fatal   : Value 'compute_20' is not defined for option 'gpu-architecture'

Modify the makefile in the corresponding directory and change XX_20 to XX_60 (or whatever value matches your GPU's compute capability). The error appears because CUDA 9.0 and later no longer support compute capability 2.x.

2) If it is a code error in a source file, modify that source file, for example by adding header files, commenting out code, or changing types.

3) If an XXX.o file cannot be found, enter the corresponding directory and compile it yourself. The command can be based on the most recent compile command printed above the error message, but if that command compiles a .h file, change it to the corresponding .c file. If a header file cannot be found during compilation, locate the header and add its directory to the compile command with the -I option, like this

root@sundata:/data/szc/rodinia_3.1/openmp/leukocyte/OpenMP# cc -c -O -DHAVE_CONFIG_H *.c -I/data/szc/rodinia_3.1/openmp/leukocyte/meschach_lib
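
For case 1) above, it can save time to list every makefile that still mentions the old architecture before running make again. A minimal sketch, assuming the makefiles are named Makefile or makefile as in the directories listed earlier; find_old_arch is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: find_old_arch is a hypothetical helper, not part of Rodinia.
# Usage: find_old_arch [DIR]
# Prints the path of every Makefile/makefile under DIR (default: current
# directory) that still contains sm_20 or compute_20.
find_old_arch() {
    find "${1:-.}" -type f \( -name Makefile -o -name makefile \) \
        -exec grep -lE '(sm|compute)_20' {} +
}
```

Running find_old_arch cuda from the rodinia_3.1 root prints nothing once every CUDA makefile has been updated. Rely on the printed list rather than the exit status, which can be non-zero simply because no file matched.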

View Results

Finally, take a look at the executables generated in the bin directory

Compilation is now complete
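
To see at a glance how many executables each programming model produced, a small loop over the bin directory is enough. This sketch assumes the layout bin/linux/cuda, bin/linux/opencl, and bin/linux/omp used by the run commands in this article; list_built is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: list_built is a hypothetical helper, not part of Rodinia.
# Usage: list_built [BIN_DIR]   (default: bin/linux)
# Prints one "model: count" line per programming model.
list_built() {
    _bindir=${1:-bin/linux}
    for _model in cuda opencl omp; do
        _count=0
        for _exe in "$_bindir/$_model"/*; do
            # Count regular files with the execute bit set; an unmatched
            # glob stays literal and is rejected by the -f test.
            if [ -f "$_exe" ] && [ -x "$_exe" ]; then
                _count=$((_count + 1))
            fi
        done
        echo "$_model: $_count"
    done
}
```

A model with a count of 0 usually means its builds failed, and the error-handling cases above are the place to start.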

Run

Take kmeans as an example; other examples may have problems

1. Enter the data/kmeans directory

root@sundata:/data/szc/rodinia_3.1# cd data/kmeans/

2. Copy the kmeans.cl file into this directory

root@sundata:/data/szc/rodinia_3.1/data/kmeans# cp /data/szc/rodinia_3.1/opencl/kmeans/kmeans.cl .

3. Run the test sample

root@sundata:/data/szc/rodinia_3.1/data/kmeans# ../../bin/linux/opencl/kmeans -i 204800.txt
WG size of kernel_swap = 256, WG size of kernel_kmeans = 256

I/O completed

Number of objects: 204800
Number of features: 34
iterated 19 times
Number of Iteration: 1
root@sundata:/data/szc/rodinia_3.1/data/kmeans# ../../bin/linux/cuda/kmeans -i 204800.txt
I/O completed

Number of objects: 204800
Number of features: 34
iterated 2 times
Number of Iteration: 1
root@sundata:/data/szc/rodinia_3.1/data/kmeans# ../../bin/linux/omp/kmeans -i 204800.txt
I/O completed

num of threads = 1
number of Clusters 5
number of Attributes 34

Time for process: 0.491860
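
The three runs above can be wrapped in a loop so that a missing or failing build does not stop the others. A minimal sketch: run_kmeans_all is a hypothetical helper name, and it only uses the -i option shown in the commands above.

```shell
#!/bin/sh
# Sketch only: run_kmeans_all is a hypothetical helper, not part of Rodinia.
# Usage: run_kmeans_all BIN_DIR INPUT_FILE
# Runs the opencl, cuda, and omp builds of kmeans on the same input,
# skipping any build whose executable is missing.
run_kmeans_all() {
    _bindir=$1
    _input=$2
    for _model in opencl cuda omp; do
        _exe="$_bindir/$_model/kmeans"
        if [ -x "$_exe" ]; then
            echo "== $_model =="
            "$_exe" -i "$_input" || echo "$_model: exited with status $?" >&2
        else
            echo "== $_model: $_exe not found, skipping ==" >&2
        fi
    done
}
```

From the data/kmeans directory it would be called as run_kmeans_all ../../bin/linux 204800.txt. The OpenCL build still needs kmeans.cl in the current directory, as in step 2.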


Error handling

If some test samples under bin fail to run, or the executable cannot be found at all (in my bin directory, for example, there is no b+tree), a feasible solution is to go to the directory of the programming model you want to run (such as opencl), enter the application you want to test, and compile and run it there. For example, opencl/bfs can be built and run directly this way.

However, opencl/b+tree fails immediately, apparently because of a memory overflow.

The cuda version runs normally, though; just change sm_20 in its Makefile to sm_60 before compiling

root@sundata:/data/szc/rodinia_3.1/cuda/b+tree# vim Makefile

root@sundata:/data/szc/rodinia_3.1/cuda/b+tree# make KERNEL_DIM="-DRD_WG_SIZE_0=256"


Conclusion

In short, compiling and running rodinia is a very cumbersome process. Many problems come up along the way, and each has to be solved according to the actual situation.

Finally, I am sharing my build results. Many of the executables can be run directly:

Link: https://pan.baidu.com/s/1a8x9M_tRN9wqeDU627xkAw
Extraction code: cgs4

If you have any questions, you can discuss them in the comment section, and I will reply as soon as I see them.
