Fixing errors such as "THC/THC.h: No such file or directory" and "THCudaMalloc is undefined" caused by a PyTorch version mismatch

While installing the maskrcnn-benchmark dependencies listed in INSTALL.md, the build failed because my PyTorch version did not match what the project expects. Most existing posts suggest downgrading PyTorch, but that is not always an option: pinning an old version hurts compatibility with everything else I use (and, as an aside, PyTorch's own backward compatibility deserves a complaint too). This is a summary for reference when you run into similar problems.

The root cause is that many THC headers and functions were dropped over successive PyTorch releases, so many of the .cu files compiled while installing the dependencies no longer build.

This article uses Ubuntu 16.04.1, PyTorch 1.13.1, and CUDA 11.6. If later PyTorch releases require further changes, please refer to other articles.

Q1 fatal error: THC/THC.h: No such file or directory

The first problem I hit was that the header file could not be found. For this I followed the CSDN blog post "fatal error: THC/THC.h: No such file or directory" by o0stinger0o, written around March to April 2022, together with the updated GitHub code that the post mentions. In that code, the header include is deleted:

#include <THC/THC.h>

and every occurrence of

THCudaCheck(cudaGetLastError());

is replaced by

AT_CUDA_CHECK(cudaGetLastError());

Q2 "THCCeilDiv" is undefined

I hit this error next. After some investigation I found that current PyTorch no longer defines this function, so every call has to be replaced. Searching along those lines, I found the following snippet in the post "Faster RCNN pytorch 1.0 debugging (pitfall) notes" on Code Nongjia (codenong.com):

//dim3 grid(std::min(THCCeilDiv(**, 512L), 4096L));
  dim3 grid(std::min(((int)** + 512 -1) / 512, 4096));

The replacement works like this: wherever a .cu file calls THCCeilDiv(x, y), replace the call with the integer expression (x + y - 1) / y.
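Rewriting the expression by hand at every call site is error-prone, so the same arithmetic can be wrapped in a small helper. The following is only a minimal sketch in plain C++: the names ceil_div and num_blocks are hypothetical and do not come from maskrcnn-benchmark or PyTorch, and the defaults 512 and 4096 are simply the values from the snippet above. Recent PyTorch versions also appear to ship an equivalent at::ceil_div in <ATen/ceil_div.h>; check your installed headers before relying on it.

```cpp
#include <algorithm>
#include <cstdint>

// Integer ceiling division: the smallest q such that q * d >= n.
// Assumes n >= 0 and d > 0, which holds for element and thread counts.
template <typename T>
constexpr T ceil_div(T n, T d) {
  return (n + d - 1) / d;
}

// Hypothetical helper: number of blocks needed to cover `count` elements
// with `threads_per_block` threads per block, capped at `max_blocks`.
inline std::int64_t num_blocks(std::int64_t count,
                               std::int64_t threads_per_block = 512,
                               std::int64_t max_blocks = 4096) {
  return std::min(ceil_div(count, threads_per_block), max_blocks);
}
```

In a .cu file, a line such as dim3 grid(std::min(THCCeilDiv(n, 512L), 4096L)); would then become dim3 grid(static_cast<unsigned int>(num_blocks(n)));, where n stands for whatever element count the original call used. Keeping both arguments of ceil_div the same type also avoids the mixed int/long overloads of std::min that the manual rewrite has to work around with a cast.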

Q3 THCudaMalloc, THCudaFree, and THCState are undefined

These three errors are really one problem: newer PyTorch no longer provides THCudaMalloc and THCudaFree, and so it no longer needs a THCState to request memory either. Two references were useful here: the Sohu article on the PyTorch 1.11 release ("PyTorch 1.11 released, bringing two new libraries, TorchData and functorch") and the CSDN blog post by 小风风_hi on what has to change when upgrading a project environment from PyTorch 1.10 to 1.11. Item 2 of the latter solved Q3 for me. In short, the library was removed, so the functions that came from it have to be swapped for their replacements.

First, add the following header to every file that uses the malloc and free functions (you may need to replace an existing include of THCThrustAllocator.cuh with it, or simply add it directly):

#include <ATen/cuda/ThrustAllocator.h>

Three statements are involved: one obtains a THCState, one passes that state to the malloc function to allocate memory, and one releases that memory with the free function.

Comment out the following THCState line, because newer PyTorch does not need this type in order to allocate the memory:

THCState *state = at::globalContext().lazyInitCUDA(); // TODO replace with getTHCState

Then replace THCudaMalloc as follows (the second argument of the old function becomes the only argument of the new one):

//mask_dev = (unsigned long long*) THCudaMalloc(state, boxes_num * col_blocks * sizeof(unsigned long long));
mask_dev = (unsigned long long*) c10::cuda::CUDACachingAllocator::raw_alloc(boxes_num * col_blocks * sizeof(unsigned long long));

THCudaFree is replaced as follows:

// THCudaFree(state, mask_dev); 
c10::cuda::CUDACachingAllocator::raw_delete(mask_dev);

Note that ThrustAllocator.h must be included at the top of the file, before these calls are used; otherwise the compiler reports an error.
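One thing the manual alloc/free pair does not handle is an early exit: if anything between the two calls returns or throws, the buffer is never released. A possible way to guard against that is to tie the release to scope with std::unique_ptr. The sketch below is plain C++ and runs without a GPU; ScopedBuffer, make_scoped_buffer, host_alloc, and host_free are hypothetical names introduced only for illustration, and the host functions are stand-ins for the allocator calls above. It assumes the real raw_alloc and raw_delete have the shapes void*(size_t) and void(void*).

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Function-pointer shapes assumed for the allocator pair:
// one takes a byte count and returns a pointer, the other releases it.
using AllocFn = void* (*)(std::size_t);
using FreeFn = void (*)(void*);

// Owns a raw buffer of T and hands it to the stored FreeFn on scope exit.
template <typename T>
using ScopedBuffer = std::unique_ptr<T[], FreeFn>;

// Allocates room for `count` elements of T with `alloc_fn` and ties the
// matching `free_fn` to the lifetime of the returned object.
template <typename T>
ScopedBuffer<T> make_scoped_buffer(std::size_t count,
                                   AllocFn alloc_fn,
                                   FreeFn free_fn) {
  void* raw = alloc_fn(count * sizeof(T));
  return ScopedBuffer<T>(static_cast<T*>(raw), free_fn);
}

// Host-side stand-ins so the sketch runs anywhere. They also count live
// buffers to make the automatic release observable.
static int live_buffers = 0;

void* host_alloc(std::size_t nbytes) {
  ++live_buffers;
  return std::malloc(nbytes);
}

void host_free(void* ptr) {
  --live_buffers;
  std::free(ptr);
}
```

In the extension itself, the idea would be to pass c10::cuda::CUDACachingAllocator::raw_alloc and c10::cuda::CUDACachingAllocator::raw_delete in place of the two stand-ins, request boxes_num * col_blocks elements of unsigned long long, and hand the result of .get() to the kernel as mask_dev. The explicit raw_delete call then becomes unnecessary, since the buffer is released when the object goes out of scope.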

With that, all of the problems above are solved without downgrading PyTorch, and the extension I needed compiles successfully.

I also learned a bit about THC itself along the way; see the CSDN blog post "PyTorch source code analysis (2): THC" by Shao Zhengdao. THC is now a thing of the past, though, so a general understanding is enough. We still have to look forward.

I will update this post if anything new and relevant comes up.


Origin blog.csdn.net/code_zhao/article/details/129172817