TensorFlow 2.2 Object Detection API on Windows 10: installation pitfalls, and fixing runtime errors when multiple CUDA versions are installed

TensorFlow 2.2 Object Detection API

The Unknown error raised by the TensorFlow 2.2 Object Detection API was the worst pitfall of all; it took me nearly 8 hours to find the cause. Below is my record of the pitfalls I hit. You can follow it step by step to get the demo running.

For the installation process, follow the guide linked here. The points that need attention were marked in yellow in the original post:

  • Do all operations and installations inside a single virtual environment. My virtual environment is named tensorflow, and tf2.2.0 is installed in it.
    At the beginning I installed things blindly: I put tf1.5.0 in the base environment and then ran the code in the tensorflow environment, and I hit a lot of pitfalls. Try to install everything in one environment.

First, open the Anaconda Prompt and activate the environment:

activate tf2

(tf) C:\Users\LR18813040244>
  • (6) Install the TensorFlow Object Detection API
# From within TensorFlow/models/research/
cp object_detection/packages/tf2/setup.py .
python -m pip install .

The Windows 10 command prompt has no cp command, so running the two commands above produces an error. Instead, change to the directory that contains setup.py and run the installation from there.

cd E:\model_examples\my_models\research\object_detection\packages\tf2
python setup.py install
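As an alternative to changing directories, the copy step that cp performs can be sketched in Python, which works the same way on Windows. This is a minimal sketch, not part of the official instructions: the helper name copy_setup_py is hypothetical, and it assumes the research directory has the layout used by the commands above.

```python
import shutil
from pathlib import Path


def copy_setup_py(research_dir):
    """Copy object_detection/packages/tf2/setup.py into research_dir.

    Hypothetical helper: it does what `cp object_detection/packages/tf2/setup.py .`
    does when run from the research directory.
    """
    research_dir = Path(research_dir)
    source = research_dir / "object_detection" / "packages" / "tf2" / "setup.py"
    if not source.is_file():
        raise FileNotFoundError(f"setup.py not found at {source}")
    # shutil.copy returns the path of the newly created file.
    return Path(shutil.copy(source, research_dir / "setup.py"))
```

After calling copy_setup_py with the path to the research directory, python -m pip install . can be run from that directory as in the original instructions.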

Some dependencies could not be downloaded. The installation was stuck there all night and into the next day, which puzzled me. In the end I downloaded them manually.

  • (7) Check whether the installation succeeded (it is fine if this test fails)
# From within TensorFlow/models/research/
python object_detection/builders/model_builder_tf2_test.py

This test failed for me. The error said that the official module was missing, but the official package requires tf >= 2.4.0. When I saw it start downloading tf 2.4.0 for me, I stopped it immediately, because it would have uninstalled tf 2.2.0, and then CUDA and cuDNN could no longer be used. The matching relationship given on the official website only goes up to tf 2.3.0.
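The matching relationship between TensorFlow, CUDA and cuDNN can be written down as a small lookup table. This is a minimal sketch: the helper name required_cuda_cudnn is hypothetical, and the values are the tested GPU build configurations for tf 2.0 through 2.3 as I understand them, so check them against the official table before relying on them.

```python
# Assumed values from TensorFlow's tested GPU build configurations
# (tf major.minor -> (CUDA version, cuDNN version)). Verify before use.
TESTED_GPU_BUILDS = {
    "2.0": ("10.0", "7.4"),
    "2.1": ("10.1", "7.6"),
    "2.2": ("10.1", "7.6"),
    "2.3": ("10.1", "7.6"),
}


def required_cuda_cudnn(tf_version):
    """Return (cuda, cudnn) for a tf version string, or None if not in the table."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return TESTED_GPU_BUILDS.get(major_minor)
```

For tf 2.2.0 this gives CUDA 10.1 and cuDNN 7.6, which is why letting the installer swap in a newer tf would break a setup built around those versions.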

  • 2. Test with your own case
    There is a big pitfall here: running the code from the official website as-is is likely to fail. The steps are described in detail below. You can follow them on their own, without reading the part above.

1. Open the Anaconda Prompt and activate the tensorflow environment

2. Change to the directory that contains object_detection_tutorial.ipynb


3. Open with jupyter notebook

(tensorflow) D:\model_examples\models\research\object_detection\colab_tutorials>jupyter notebook

4. There is no need to run every cell; start running from here

5. If you get the error No module named 'object_detection', add !pip install tensorflow-object-detection-api before the imports, as follows:

!pip install tensorflow-object-detection-api
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
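Before adding the install line, it can help to check whether the module is visible to the notebook's kernel at all. The sketch below uses only the standard library; the helper name is_importable is hypothetical.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # False here means the kernel's environment cannot see the package.
    print("object_detection importable:", is_importable("object_detection"))
```

If this prints False inside the notebook but the package was installed from the Anaconda Prompt, the notebook kernel is probably running in a different environment from the one where the installation was done.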

5. If the kernel dies and restarts immediately, changing the relative paths in the code to absolute paths solves the problem.
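The change from relative to absolute paths can be sketched as follows. The base directory is the one shown in the prompt in step 3; the helper name to_absolute and the example relative path are hypothetical, so substitute the paths that actually appear in your notebook.

```python
from pathlib import Path

# Assumption: the research directory from the prompt shown in step 3.
RESEARCH_DIR = r"D:\model_examples\models\research"


def to_absolute(relative_path, base_dir=RESEARCH_DIR):
    """Join a relative path onto base_dir and return the result as a string."""
    return str(Path(base_dir) / relative_path)


# Hypothetical example: a label-map path given relative to the research directory.
PATH_TO_LABELS = to_absolute("object_detection/data/mscoco_label_map.pbtxt")
```

Using one base directory for every path keeps the notebook independent of the directory from which jupyter notebook was started.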

6. If the last cell reports Unknown errors, the most likely cause is that CUDA and cuDNN do not match. Look at the messages printed in the Anaconda Prompt. If you see
Loaded runtime CuDNN library: 7.5.1 but source was compiled with: 7.6.5
or
Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
it basically means that CUDA and cuDNN do not match.
In fact, what I had installed fully met the requirements: CUDA 10.1 and cuDNN 7.6.5. The problem was that I had two CUDA versions installed. I had installed CUDA v10.0 earlier, so the environment variables for v10.0 were listed first. To switch to v10.1 and make the computer use it, the environment variables for v10.1 need to be moved to the top. For the detailed operation, please look here

After this change, CUDA and cuDNN match each other and the program runs.
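To see which CUDA version comes first, the PATH entries can be listed in order. This is a minimal sketch that assumes the CUDA directories on PATH contain a component such as CUDA\v10.1, as in the default Windows install location; the helper name cuda_versions_on_path is hypothetical.

```python
import os
import re

# Assumption: CUDA entries look like ...\CUDA\v10.1\bin (default install layout).
_CUDA_VERSION = re.compile(r"CUDA[\\/]v(\d+\.\d+)", re.IGNORECASE)


def cuda_versions_on_path(path_value=None, sep=os.pathsep):
    """Return (version, entry) pairs for CUDA directories, in PATH order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    found = []
    for entry in path_value.split(sep):
        match = _CUDA_VERSION.search(entry)
        if match:
            found.append((match.group(1), entry))
    return found


if __name__ == "__main__":
    for version, entry in cuda_versions_on_path():
        print(version, entry)
```

If the first version printed is not 10.1, the v10.1 entries still sit below the v10.0 ones and need to be moved up. Restart the Anaconda Prompt after editing the environment variables so that the new order takes effect.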

7. The masking_model.output_shapes line in the instance segmentation part raises an error. Change it to the following code to solve the problem:

# Original line, which raises an error:
# masking_model.output_shapes
# Replacement:
masking_model.signatures['serving_default'].output_shapes


Origin blog.csdn.net/weixin_44823313/article/details/113115245