Project development tutorials and common problems and solutions

Table of contents

1. Building a Python development environment

1. Install cuda cudnn (required for getting started with deep learning)

(1) Windows installation method

(2) Ubuntu18.04 installation method

2. Install Python (Anaconda is recommended)

(1) How to install python on Windows

(2) How to install python on Ubuntu 18.04

(3) Python development tool (IDE): Pycharm is strongly recommended

3. Pytorch installation (required for getting started with deep learning)

4. Install project dependencies

5. Run the project Demo

6. Visualization tool: how to use TensorBoard

2. Common problems and solutions

(1) No module named problem

(2) The directory path is filled in incorrectly

(3) Chinese path problem (special attention)

(4) Picture format problem

(5) OSError: The paging file is too small to complete the operation


This is a basic tutorial for developing the projects published by AI eating big melon. For any source code downloaded from the public account [AI eating big melon], you can follow this post to set up the development environment.

For installation details, refer to each project's own installation tutorial. Beginners should first read the tutorial below and configure the development environment.

[Respect originality; please cite the source when reprinting] https://blog.csdn.net/guyuealian/article/details/129163343


1. Building a Python development environment

If the project uses a deep learning framework such as PyTorch or TensorFlow, you need to configure a GPU development environment.

1. Install cuda cudnn (required for getting started with deep learning)

Deep learning models are computationally expensive. Training on a CPU is very slow, so a GPU is used to accelerate the computation in parallel. Deep learning frameworks such as PyTorch and TensorFlow all support GPU training. Using a GPU requires an NVIDIA graphics card (for example a 1080 or a 2070), the matching graphics driver, and the CUDA and cuDNN libraries. CUDA is a parallel computing platform and programming model created by NVIDIA that greatly increases computing performance by harnessing the processing power of graphics processing units (GPUs). cuDNN (CUDA Deep Neural Network library) is NVIDIA's GPU acceleration library for deep neural networks.

(1) Windows installation method

(2) Ubuntu18.04 installation method

  • Refer to the installation tutorial: Install cuda and cudnn on ubuntu18.04
  • After installing the NVIDIA graphics driver, run nvidia-smi to see the highest CUDA version it supports. When downloading the CUDA installation package, choose a version that is less than or equal to that version.

 

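# Install the NVIDIA graphics driver:

# Install CUDA: in the installer, uncheck the [x] next to Driver so that the driver is not installed again
sudo sh cuda_12.0.0_525.60.13_linux.run

# Add the environment variables to ~/.bashrc
export PATH=/usr/local/cuda-12.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.0/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda-12.0

# Reload the environment variables; you can then run nvidia-smi to check the GPU status
source ~/.bashrc

# Install cuDNN: download the cuDNN archive that matches your CUDA version, extract it, and copy the files
sudo cp include/cudnn*.h /usr/local/cuda-12.0/include
sudo cp lib/libcudnn*  /usr/local/cuda-12.0/lib64/
sudo chmod a+r /usr/local/cuda-12.0/include/cudnn*.h /usr/local/cuda-12.0/lib64/libcudnn*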
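
The version rule above (the CUDA toolkit must not be newer than the highest version reported by nvidia-smi) can be sketched in Python. This is only an illustration: the helper names parse_version, toolkit_is_supported and cuda_environment are hypothetical, and the version strings are typed in by hand rather than read from the driver.

```python
import os
import shutil

def parse_version(text):
    """Turn a version string such as '12.0' into a tuple of ints, e.g. (12, 0)."""
    return tuple(int(part) for part in text.strip().split("."))

def toolkit_is_supported(driver_max_cuda, toolkit_version):
    """True if the toolkit version is <= the highest version shown by nvidia-smi."""
    return parse_version(toolkit_version) <= parse_version(driver_max_cuda)

def cuda_environment():
    """Report what the current shell exposes after ~/.bashrc has been reloaded."""
    return {
        "CUDA_HOME": os.environ.get("CUDA_HOME"),  # None if the variable is not set
        "nvcc": shutil.which("nvcc"),              # None if nvcc is not on PATH
    }

print(toolkit_is_supported("12.0", "11.8"))
print(cuda_environment())
```

If cuda_environment() shows None for both entries, the export lines were probably not added to ~/.bashrc, or the file has not been reloaded in the current terminal.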

2. Install Python (Anaconda is recommended)

(1) How to install python on Windows

(2) How to install python on Ubuntu 18.04

(3) Python development tool (IDE): Pycharm is strongly recommended

PyCharm is a Python IDE from JetBrains. It supports debugging, syntax highlighting, project management, code navigation, smart hints, auto-completion, unit testing, version control, and more.


3. Pytorch installation (required for getting started with deep learning)

PyTorch is an open-source Python machine learning library based on Torch.

Choose the version that matches your own environment. For example, if you installed CUDA 11.0, install the torch build made for it (cu110):

pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
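
After the installation, it is worth checking that PyTorch can actually see the GPU. The following is a minimal sketch; the function name torch_status is hypothetical, and the import is wrapped so that the check also runs on a machine where PyTorch is not installed yet.

```python
def torch_status():
    """Return basic facts about the local PyTorch install, or note that it is missing."""
    try:
        import torch  # imported lazily so the check itself never crashes
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "torch_version": torch.__version__,          # e.g. '1.7.1+cu110'
        "built_for_cuda": torch.version.cuda,        # None for CPU-only builds
        "gpu_available": torch.cuda.is_available(),  # False if the driver or CUDA is not usable
    }

print(torch_status())
```

If gpu_available is False although a GPU is present, the usual suspects are a CPU-only torch build or a torch build made for a CUDA version newer than the driver supports.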


4. Install project dependencies

The project's Python dependencies are installed the same way on Windows and Ubuntu. A project usually ships with a requirements.txt file that lists the Python packages it needs together with their version numbers. For example, a first entry of numpy==1.18.5 means that the project uses the numpy library at version 1.18.5. You can install that version with pip:

pip install numpy==1.18.5

# or

pip install numpy==1.18.5 -i https://pypi.tuna.tsinghua.edu.cn/simple

The URL after -i is the package index to download from. Downloads from the default index can be slow in mainland China, so you can use -i to point pip at a mirror and speed up the installation.

The other packages can be installed one by one in the same way, or all at once:

pip install -r requirements.txt

PS: Packages are generally backward compatible, so installing a version greater than or equal to the one listed in requirements.txt usually works. If a newer version causes errors, install the exact version listed.
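
To see how the installed packages compare with the versions pinned in requirements.txt, a small check like the following can help. It is a sketch under simplifying assumptions: the helper names parse_pins and check_pins are hypothetical, and only lines of the form name==version are handled.

```python
from importlib import metadata

def parse_pins(lines):
    """Extract {package: version} from 'name==version' lines; other lines are skipped."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins

def check_pins(pins):
    """Compare each pinned version with what is installed in this environment."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if installed == wanted:
            report[name] = "ok"
        else:
            report[name] = f"installed {installed}, pinned {wanted}"
    return report

print(check_pins(parse_pins(["numpy==1.18.5"])))
```

For a real project, pass the file itself, for example check_pins(parse_pins(open("requirements.txt"))), from the project root directory.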


5. Run the project Demo

The demo.py file that comes with a project generally accepts command-line arguments through argparse, so it can be run from a terminal.
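
For readers who have not used argparse, the sketch below shows roughly how such a script can define its options. It is not the project's actual demo.py: the option names are the ones used in the commands in this section, while the function name get_parser, the defaults and the help texts are assumptions made for illustration.

```python
import argparse

def get_parser():
    """Build a parser with the three options used in this tutorial's demo commands."""
    parser = argparse.ArgumentParser(description="Run detection on a folder of images")
    parser.add_argument("--image_dir", type=str, default="data/test_image",
                        help="directory of the test images")
    parser.add_argument("--weights", type=str,
                        default="data/model/yolov5s_640/weights/best.pt",
                        help="model file")
    parser.add_argument("--out_dir", type=str, default="runs/test-result",
                        help="where the results are saved")
    return parser

# Parsing an explicit list is equivalent to typing the same words in a terminal.
args = get_parser().parse_args(["--image_dir", "data/test_image", "--out_dir", "runs/demo"])
print(args.image_dir, args.weights, args.out_dir)
```

Options that are not given on the command line fall back to their defaults, which is why --weights still has a value here.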


A project generally also provides a Linux (bash) run script like the one below. If you are developing on Linux, you can copy and paste it into the terminal directly (note that you must first cd into the project root directory; otherwise the files cannot be found).

# Test images
image_dir='data/test_image' # directory of the test images
weights="data/model/yolov5s_640/weights/best.pt" # model file
out_dir="runs/test-result" # where the detection results are saved
python demo.py --image_dir $image_dir --weights $weights --out_dir $out_dir

  1. In the bash script above, $image_dir, $weights and $out_dir expand to the values of image_dir, weights and out_dir, much like assigning and then using variables in Python.
  2. The Windows Command Prompt (cmd) does not support this variable assignment syntax!
  3. If Git is installed on your Windows system, you can run the commands above unchanged in the Git Bash terminal.

If you are developing on Windows, remove the variables and pass the values directly:

# This form works in both Linux and Windows terminals, but the command is long and less readable
python demo.py --image_dir "data/test_image" --weights "data/model/yolov5s_640/weights/best.pt" --out_dir "runs/test-result"
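
Another way to avoid shell-specific variable syntax is to start the demo from a small Python launcher, which behaves the same on Windows and Linux. This is a sketch, not part of the project: the function name build_demo_command is hypothetical, and the paths are the example values used above.

```python
import subprocess
import sys

def build_demo_command(image_dir, weights, out_dir):
    """Return the demo.py command as a list of arguments (no shell quoting needed)."""
    return [
        sys.executable, "demo.py",  # the Python interpreter that runs this launcher
        "--image_dir", image_dir,
        "--weights", weights,
        "--out_dir", out_dir,
    ]

command = build_demo_command(
    image_dir="data/test_image",
    weights="data/model/yolov5s_640/weights/best.pt",
    out_dir="runs/test-result",
)
print(command)
# To actually launch it, run this file from the project root directory and uncomment:
# subprocess.run(command, check=True)
```

Because the arguments are passed as a list, no quoting rules of cmd, PowerShell or bash are involved.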



6. Visualization tool: how to use TensorBoard

  • Installation: use pip to install the packages tensorboard==2.5.0 and tensorboardX==2.1
  • How to use: enter the command in the terminal:
# Requires tensorboard==2.5.0 and tensorboardX==2.1
# Basic usage
tensorboard --logdir=path/to/log/
# For example
tensorboard --logdir=work_space/mobilenet_v2_1.0_CrossEntropyLoss_20230228174645/log

  • Open the address that TensorBoard prints (http://localhost:6006 by default) in a browser to see the visualization of the training process.
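
The log directory that TensorBoard reads is produced by the training code. As a rough sketch of how scalars end up there, assuming tensorboardX is installed: the helper names fake_loss_curve and write_scalars and the tag train/loss are made up for illustration, and the values are synthetic rather than real training results.

```python
import math

def fake_loss_curve(num_steps):
    """Made-up, smoothly decreasing loss values used only to have something to plot."""
    return [(step, math.exp(-step / 20.0)) for step in range(num_steps)]

def write_scalars(log_dir, points, tag="train/loss"):
    """Write (step, value) pairs as TensorBoard scalars; returns False if tensorboardX is missing."""
    try:
        from tensorboardX import SummaryWriter
    except ImportError:
        return False
    writer = SummaryWriter(log_dir)
    for step, value in points:
        writer.add_scalar(tag, value, step)  # (tag, scalar_value, global_step)
    writer.close()
    return True
```

For example, write_scalars("work_space/demo/log", fake_loss_curve(100)) followed by tensorboard --logdir=work_space/demo/log should show a single decreasing curve.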


2. Common problems and solutions

(1) No module named problem

  • If you get a "No module named ***" error, install the missing package with pip install ***. For example, for the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'basetrainer'

Please use pip to install:

pip install basetrainer -i https://pypi.tuna.tsinghua.edu.cn/simple

As before, the URL after -i selects a mirror to speed up the download.
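
To find all missing modules in one pass instead of fixing them one error at a time, a check like the following can be used. It is a sketch: the helper names find_missing and pip_commands are hypothetical, and the PIP_NAMES table is a small, incomplete list of well-known cases where the pip package name differs from the module name.

```python
import importlib.util

# Module name -> pip package name, for cases where the two differ (partial list).
PIP_NAMES = {"cv2": "opencv-python", "PIL": "Pillow", "yaml": "PyYAML", "sklearn": "scikit-learn"}

def find_missing(modules):
    """Return the top-level module names that cannot be imported in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]

def pip_commands(modules):
    """Suggest one 'pip install' command per missing module."""
    return [f"pip install {PIP_NAMES.get(name, name)}" for name in find_missing(modules)]

print(pip_commands(["numpy", "cv2", "basetrainer"]))
```

The list passed to pip_commands is only an example; replace it with the modules your project imports.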

(2) The directory path is filled in incorrectly

Beginners developing on Windows often get this wrong! When filling in a path, pay attention to the path separator, which differs between Windows and Linux.

Windows path separator: [\] (written as [\\] inside an ordinary Python string); Python also accepts [/] on Windows. So which one should you write? In most cases, use the [/] separator!

Linux (Ubuntu) path separator: only [/]
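
The difference can be illustrated with Python's standard pathlib module. This is a small sketch; the file name 0001.jpg is only an example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# In an ordinary string a backslash starts an escape sequence ("\t" is a tab),
# so write Windows paths as raw strings (r"...") or double the backslashes.
windows_style = r"data\test_image\0001.jpg"
assert windows_style == "data\\test_image\\0001.jpg"

# Forward slashes need no escaping, and Python accepts them on Windows too.
portable = PureWindowsPath(windows_style).as_posix()
print(portable)  # data/test_image/0001.jpg

# Joining with "/" avoids typing separators by hand.
joined = PurePosixPath("data") / "test_image" / "0001.jpg"
print(joined)    # data/test_image/0001.jpg
```

In project code, pathlib.Path or os.path.join picks the right separator for the current system automatically.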

(3) Chinese path problem (special attention)

On Windows, the project files and data directories should not be placed under a path that contains Chinese characters; otherwise problems occur, such as OpenCV failing to read images.
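
The simplest fix is to rename the directories. The sketch below shows how to detect such paths, plus a commonly used workaround for reading an image from a non-ASCII path. The helper names is_ascii_path and imread_unicode are hypothetical, the workaround is not something the projects are known to use, and it requires numpy and opencv-python.

```python
def is_ascii_path(path):
    """True if every character of the path is plain ASCII."""
    return str(path).isascii()

def imread_unicode(path):
    """Read an image from a path that may contain non-ASCII characters.

    NumPy reads the raw bytes and OpenCV decodes them from memory,
    so OpenCV never has to open the file by name.
    """
    import cv2          # requires opencv-python
    import numpy as np  # requires numpy
    data = np.fromfile(str(path), dtype=np.uint8)
    return cv2.imdecode(data, cv2.IMREAD_COLOR)

print(is_ascii_path("data/test_image/0001.jpg"))  # True
print(is_ascii_path("data/测试图片/0001.jpg"))      # False
```

Other libraries in a project may still fail on such paths, so keeping all paths ASCII-only remains the safer choice.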

(4) Picture format problem

By default, the project only supports the two formats *.jpg and *.png. If your pictures are bmp, tif or another format, add the corresponding pattern to the postfix parameter.

For example, the function is called with the parameter postfix=["*.jpg", "*.png"].

If you do not know where to change it, search the project for postfix.
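
To make the role of postfix concrete, here is a minimal sketch of how such a parameter can be used to collect image files. The function list_images is a hypothetical stand-in, not the project's own implementation.

```python
from pathlib import Path

def list_images(image_dir, postfix=("*.jpg", "*.png")):
    """Collect the files in image_dir that match any of the glob patterns in postfix."""
    files = []
    for pattern in postfix:
        files.extend(Path(image_dir).glob(pattern))
    return sorted(set(files))  # drop duplicates in case patterns overlap

# Adding "*.bmp" and "*.tif" makes those formats visible as well.
print(list_images("data/test_image", postfix=["*.jpg", "*.png", "*.bmp", "*.tif"]))
```

With the default postfix, a file such as c.bmp is silently skipped, which is why unsupported formats look as if the directory were empty.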


(5) OSError: The paging file is too small to complete the operation

If your computer does not have enough memory, errors like this one may occur during training.

Description of the problem: the system ran out of memory; on Windows this shows up as the paging file (virtual memory) being too small.

Solution: reduce the training parameters workers and batch-size, for example --batch-size=8 --workers=0

batch_size: 8

num_workers: 0
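
In PyTorch these two settings typically end up as arguments of the DataLoader. The following is a sketch with random data, assuming PyTorch is installed; the function name make_loader and the tensor shapes are made up for illustration.

```python
def make_loader(batch_size=8, num_workers=0):
    """Build a small DataLoader over random data; returns None if PyTorch is not installed."""
    try:
        import torch
        from torch.utils.data import DataLoader, TensorDataset
    except ImportError:
        return None
    images = torch.randn(32, 3, 64, 64)   # 32 fake RGB images
    labels = torch.randint(0, 2, (32,))   # 32 fake class labels
    dataset = TensorDataset(images, labels)
    # num_workers=0 loads data in the main process, so no extra worker processes
    # are started; a smaller batch_size lowers the memory needed per step.
    return DataLoader(dataset, batch_size=batch_size, num_workers=num_workers, shuffle=True)

loader = make_loader()
if loader is not None:
    print(loader.batch_size, loader.num_workers)
```

Training is slower with these values, so once it runs, you can raise them step by step until the error reappears and then go back one step.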
