Paddle and AFFormer environment configuration

This post documents the Paddle installation process again, mainly because the environment was not set up correctly when the server was first initialized.

Basic environment

Cloud disk deployment

Conda installation

Anaconda installation
First, download the installer:

sudo wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh

Once the download is complete, run the installer:

bash Anaconda3-2020.02-Linux-x86_64.sh

Install Anaconda
The installer first asks you to press ENTER to continue (Please, press ENTER to continue):

Keep pressing ENTER to page down through the license agreement. At the final prompt asking whether you accept the license terms, type yes

The installer then reports that Anaconda3 will now be installed into this location: /root/anaconda3, which is the default. Press ENTER to accept it.

Installation then starts. When it finishes, the installer asks whether Anaconda3 should be initialized; type yes

Once the installation has succeeded, reload the shell configuration:

source ~/.bashrc

From here on, create and activate an environment and install the required packages as usual.

Paddle installation

Create a conda environment

conda create -n paddle python=3.7
conda activate paddle
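
Before installing anything, it can be worth confirming that the new environment really is the active one. The following is only a sketch: the function name active_env_info is made up for illustration, and it relies on the CONDA_DEFAULT_ENV variable that conda sets when an environment is activated.

```python
import os
import sys


def active_env_info():
    """Return the interpreter and conda environment currently in use."""
    return {
        # conda sets CONDA_DEFAULT_ENV when an environment is activated
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "python": "{}.{}".format(sys.version_info[0], sys.version_info[1]),
        "prefix": sys.prefix,
    }


if __name__ == "__main__":
    for key, value in active_env_info().items():
        print(key, "=", value)
```

With the environment created above active, conda_env should read paddle and python should read 3.7.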

Install PyTorch

pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
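
After the install, a short check along these lines can confirm that the CUDA build of torch was picked up. This is a sketch under the assumption that it runs inside the paddle environment; torch_cuda_report is an illustrative name, not part of torch.

```python
def torch_cuda_report():
    """Summarize the installed torch build, or report why it cannot be used."""
    try:
        import torch
    except (ImportError, OSError) as exc:
        # OSError covers missing shared libraries at import time
        return {"installed": False, "error": str(exc)}
    return {
        "installed": True,
        "version": torch.__version__,           # e.g. 1.10.1+cu111
        "built_for_cuda": torch.version.cuda,   # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_cuda_report())
```

If cuda_available comes back False while built_for_cuda is set, the driver or CUDA runtime on the server is the more likely culprit than the pip install itself.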

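Install PaddlePaddle

python -m pip install paddlepaddle-gpu==2.4.2 -f https://www.paddlepaddle.org.cn/whl/windows/mkl/avx/stable.html

Note: the URL above is the Windows wheel index. On a Linux server pip finds no matching wheel there and falls back to PyPI, where the default paddlepaddle-gpu 2.4.2 build targets CUDA 10.2, which is probably why the libcudart.so.10.2 error below appears. As far as I know, the Paddle install guide lists a separate paddlepaddle-gpu==2.4.2.post112 build for CUDA 11.2, served from the linux/mkl/avx index.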
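
To verify the Paddle install, something like the following can be run inside the environment. It is a minimal sketch: check_paddle is a hypothetical helper name, while paddle.utils.run_check() is Paddle's own self-test.

```python
def check_paddle():
    """Import paddle and run its built-in self-test; return a short status."""
    try:
        import paddle
    except (ImportError, OSError) as exc:
        # A missing CUDA runtime library surfaces here as an import failure
        return "paddle import failed: {}".format(exc)
    paddle.utils.run_check()  # prints its own success or failure message
    return "paddle {} imported".format(paddle.__version__)


if __name__ == "__main__":
    print(check_paddle())
```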

Install the other dependencies

pip install -r requirements.txt --index-url https://pypi.douban.com/simple

Error:

ImportError: libcudart.so.10.2: cannot open shared object file: No such file or directory

Solution:
Download libcudart.so.10.2 from elsewhere and upload it to the server.

Then run the following command so that the dynamic linker rescans the CUDA library directory:

sudo ldconfig /usr/local/cuda-11.2/lib64
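
Whether the linker can now find the library can be tested without starting Paddle at all. The sketch below uses ctypes from the standard library; can_load is an illustrative helper name.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic linker can find and load the library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # libcudart.so.10.2 is the library named in the ImportError above
    print("libcudart.so.10.2 loadable:", can_load("libcudart.so.10.2"))
```

If this still prints False after ldconfig, the file is probably not in a directory the linker searches.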

Configuring the Paddle environment is not difficult in itself; the series of errors came from problems with the system that was preinstalled on the server.
Training process:

NOTE: Be sure to match this file.

AFFormer environment configuration

Environment: CUDA 11.2

Install PyTorch

pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

Install MMCV

pip install mmcv==2.0.0rc4 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10/index.html

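Install MMCV-full

Note that mmcv and mmcv-full both provide the same mmcv import package, so only one of them should remain installed in the environment. The errors further down report MMCV 1.7.1, which suggests the mmcv-full build is the one that ended up in use.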

pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10.0/index.html

Other dependencies

pip install timm
pip install opencv-python
pip install einops
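
With this many packages pinned against each other, it helps to list what actually ended up in the environment. The following sketch imports each package and reads its __version__ attribute; installed_versions is a made-up helper name, and the module list is just the packages installed above.

```python
import importlib


def installed_versions(module_names):
    """Map each module name to its __version__, or to None if import fails."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception:  # a broken install can raise more than ImportError
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    # cv2 is the import name of the opencv-python package
    names = ["torch", "mmcv", "timm", "cv2", "einops"]
    for name, version in installed_versions(names).items():
        print(name, version)
```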

Errors

Error 1:

ModuleNotFoundError: No module named 'mmseg'

Solution 1:

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple mmsegmentation

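This still throws an error, though:

from mmseg.apis import init_random_seed, set_random_seed, train_segmentor
ImportError: cannot import name 'init_random_seed' from 'mmseg.apis'

The installed version is wrong: the unpinned install pulls the latest mmsegmentation, which apparently no longer provides these functions in mmseg.apis. Pinning the version fixes it: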

pip install mmsegmentation==0.30.0

Error 2:

File "/home/ubuntu/anaconda3/envs/afformer/lib/python3.7/site-packages/mmseg/__init__.py", line 62, in <module>
f'MMCV=={mmcv.__version__} is used but incompatible. ' \
AssertionError: MMCV==1.7.1 is used but incompatible. Please install mmcv>=2.0.0rc4.

Solution 2: Modify the /home/ubuntu/anaconda3/envs/afformer/lib/python3.7/site-packages/mmseg/__init__.py file:

Error 3:

AttributeError: module 'torch.nn' has no attribute 'SiLU'

Solution 3: The class is missing because the installed torch is too old (torch.nn.SiLU was added in PyTorch 1.7). Switch to a newer torch version, such as the 1.10.0 build listed above.

Run

Change into the tools directory and run python train.py. It reports an error:

raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "EncoderDecoder: 'CLS is not in the models registry'"

This happens because the source code already contains its own mmseg package, and the mmsegmentation package we installed with pip has the same import name. Uninstall the pip-installed package (for example with pip uninstall mmsegmentation).
Then run:

python setup.py develop
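
To see which copy of mmseg Python will actually import, the module location can be looked up without importing it. This is a sketch using importlib from the standard library; module_origin is an illustrative name.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    print("mmseg resolves to:", module_origin("mmseg"))
```

After the steps above, the printed path should point into the source checkout rather than into site-packages; if it still points into site-packages, the pip-installed copy is shadowing the one in the source tree.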

The final installed environment is:

Origin blog.csdn.net/pengxiang1998/article/details/131095575