[A Collection of Problems Encountered While Training YOLOX]

A series of bugs encountered in deep learning

VS Code cannot activate conda

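Open VS Code and choose File > Preferences > Settings from the top-left menu.
Click the small icon in the upper-right corner to open settings.json.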
Once settings.json is open, add the following line and restart the VS Code terminal; conda should then activate successfully.

"terminal.integrated.defaultProfile.windows": "Command Prompt"

1. VS Code reports an error when loading a web view

Error: Could not register service workers: InvalidStateError: Failed to regist

Solution
Close VS Code, press Win+R, type cmd, and run the following command, which resolves the problem:

code --no-sandbox

2. CUDA out of memory

CUDA out of memory. Tried to allocate 26.00 MiB (GPU 0; 8.00 GiB total capacity; 19.13 GiB already allocated; 0 bytes free; 19.15 GiB reserved in total by PyTorch)

Solution
The batch_size used for training is probably too large, so the GPU runs out of memory. Reducing batch_size resolves it.
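The idea of lowering batch_size until training fits in memory can be sketched as a small retry loop. This is only an illustration: `train_step` is a hypothetical callable standing in for one training run, not a YOLOX or PyTorch function, and the check relies on the assumption that the error text contains "out of memory", as it does in the message quoted above.

```python
def run_with_smaller_batches(train_step, batch_size, min_batch_size=1):
    """Call train_step(batch_size), halving batch_size after each OOM error.

    train_step is a hypothetical callable standing in for one training run;
    it is not part of YOLOX or PyTorch.
    """
    while batch_size >= min_batch_size:
        try:
            train_step(batch_size)
            return batch_size  # this batch size fit in memory
        except RuntimeError as err:
            # CUDA OOM surfaces as a RuntimeError whose message contains
            # "out of memory"; any other error is re-raised unchanged.
            if "out of memory" not in str(err):
                raise
            batch_size //= 2
    raise RuntimeError("out of memory even at the minimum batch size")
```

In practice it is usually simpler to pick a smaller fixed batch size by hand; the loop only makes the trial-and-error explicit.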

3. The txt files in the VOC2007 dataset

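train.txt lists the file names of the training images (training set).
val.txt lists the file names of the validation images (validation set).
trainval.txt lists the file names of the training and validation images combined.
test.txt lists the file names of the test images (test set).
train is the list the model is trained on, while val is the list it is evaluated on during training. val does not affect the training itself. During training you can obtain the error rate on both train and val, plot learning curves from them, and use those curves to spot problems with the model and adjust the network's parameters accordingly. test is used to evaluate the model once training has finished.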
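As a quick sanity check on these lists, the sketch below reads the split files and verifies that trainval.txt is the union of train.txt and val.txt. It assumes one image ID per line and that all three files sit in the same directory (in the standard VOC layout this is ImageSets/Main); the function names are made up for this example.

```python
from pathlib import Path


def read_ids(split_file):
    """Return the image IDs listed in a split file, one ID per line."""
    lines = Path(split_file).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def check_splits(main_dir):
    """Compare train.txt, val.txt and trainval.txt found in main_dir."""
    main_dir = Path(main_dir)
    train = set(read_ids(main_dir / "train.txt"))
    val = set(read_ids(main_dir / "val.txt"))
    trainval = set(read_ids(main_dir / "trainval.txt"))
    return {
        # trainval should contain exactly the train and val IDs
        "trainval_is_union": trainval == (train | val),
        # an image appearing in both train and val leaks validation data
        "train_val_overlap": sorted(train & val),
    }
```

A non-empty `train_val_overlap` would mean the validation error is measured partly on images the model was trained on, which makes the learning curves misleading.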

4. object has no attribute 'cache'

In yolox/data/datasets/voc.py, around line 190, change

@cache_read_img
def read_img(self, index, use_cache=True):

to

@cache_read_img(use_cache=True)
def read_img(self, index):

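In the same file, yolox/data/datasets/voc.py, also change

(self._imgpath % self.ids[i]).split(self.root + "/")

to the following (this applies on Windows, where paths are joined with backslashes):

(self._imgpath % self.ids[i]).split(self.root + "\\")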
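Hard-coding "/" or "\\" ties the code to one operating system. A portable alternative, sketched below, is to let the standard library compute the path relative to the dataset root. This is not the YOLOX implementation; `relative_to_root` and its `pathmod` parameter are hypothetical names, and `pathmod` exists only so that both path flavours can be demonstrated on any machine.

```python
import ntpath
import os
import posixpath


def relative_to_root(img_path, root, pathmod=os.path):
    """Return img_path relative to root, using the given path flavour.

    pathmod defaults to os.path, which follows the current OS; pass
    ntpath or posixpath to force Windows or POSIX rules.
    """
    return pathmod.relpath(img_path, root)


# Windows-style paths (backslash separator)
win_rel = relative_to_root(
    r"C:\data\VOCdevkit\VOC2007\JPEGImages\000001.jpg",
    r"C:\data\VOCdevkit",
    ntpath,
)

# POSIX-style paths (forward-slash separator)
posix_rel = relative_to_root(
    "/data/VOCdevkit/VOC2007/JPEGImages/000001.jpg",
    "/data/VOCdevkit",
    posixpath,
)
```

With the default `pathmod=os.path`, the same call works unchanged on both Windows and Linux.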

5. KeyError: 'model'

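The expected weights cannot be found in the loaded file: the checkpoint has no 'model' entry, which usually points to a wrong or incompatible weight file. Switching to a suitable weight file (.pth) resolves it.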
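To see what a weight file actually contains before training, a small helper like the one below can be useful. It is a sketch under the assumption that the checkpoint is a plain dict that either holds the weights under a "model" key or is itself a bare state dict; `extract_state_dict` is a hypothetical name, not a YOLOX or PyTorch function.

```python
def extract_state_dict(ckpt):
    """Return the model weights from an already loaded checkpoint dict.

    Assumes the checkpoint is either a dict with a "model" entry or a
    bare state dict; this layout is an assumption for illustration.
    """
    if not isinstance(ckpt, dict):
        raise TypeError(f"expected a dict checkpoint, got {type(ckpt).__name__}")
    if "model" in ckpt:
        return ckpt["model"]
    # No "model" key: show what is there instead of failing with a bare KeyError.
    print(f"checkpoint has no 'model' key; first keys: {list(ckpt)[:5]}")
    return ckpt
```

The dict itself would typically come from `torch.load(path, map_location="cpu")`. If the printed keys look like layer names, the file is a bare state dict; if they look unrelated to the model, a different weight file is needed.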

6. No module named loguru

Activate the environment and run

pip install loguru -i https://pypi.tuna.tsinghua.edu.cn/simple

7. Python AttributeError: module 'distutils' has no attribute 'version'

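Upgrading torch is not recommended here: the upgraded torch may no longer match the versions of the other packages in the environment, and the default upgrade command installs the CPU build of torch.
**Solution:** Activate the virtual environment configured in Anaconda, then run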

pip install setuptools==59.5.0

Pinning the version this way resolves the bug.

8. No module named 'scipy'

pip install scipy

9. Installing h5py==2.10.0 with Anaconda

conda uninstall h5py
conda install h5py==2.10.0
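After pinning packages such as setuptools==59.5.0 and h5py==2.10.0, it can help to confirm that the active environment really has those versions. The sketch below uses the standard-library `importlib.metadata`; `PINS` and `find_mismatches` are names invented for this example, and the `get_version` parameter exists only so the lookup can be replaced in tests.

```python
from importlib import metadata

# Pins taken from the sections above.
PINS = {"setuptools": "59.5.0", "h5py": "2.10.0"}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every pin not met."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches
```

Calling `find_mismatches(PINS)` inside the activated environment returns an empty dict when both pins are satisfied.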

This post will be updated as further problems come up.