YOLOv8 Training

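Contents

0. Environment
1. ultralytics\cfg\datasets\VOC.yaml dataset configuration
2. ultralytics\cfg\models\v8\yolov8.yaml network structure configuration
3. ultralytics\cfg\default.yaml training configuration
4. Converting VOC image annotations to YOLO format
5. Training
6. Bugs
- 6.1 box_loss and cls_loss are NaN during training, and P, R, and mAP are all 0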

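0. Environment

CUDA environment
Download the NVIDIA graphics driver

Search by your GPU model to find a suitable driver, then download and install it.

After installation, run nvidia-smi in a cmd window; the CUDA version shown in the top-right corner is the highest CUDA version you can install.
- (Alternatively, press Win+S and search for NVIDIA Control Panel -> Help in the top-left menu -> System Information -> Components in the top-left -> under 3D Settings, the CUDA version in the product name of NVCUDA64.dll is the highest CUDA version you can install.)
Download and install a suitable version of CUDA
Download and install a suitable version of cuDNN
After installing CUDA and cuDNN, run nvcc -V or nvcc --version to check and confirm the installed CUDA version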
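
If you would rather read the supported CUDA version programmatically than from the top-right corner of the nvidia-smi table, the lookup can be sketched as below. This is a minimal sketch, assuming the header contains the text "CUDA Version: x.y"; the function name max_cuda_version and the sample header line (including its placeholder driver number) are made up for illustration.

```python
import re
from typing import Optional


def max_cuda_version(nvidia_smi_output: str) -> Optional[str]:
    """Return the value after 'CUDA Version:' in nvidia-smi output, or None."""
    match = re.search(r"CUDA Version:\s*([0-9]+(?:\.[0-9]+)?)", nvidia_smi_output)
    return match.group(1) if match else None


# Illustrative header line with placeholder driver numbers, not real output.
sample_header = "| NVIDIA-SMI 000.00   Driver Version: 000.00   CUDA Version: 11.6 |"
print(max_cuda_version(sample_header))
```

On a real machine the text could come from subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout instead of the sample string.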
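PyTorch environment
1. Download Anaconda
  - Tsinghua mirror
  - Official site
2. Create a virtual environment
  1. In a cmd window, run conda env list to see the existing virtual environments
  2. Run conda create -n env_name python=x.x
  3. Activate the new environment with conda activate env_name
3. Download and install a suitable version of PyTorch
  - In the activated environment, run conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
4. Verify that PyTorch installed correctly
  - In the virtual environment's terminal, enter the following in order
  - python # start the Python interpreter
  - import torch # import the torch package
  - torch.cuda.is_available() # check whether CUDA is usable; True means it is
  - torch.__version__ # show the torch version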
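
The interactive check above can also be kept as a small script so it is easy to rerun after changing drivers or environments. This is only a sketch: the function name describe_torch is my own, and it deliberately returns a result instead of crashing when torch is not installed.

```python
def describe_torch() -> dict:
    """Collect the same facts as the interactive check: version and CUDA status."""
    try:
        import torch  # only importable inside the environment where it was installed
    except ImportError:
        return {"installed": False}

    info = {
        "installed": True,
        "version": torch.__version__,
        "cuda_build": torch.version.cuda,  # CUDA version torch was built with, or None
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


print(describe_torch())
```

If cuda_available comes back False even though the driver is installed, a CPU-only build of PyTorch or a mismatched cudatoolkit version is a likely cause worth checking first.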
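YOLOv8 environment
1. Download the source code: git clone https://github.com/ultralytics/ultralytics.git
2. Install it inside the virtual environment: pip install ultralytics -i https://pypi.douban.com/simple/ (or cd to the directory containing setup.py, E:\code\ultralytics here, and run python setup.py install)
- Possible error: ERROR: Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
- A fix I found online was pip install PyYAML --ignore-installed, but installing ultralytics again still failed
- I later found that deleting the local PyYAML package first and then installing works

If PyYAML was installed with pip, delete PyYAML-6.0.1.dist-info under D:\Anaconda\Lib\site-packages
If it was installed with conda, delete pyyaml-3.13-py37hfa6e2cd_0 under D:\Anaconda\pkgs, and also delete the archive pyyaml-3.13-py37hfa6e2cd_0.tar

If the downloads above fail, use a proxy or find the corresponding packages online and install them manually.

You can also switch pip and conda to mirror sources under the Users directory and then run the install again.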

pip
- Create a pip folder under the user directory C:\Users\Administrator
- Create a pip.ini file in that folder and append the following to it

# This file has been autogenerated or modified by NVIDIA PyIndex.
# In case you need to modify your PIP configuration, please be aware that
# some configuration files may have a priority order. Here are the following 
# files that may exists in your machine by order of priority:
#
# [Priority 1] Site level configuration files
#	1. `D:\Anaconda\envs\python38\pip.ini`
#
# [Priority 2] User level configuration files
#	1. `C:\Users\Administrator\AppData\Roaming\pip\pip.ini`
#	2. `C:\Users\Administrator\pip\pip.ini`
#
# [Priority 3] Global level configuration files
#	1. `C:\ProgramData\pip\pip.ini`

[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple/
[install]
trusted-host = pypi.tuna.tsinghua.edu.cn
               pypi.ngc.nvidia.com
no-cache-dir = true
extra-index-url =
                  https://pypi.ngc.nvidia.com

conda
- Open C:\Users\Administrator\.condarc in Notepad
- Append the following to the end of the file

default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch-lts: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  simpleitk: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  deepmodeling: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/

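1. ultralytics\cfg\datasets\VOC.yaml dataset configuration

Make a copy of the file as ultralytics\cfg\datasets\VOC_my.yaml
Change the training, validation, and test dataset paths
Change the class names; double quotes are not needed (note that the class names must line up with the class indices used in the txt labels)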
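
To make the shape of the edited file concrete, the sketch below generates the text of a minimal dataset config. The keys path, train, val, test, and names are the ones used by the stock dataset YAML files; the helper name dataset_yaml, the directory layout, and the two class names are hypothetical placeholders, so substitute your own.

```python
from typing import Sequence


def dataset_yaml(root: str, train: str, val: str, test: str, names: Sequence[str]) -> str:
    """Build the text of a dataset YAML; class index i maps to names[i]."""
    lines = [
        f"path: {root}  # dataset root directory",
        f"train: {train}  # training images, relative to path",
        f"val: {val}  # validation images, relative to path",
        f"test: {test}  # test images, relative to path",
        "names:",
    ]
    # The index written here must equal the first number on each line of the txt labels.
    lines += [f"  {index}: {name}" for index, name in enumerate(names)]
    return "\n".join(lines) + "\n"


# Hypothetical paths and classes, for illustration only.
yaml_text = dataset_yaml("E:/data/my_voc", "images/train", "images/val", "images/test", ["cat", "dog"])
print(yaml_text)
```

Whatever order the names are listed in here has to be the same order used when the VOC annotations are converted to txt labels, otherwise the model trains on swapped classes.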

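2. ultralytics\cfg\models\v8\yolov8.yaml network structure configuration

Make a copy of the file, e.g. ultralytics\cfg\models\v8\yolov8l.yaml (section 3 below refers to the copy as yolov8_my.yaml)
Change the number of classes, nc
Change scales
Change the input/output parameters in backbone and head (the defaults are those of yolov8n)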
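
To see what the scales entry does to the backbone and head numbers, here is a rough sketch of the arithmetic. It is based on my reading of how the ultralytics model parser applies the three values [depth, width, max_channels], so treat it as an approximation and check it against your installed version; the function names are my own.

```python
import math

# Scale constants as listed in the stock yolov8.yaml:
# letter -> (depth multiple, width multiple, max channels)
SCALES = {
    "n": (0.33, 0.25, 1024),
    "s": (0.33, 0.50, 1024),
    "m": (0.67, 0.75, 768),
    "l": (1.00, 1.00, 512),
    "x": (1.00, 1.25, 512),
}


def make_divisible(value: float, divisor: int = 8) -> int:
    """Round up to the nearest multiple of divisor."""
    return math.ceil(value / divisor) * divisor


def scaled_layer(repeats: int, channels: int, scale: str) -> tuple:
    """Approximate the effective (repeats, output channels) of one layer."""
    depth, width, max_channels = SCALES[scale]
    n = max(round(repeats * depth), 1) if repeats > 1 else repeats
    c = make_divisible(min(channels, max_channels) * width)
    return n, c


print(scaled_layer(3, 128, "n"))   # a "3 repeats, 128 channels" row under scale n
print(scaled_layer(3, 1024, "l"))  # the same kind of row under scale l
```

Under these assumptions a row written as 3 repeats and 128 channels becomes 1 repeat and 32 channels for scale n, while 1024 channels is capped at 512 for scale l, which is why the numbers in the YAML do not need to be edited by hand for each model size.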

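3. ultralytics\cfg\default.yaml training configuration

Make a copy as ultralytics\cfg\default_copy.yaml (as a backup of the original)
Edit the parameters in ultralytics\cfg\default.yaml directly
Train settings
- Set model to the config path models\v8\yolov8_my.yaml
- Set data to the config path datasets\VOC_my.yaml
- Set epochs (number of passes over the whole dataset)
- Set batch (number of images per batch, i.e. per weight update)
- Set imgsz (size of the input training images)
- Set device (GPU index such as 0, 1, ... or cpu)
- Set pretrained (whether to use a pretrained model; a path to pretrained weights can be given)
- Set optimizer (choose a suitable optimizer, or keep the default automatic selection)
- single_cls (train multi-class data as a single class)
Val/Test settings
- Set max_det (maximum number of detections per image)
Prediction settings
- show_labels/hide_labels (whether to show labels)
- show_conf/hide_conf (whether to show confidence scores)
- line_width (line width of the bounding boxes)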
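
The same settings can be passed as key=value overrides on the yolo command line instead of being edited into default.yaml, as the commented-out command in section 5 does. The sketch below only assembles such a command string from a dict; build_yolo_command is a hypothetical helper, and the values shown are examples rather than recommendations.

```python
from typing import Mapping


def build_yolo_command(overrides: Mapping[str, object]) -> str:
    """Join key=value overrides into a single yolo CLI command string."""
    args = " ".join(f"{key}={value}" for key, value in overrides.items())
    return f"yolo {args}"


# Example values only; adjust paths and numbers to your own setup.
command = build_yolo_command({
    "task": "detect",
    "mode": "train",
    "model": "ultralytics/cfg/models/v8/yolov8_my.yaml",
    "data": "ultralytics/cfg/datasets/VOC_my.yaml",
    "epochs": 100,
    "batch": 2,
    "device": 0,
})
print(command)
```

Keeping the overrides in one place like this leaves default.yaml untouched, which makes it easier to compare runs and to pull upstream changes later.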

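4. Converting VOC image annotations to YOLO format

if not isinstance(_file_extension, list): # check whether _file_extension is a list
os.path.splitext(path) # split the path; returns a tuple (root, ext): ('everything before the last dot', '.ext including the dot')
os.path.split(path)	# split the path into dirname and basename; returns a tuple ('path before the last slash', 'file name after the last slash, including its extension')
...

File layout: images and annotation files are placed directly in the same folder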
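
The conversion script itself is elided above, so the following is not that script but a minimal sketch of the core step, assuming standard Pascal VOC XML (size/width, size/height, and object/bndbox with xmin, ymin, xmax, ymax). The function names and the sample annotation are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List, Sequence


def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Pixel corner box -> normalized (x_center, y_center, width, height)."""
    return (
        (xmin + xmax) / 2 / img_w,
        (ymin + ymax) / 2 / img_h,
        (xmax - xmin) / img_w,
        (ymax - ymin) / img_h,
    )


def voc_xml_to_yolo_lines(xml_text: str, class_names: Sequence[str]) -> List[str]:
    """Convert one VOC annotation into YOLO label lines: 'class cx cy w h'."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)
    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text.strip()
        if name not in class_names:
            continue  # skip classes that are not in the training list
        box = obj.find("bndbox")
        corners = [float(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
        values = voc_box_to_yolo(*corners, img_w, img_h)
        class_id = class_names.index(name)  # index must match the dataset YAML order
        lines.append(" ".join([str(class_id)] + [f"{v:.6f}" for v in values]))
    return lines


# Made-up annotation for a 1000x800 image with a single object.
SAMPLE_XML = """<annotation>
  <size><width>1000</width><height>800</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>300</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""

print(voc_xml_to_yolo_lines(SAMPLE_XML, ["cat", "dog"]))
# -> ['1 0.200000 0.375000 0.200000 0.250000']
```

Each returned line would be written to a txt file that shares the image's base name, which is where os.path.splitext from the notes above comes in.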

5. Training

Run the following command in the project terminal

yolo cfg=ultralytics/cfg/default.yaml
# yolo task=detect mode=train model=E:\code\github\ultralytics\ultralytics\cfg\models\v8\yolov8_my.yaml data=E:\code\github\ultralytics\ultralytics\cfg\datasets\VOC_my.yaml epochs=100 batch=2

To monitor GPU usage during training:
Windows: nvidia-smi -l t, where t is the refresh interval in seconds
Ubuntu: watch -n 1 nvidia-smi

6. Bugs

6.1 box_loss and cls_loss are NaN during training, and P, R, and mAP are all 0

Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
2/100     0.679G        nan        nan        nan          3       1024: 100%|██████████| 543/543 [03:39<00:00,  2.47it/s]
            Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 272/272 [00:43<00:00,  6.31it/s]
              all       1086       4304          0          0          0          0

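In \ultralytics\ultralytics\cfg\default.yaml
- Set amp to False: amp: False # (bool) Automatic Mixed Precision (AMP) training, choices=[True, False], True runs AMP check
- After this change the losses showed numeric values, but P, R, mAP and the other metrics were still all 0.
Judging from comments online, this may be related to the GPU and the CUDA version; after switching to a different environment for training, the problem went away
- Original environment: Windows, 1650 GPU, CUDA 11.6, PyTorch 1.12.1
- Current environment: Ubuntu 20.04, CUDA 11.6, PyTorch 1.12.1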
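
Because the metrics stayed at 0 even after the losses recovered, one inexpensive thing to rule out before switching machines is malformed label files. This is an assumption on my part and was not the cause identified here; the sketch below simply checks that every label line has five fields, a class id below the class count, and coordinates within [0, 1]. The function name find_label_problems is hypothetical.

```python
from typing import List


def find_label_problems(label_text: str, num_classes: int) -> List[str]:
    """Return human-readable problems found in the text of one YOLO label file."""
    problems = []
    for line_no, line in enumerate(label_text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are ignored
        fields = line.split()
        if len(fields) != 5:
            problems.append(f"line {line_no}: expected 5 fields, got {len(fields)}")
            continue
        try:
            class_id = int(fields[0])
            coords = [float(value) for value in fields[1:]]
        except ValueError:
            problems.append(f"line {line_no}: non-numeric value")
            continue
        if not 0 <= class_id < num_classes:
            problems.append(f"line {line_no}: class id {class_id} outside 0..{num_classes - 1}")
        # The comparison is also False for NaN, so NaN coordinates are reported too.
        if any(not 0.0 <= value <= 1.0 for value in coords):
            problems.append(f"line {line_no}: coordinate outside [0, 1]")
    return problems


print(find_label_problems("0 0.5 0.5 0.2 0.2\n3 0.5 0.5 1.2 0.2", num_classes=2))
```

An empty list for every label file does not prove the data is correct, but a non-empty one points to a conversion problem that no change of GPU or CUDA version would fix.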
