Summary of solutions to errors encountered when reproducing the GANet lane detection network

Table of contents

AttributeError: module 'distutils' has no attribute 'version'

RuntimeError: Distributed package doesn't have NCCL built in | PyTorch pitfalls

"import torch" on Windows fails with "OSError: [WinError 1455] The paging file is too small for this operation to complete"

train.py: AttributeError: 'DataContainer' object has no attribute 'type'

PermissionError: [Errno 13] Permission denied: 'C:\\Users\\DINGZH~1\\AppData\\Local\\Temp\\tmpmn2vq6tw\\tmpxi4eg_tv.py'


AttributeError: module 'distutils' has no attribute 'version'

This error is raised when training starts.

Cause: the installed setuptools version is too new.

Solution: downgrade setuptools. Recommended version: setuptools 57.5.0

pip uninstall setuptools
pip install setuptools==57.5.0  # must be lower than the version you had installed before
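To confirm which setuptools version is active before and after the downgrade, a small check along these lines may help. It is only a sketch: the helper names are hypothetical, and treating 57.5.0 as the reference point simply mirrors the recommendation above rather than any documented threshold.

```python
RECOMMENDED = (57, 5, 0)  # version recommended in this post, not an official limit


def version_tuple(text):
    """Turn '57.5.0' into (57, 5, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_newer_than_recommended(installed):
    """True if the installed version is above the one recommended here."""
    return version_tuple(installed) > RECOMMENDED


if __name__ == "__main__":
    try:
        import setuptools
    except ImportError:
        print("setuptools is not installed in this environment")
    else:
        installed = setuptools.__version__
        print(installed, "newer than recommended:", is_newer_than_recommended(installed))
```

If the script reports a newer version, the two pip commands above are the fix described in this post.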

Note: The errors below occur on Windows.

RuntimeError: Distributed package doesn't have NCCL built in | PyTorch pitfalls

While reproducing the GANet lane detection network on Windows, the following error occurred:

raise RuntimeError("Distributed package doesn't have NCCL "
RuntimeError: Distributed package doesn't have NCCL built in

Reason: NCCL is not supported on Windows, so the backend should be changed to gloo.

Solution: In distributed_c10d.py (part of the installed torch package, under torch\distributed), add the following line directly below prefix_store = PrefixStore(group_name, store):

backend = "gloo"

The modified snippet is as follows:

prefix_store = PrefixStore(group_name, store)
backend = "gloo"  # added line: force the gloo backend on Windows
if backend == Backend.GLOO:
    pg = ProcessGroupGloo(
        prefix_store,
        rank,
        world_size,
        timeout=timeout)
    _pg_map[pg] = (Backend.GLOO, store)
    _pg_names[pg] = group_name
elif backend == Backend.NCCL:
    if not is_nccl_available():
        raise RuntimeError("Distributed package doesn't have NCCL "
                           "built in")

Other fixes found online also involve setting backend='gloo', but the method above is the one that worked for me.
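A less invasive variant of the same idea is to pick the backend in your own launch code instead of editing the installed PyTorch file. The sketch below shows only the selection logic. The helper name choose_backend is hypothetical, and whether GANet's training scripts let you pass the result through is an assumption this post does not confirm. The returned string is the kind of value torch.distributed.init_process_group accepts for its backend argument.

```python
import sys


def choose_backend(platform=sys.platform):
    """Return a torch.distributed backend name for the given platform string."""
    # NCCL is not available on Windows (see the Reason above), so use gloo there.
    if platform.startswith("win"):
        return "gloo"
    return "nccl"


if __name__ == "__main__":
    print("backend for this machine:", choose_backend())
```

Keeping the decision in one small function avoids hard-coding "gloo" on machines where NCCL does work.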

"import torch" on Windows fails with "OSError: [WinError 1455] The paging file is too small for this operation to complete"

Solution: In mmdet\datasets\builder.py, find where num_workers is assigned and set it to 0:

    """
    rank, world_size = get_dist_info()
    if dist:
        # DistributedGroupSampler will definitely shuffle the data to satisfy
        # that images on each GPU are in the same group
        if shuffle:
            sampler = DistributedGroupSampler(dataset, samples_per_gpu,
                                              world_size, rank)
        else:
            sampler = DistributedSampler(
                dataset, world_size, rank, shuffle=False)
        batch_size = samples_per_gpu
        num_workers = workers_per_gpu
    else:
        sampler = GroupSampler(dataset, samples_per_gpu) if shuffle else None
        batch_size = num_gpus * samples_per_gpu
        # num_workers = num_gpus * workers_per_gpu
    # Workaround: force single-process data loading in both branches
    num_workers = 0

    init_fn = partial(
        worker_init_fn, num_workers=num_workers, rank=rank,
        seed=seed) if seed is not None else None
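The same effect can probably be reached without editing the installed mmdet source, because the excerpt above derives num_workers from workers_per_gpu in both branches. The sketch below mirrors that arithmetic in a standalone function to show that workers_per_gpu = 0 gives num_workers = 0 either way. The name resolve_num_workers is hypothetical, not an mmdet API. It also assumes that the GANet config exposes workers_per_gpu in its data settings, as mmdet-style configs usually do.

```python
def resolve_num_workers(dist, num_gpus, workers_per_gpu):
    """Mirror how the builder.py excerpt derives num_workers before the override."""
    if dist:
        # Distributed training: each process gets its own loader.
        return workers_per_gpu
    # Non-distributed training: one loader feeds all GPUs.
    return num_gpus * workers_per_gpu


if __name__ == "__main__":
    for dist in (True, False):
        workers = resolve_num_workers(dist, num_gpus=1, workers_per_gpu=0)
        print("dist =", dist, "-> num_workers =", workers)
```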

If that doesn't work, you may need to adjust the page file size.

Reference: Solving "OSError: [WinError 1455] The page file is too small and the operation cannot be completed" in PyCharm - Program matters - Blog Park (cnblogs.com)

train.py: AttributeError: 'DataContainer' object has no attribute 'type'

This problem stumped me for two days. My CUDA version is 11.2, so torch 1.6.0 could not be used. torch 1.8.0 through 1.10 did not work either. In the end, torch 1.7.0 with CUDA 11.0 ran successfully.
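To see quickly whether an environment matches the combination that worked here, a check like the following can be run inside it. This is a sketch only: the function name is hypothetical, and the "working" values are just the ones reported in this post, not a compatibility guarantee.

```python
WORKING_TORCH = "1.7.0"  # combination reported to work in this post
WORKING_CUDA = "11.0"


def matches_reported_combo(torch_version, cuda_version):
    """Compare a torch/CUDA version pair against the combination reported above."""
    # torch.__version__ may carry a local suffix such as "+cu110"; drop it.
    base = torch_version.split("+")[0]
    return base == WORKING_TORCH and cuda_version == WORKING_CUDA


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        # torch.version.cuda is None for CPU-only builds.
        print(torch.__version__, torch.version.cuda,
              matches_reported_combo(torch.__version__, torch.version.cuda))
```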

PermissionError: [Errno 13] Permission denied: 'C:\\Users\\DINGZH~1\\AppData\\Local\\Temp\\tmpmn2vq6tw\\tmpxi4eg_tv.py'

Traceback (most recent call last):
  File "e:/codingprogram/project/GANet/lane_application/lane_detection.py", line 28, in <module>
    model,data_loader,show_dst,args = load_model(config_file, checkpoint_file, device=device)
  File "e:/codingprogram/project/GANet/lane_application/lane_detection.py", line 14, in load_model
    model,data_loader,show_dst,args = load(image_path,save_path)
  File "e:\codingprogram\project\ganet\tools\ganet\culane\test_dataset.py", line 309, in load
    cfg = mmcv.Config.fromfile(args.config)
  File "D:\software\Anaconda\envs\ganet\lib\site-packages\mmcv\utils\config.py", line 165, in fromfile
    cfg_dict, cfg_text = Config._file2dict(filename)
  File "D:\software\Anaconda\envs\ganet\lib\site-packages\mmcv\utils\config.py", line 92, in _file2dict
    osp.join(temp_config_dir, temp_config_name))
  File "D:\software\Anaconda\envs\ganet\lib\shutil.py", line 121, in copyfile
    with open(dst, 'wb') as fdst:
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\DINGZH~1\\AppData\\Local\\Temp\\tmpmn2vq6tw\\tmpxi4eg_tv.py'

Solution:

See: mmcv _file2dict shutil.copyfile PermissionError: [Errno 13] Permission denied · Issue #2926 · open-mmlab/mmdetection (github.com)

In short, the fix described there is to replace one line in mmcv\utils\config.py, the file named in the traceback above.
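The replacement line itself is in the linked issue rather than in this post, so the following is not the actual mmcv patch. It is a sketch of the general pattern behind this kind of failure: on Windows, a file created with tempfile.NamedTemporaryFile cannot be opened a second time by name while its original handle is still open, so copying onto it with shutil.copyfile raises PermissionError. Closing the handle first avoids that. The function name copy_to_temp_module is hypothetical.

```python
import os
import shutil
import tempfile


def copy_to_temp_module(src_path):
    """Copy src_path onto a temporary .py file and return the copied text."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # delete=False keeps the file on disk after close(); the directory
        # cleanup removes it when the with-block ends.
        tmp = tempfile.NamedTemporaryFile(dir=tmp_dir, suffix=".py", delete=False)
        tmp.close()  # release the handle so the file can be reopened by name
        shutil.copyfile(src_path, tmp.name)
        with open(tmp.name, "r", encoding="utf-8") as fh:
            return fh.read()


if __name__ == "__main__":
    fd, path = tempfile.mkstemp(suffix=".py")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write("model = dict(type='demo')\n")
    try:
        print(copy_to_temp_module(path))
    finally:
        os.remove(path)
```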

Origin blog.csdn.net/zhaodongdz/article/details/128368386