Table of contents
AttributeError: module 'distutils' has no attribute 'version'
RuntimeError: Distributed package doesn't have NCCL built in | PyTorch pitfalls
train.py: AttributeError: 'DataContainer' object has no attribute 'type'
AttributeError: module 'distutils' has no attribute 'version'
This error appears when you start training the model.
Fix: the installed setuptools version is too high; downgrade it.
Recommended version: setuptools 57.5.0
pip uninstall setuptools
pip install setuptools==57.5.0  # must be lower than the version you had before
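After reinstalling, you can confirm which setuptools version is actually active in your environment; a quick standard-library check (works on Python 3.8+):

```python
from importlib import metadata

# Print the installed setuptools version; it should show 57.5.0
# if the downgrade took effect in the active environment.
print(metadata.version("setuptools"))
```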
Note: the errors below occur under Windows.
RuntimeError: Distributed package doesn't have NCCL built in | PyTorch pitfalls
While reproducing the GANet lane-detection network on Windows, the following error occurred:
raise RuntimeError("Distributed package doesn't have NCCL "
RuntimeError: Distributed package doesn't have NCCL built in
Reason: Windows does not support NCCL; the gloo backend should be used instead.
Solution: in distributed_c10d.py (under torch/distributed/), add one line below prefix_store = PrefixStore(group_name, store):
backend = "gloo"
The modified snippet is as follows:
prefix_store = PrefixStore(group_name, store)
backend = "gloo"  # added line: force the gloo backend on Windows
if backend == Backend.GLOO:
    pg = ProcessGroupGloo(
        prefix_store,
        rank,
        world_size,
        timeout=timeout)
    _pg_map[pg] = (Backend.GLOO, store)
    _pg_names[pg] = group_name
elif backend == Backend.NCCL:
    if not is_nccl_available():
        raise RuntimeError("Distributed package doesn't have NCCL "
                           "built in")
Other posts online suggest adding backend='gloo' elsewhere in this file, but the change above is what worked for me.
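Rather than patching the installed PyTorch source, it is often possible to avoid the error by requesting gloo explicitly when the process group is created. A minimal single-process sketch (the MASTER_ADDR/MASTER_PORT values are placeholders for a local run):

```python
import os
import torch.distributed as dist

# Placeholder rendezvous settings for a single-process demo
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

# Request the gloo backend explicitly -- it is the one that works on Windows
dist.init_process_group(backend="gloo", rank=0, world_size=1)
print(dist.get_backend())
dist.destroy_process_group()
```

In mmdetection-style projects the backend usually comes from the config file's dist_params (e.g. dict(backend='nccl')), so changing that to 'gloo' may achieve the same effect at the config level; the source patch above is what I actually verified.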
"Import torch" under Windows reports an error: "OSError: [WinError 1455] The page file is too small and the operation cannot be completed."
Solution: in mmdet\datasets\builder.py, find num_workers and set it to 0. The relevant part of build_dataloader after the change:
rank, world_size = get_dist_info()
if dist:
    # DistributedGroupSampler will definitely shuffle the data to satisfy
    # that images on each GPU are in the same group
    if shuffle:
        sampler = DistributedGroupSampler(dataset, samples_per_gpu,
                                          world_size, rank)
    else:
        sampler = DistributedSampler(
            dataset, world_size, rank, shuffle=False)
    batch_size = samples_per_gpu
    num_workers = workers_per_gpu
else:
    sampler = GroupSampler(dataset, samples_per_gpu) if shuffle else None
    batch_size = num_gpus * samples_per_gpu
    # num_workers = num_gpus * workers_per_gpu
    num_workers = 0
init_fn = partial(
    worker_init_fn, num_workers=num_workers, rank=rank,
    seed=seed) if seed is not None else None
If that doesn't work, you may also need to increase the Windows page file (virtual memory) size.
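The effect of the change above, in plain PyTorch terms: with num_workers=0 the DataLoader reads data in the main process, so no worker processes are spawned (on Windows, each worker re-imports torch and maps its DLLs, which is what exhausts the page file). A minimal sketch:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(10).float())

# num_workers=0: data is loaded in the main process, so Windows does not
# spawn worker processes that each re-import torch and map its DLLs.
loader = DataLoader(dataset, batch_size=4, num_workers=0)

batches = [batch for (batch,) in loader]
print(len(batches))  # 10 samples / batch_size 4 -> 3 batches
```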
train.py: AttributeError: 'DataContainer' object has no attribute 'type'
This problem confused me for two days. My CUDA version is 11.2, so torch 1.6.0 cannot be used. Versions 1.8.0 through 1.10 also failed; finally, torch 1.7.0 + cuda 11.0 ran successfully.
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\DINGZH~1\\AppData\\Local\\Temp\\tmpmn2vq6tw\\tmpxi4eg_tv.py'
Traceback (most recent call last):
File "e:/codingprogram/project/GANet/lane_application/lane_detection.py", line 28, in <module>
model,data_loader,show_dst,args = load_model(config_file, checkpoint_file, device=device)
File "e:/codingprogram/project/GANet/lane_application/lane_detection.py", line 14, in load_model
model,data_loader,show_dst,args = load(image_path,save_path)
File "e:\codingprogram\project\ganet\tools\ganet\culane\test_dataset.py", line 309, in load
cfg = mmcv.Config.fromfile(args.config)
File "D:\software\Anaconda\envs\ganet\lib\site-packages\mmcv\utils\config.py", line 165, in fromfile
cfg_dict, cfg_text = Config._file2dict(filename)
File "D:\software\Anaconda\envs\ganet\lib\site-packages\mmcv\utils\config.py", line 92, in _file2dict
osp.join(temp_config_dir, temp_config_name))
File "D:\software\Anaconda\envs\ganet\lib\shutil.py", line 121, in copyfile
with open(dst, 'wb') as fdst:
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\DINGZH~1\\AppData\\Local\\Temp\\tmpmn2vq6tw\\tmpxi4eg_tv.py'
Solution:
In short, replace the offending temporary-file line in mmcv's utils/config.py.
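Some context for the fix, since the note above is brief: the PermissionError comes from Windows file locking. mmcv's Config._file2dict copies the config into a NamedTemporaryFile, and on Windows a temporary file that is still open (the default, delete=True) cannot be opened again by shutil.copyfile. A common workaround, sketched here as the general pattern rather than the exact mmcv patch, is to create the temp file with delete=False, close it before copying, and remove it manually:

```python
import os
import shutil
import tempfile

# delete=False lets the file be closed and then reopened by name; with the
# default delete=True, Windows keeps the open file locked and shutil.copyfile
# raises PermissionError: [Errno 13].
src = tempfile.NamedTemporaryFile(suffix=".py", delete=False)
src.write(b"cfg = dict(key='value')\n")
src.close()  # close first: a closed file can be reopened on Windows

dst = src.name + ".copy"
shutil.copyfile(src.name, dst)  # no PermissionError now

with open(dst) as f:
    print(f.read().strip())

# Manual cleanup, since delete=False disables automatic removal
os.remove(src.name)
os.remove(dst)
```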