Real-time Scene Text Detection with Differentiable Binarization: Issue Log

Official implementation: https://github.com/MhLiao/DB
Zhou Jun's PyTorch implementation: https://github.com/WenmuZhou/DBNet.pytorch

1. Official

Following the official installation instructions, setup was easy. My environment is Ubuntu 16.04 + CUDA 8, where I have been using PyTorch 1.0.1 (py3.7). Training does run, but inference with the trained model returns nothing: every output txt file is empty, and the images in the visualize folder have no boxes drawn on them. The loss does not converge:

[INFO] [2020-01-18 16:24:09,584] step:   1340, epoch:   0, loss: 4.332346, lr: 0.007000
[INFO] [2020-01-18 16:24:09,585] bce_loss: 0.568492
[INFO] [2020-01-18 16:24:09,585] thresh_loss: 0.563487
[INFO] [2020-01-18 16:24:09,586] l1_loss: 0.092640
[INFO] [2020-01-18 16:24:19,117] step:   1360, epoch:   0, loss: 4.255758, lr: 0.007000
[INFO] [2020-01-18 16:24:19,120] bce_loss: 0.544069
[INFO] [2020-01-18 16:24:19,122] thresh_loss: 0.539020
[INFO] [2020-01-18 16:24:19,124] l1_loss: 0.099640
[INFO] [2020-01-18 16:24:28,766] step:   1380, epoch:   0, loss: 4.507674, lr: 0.007000
[INFO] [2020-01-18 16:24:28,767] bce_loss: 0.560643
[INFO] [2020-01-18 16:24:28,768] thresh_loss: 0.652172
[INFO] [2020-01-18 16:24:28,768] l1_loss: 0.105229

The same happens when training on the ICDAR 2015 (ic15) dataset, and I do not know where the problem lies. I will come back to this later.
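Three log entries are not much to judge convergence from, so pulling the step and total loss out of a longer log makes the trend easier to see. The following is a minimal sketch, assuming the log keeps the exact line format shown above; the regular expression and the function name parse_losses are my own, not part of the DB repository.

```python
import re

# Matches the summary lines, e.g.
# "... step:   1340, epoch:   0, loss: 4.332346, lr: 0.007000"
# The per-component lines (bce_loss, thresh_loss, l1_loss) have no "step:" and are skipped.
LOSS_LINE = re.compile(
    r"step:\s*(\d+), epoch:\s*(\d+), loss: ([0-9.]+), lr: ([0-9.]+)"
)


def parse_losses(log_text):
    """Return a list of (step, total_loss) pairs found in the log text."""
    records = []
    for line in log_text.splitlines():
        match = LOSS_LINE.search(line)
        if match:
            records.append((int(match.group(1)), float(match.group(3))))
    return records


sample = """\
[INFO] [2020-01-18 16:24:09,584] step:   1340, epoch:   0, loss: 4.332346, lr: 0.007000
[INFO] [2020-01-18 16:24:09,585] bce_loss: 0.568492
[INFO] [2020-01-18 16:24:19,117] step:   1360, epoch:   0, loss: 4.255758, lr: 0.007000
[INFO] [2020-01-18 16:24:28,766] step:   1380, epoch:   0, loss: 4.507674, lr: 0.007000
"""

records = parse_losses(sample)
for step, loss in records:
    print(step, loss)
```

Run over the whole training log, the resulting pairs can be plotted or averaged per epoch to see whether the loss is really flat or just noisy.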

2. Unofficial

Installation went through and training does run, but the first thing it prints is: DBNet.pytorch INFO: train with device cpu and pytorch 1.3.0
My machine does not have the CUDA 10 that PyTorch 1.3 needs, so training ran on the CPU and was very slow.
Later I saw in the discussion group that someone was running it with PyTorch 1.1.0, although on CUDA 10. I installed 1.1.0 as well, but training then failed with all kinds of errors. I gave up at that point and came back to it later.
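When training silently falls back to the CPU like this, it helps to check which CUDA version the installed PyTorch build targets before digging further. Below is a minimal sketch; the function name describe_torch_build is made up for illustration, and the import is guarded so the snippet also runs where PyTorch is not installed.

```python
def describe_torch_build():
    """Report the installed PyTorch version and its CUDA support, if any."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "torch_version": torch.__version__,
        # CUDA version the binary was built against; None for CPU-only builds
        "built_for_cuda": torch.version.cuda,
        # True only if a usable GPU and a compatible driver are present
        "cuda_available": torch.cuda.is_available(),
    }


info = describe_torch_build()
print(info)
```

If built_for_cuda shows a CUDA version but cuda_available is False, the build and the local driver or toolkit probably do not match, which would be consistent with the CPU fallback above.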
Along the way I came to appreciate Anaconda more and more: inside a virtual environment, running conda list shows the version of every installed library.

_libgcc_mutex             0.1                        main  
absl-py                   0.9.0                     <pip>
anyconfig                 0.9.10                    <pip>
backcall                  0.1.0                    py36_0  
blas                      1.0                         mkl  
ca-certificates           2019.11.27                    0  
cachetools                4.0.0                     <pip>
certifi                   2019.11.28               py36_0  
cffi                      1.13.2           py36h2e261b9_0  
chardet                   3.0.4                     <pip>
cudatoolkit               8.0                           3  
cycler                    0.10.0                    <pip>
decorator                 4.4.1                      py_0  
freetype                  2.9.1                h8a8886c_1  
future                    0.18.2                    <pip>
google-auth               1.10.1                    <pip>
google-auth-oauthlib      0.4.1                     <pip>
grpcio                    1.26.0                    <pip>
idna                      2.8                       <pip>
imageio                   2.6.1                     <pip>
imgaug                    0.3.0                     <pip>
intel-openmp              2019.4                      243  
ipython                   7.11.1           py36h39e3cac_0  
ipython_genutils          0.2.0                    py36_0  
jedi                      0.15.2                   py36_0  
jpeg                      9b                   h024ee3a_2  
kiwisolver                1.1.0                     <pip>
ld_impl_linux-64          2.33.1               h53a641e_7  
libedit                   3.1.20181209         hc058e9b_0  
libffi                    3.2.1                hd88cf55_4  
libgcc-ng                 9.1.0                hdf63c60_0  
libgfortran-ng            7.3.0                hdf63c60_0  
libpng                    1.6.37               hbc83047_0  
libstdcxx-ng              9.1.0                hdf63c60_0  
libtiff                   4.1.0                h2733197_0  
Markdown                  3.1.1                     <pip>
matplotlib                3.1.2                     <pip>
mkl                       2019.4                      243  
mkl-service               2.3.0            py36he904b0f_0  
mkl_fft                   1.0.15           py36ha843d7b_0  
mkl_random                1.1.0            py36hd6b4f25_0  
natsort                   7.0.0                     <pip>
ncurses                   6.1                  he6710b0_1  
networkx                  2.4                       <pip>
ninja                     1.9.0            py36hfd86e86_0  
numpy                     1.18.1           py36h4f9e942_0  
numpy                     1.17.4                    <pip>
numpy-base                1.18.1           py36hde5b4d6_0  
oauthlib                  3.1.0                     <pip>
olefile                   0.46                       py_0  
opencv-python             4.1.2.30                  <pip>
opencv-python-headless    4.1.2.30                  <pip>
openssl                   1.1.1d               h7b6447c_3  
parso                     0.5.2                      py_0  
pexpect                   4.7.0                    py36_0  
pickleshare               0.7.5                    py36_0  
Pillow                    6.2.2                     <pip>
pillow                    7.0.0            py36hb39fc2d_0  
pip                       19.3.1                   py36_0  
Polygon3                  3.0.8                     <pip>
prompt_toolkit            3.0.2                      py_0  
protobuf                  3.11.2                    <pip>
ptyprocess                0.6.0                    py36_0  
pyasn1                    0.4.8                     <pip>
pyasn1-modules            0.2.8                     <pip>
pyclipper                 1.1.0.post3               <pip>
pycparser                 2.19                       py_0  
pygments                  2.5.2                      py_0  
pyparsing                 2.4.6                     <pip>
python                    3.6.10               h0371630_0  
python-dateutil           2.8.1                     <pip>
pytorch                   1.0.1           py3.6_cuda8.0.61_cudnn7.1.2_2    pytorch
PyWavelets                1.1.1                     <pip>
PyYAML                    5.2                       <pip>
readline                  7.0                  h7b6447c_5  
requests                  2.22.0                    <pip>
requests-oauthlib         1.3.0                     <pip>
rsa                       4.0                       <pip>
scikit-image              0.16.2                    <pip>
scipy                     1.4.1                     <pip>
setuptools                44.0.0                   py36_0  
Shapely                   1.6.4.post2               <pip>
six                       1.13.0                   py36_0  
sqlite                    3.30.1               h7b6447c_0  
tensorboard               2.1.0                     <pip>
tensorboardX              1.8                       <pip>
tk                        8.6.8                hbc83047_0  
torch                     1.1.0                     <pip>
torchvision               0.2.1                     <pip>
torchvision               0.2.2                      py_3    pytorch
tqdm                      4.40.1                    <pip>
traitlets                 4.3.3                    py36_0  
urllib3                   1.25.7                    <pip>
wcwidth                   0.1.7                    py36_0  
Werkzeug                  0.16.0                    <pip>
wheel                     0.33.6                   py36_0  
xz                        5.2.4                h14c3975_4  
zlib                      1.2.11               h7b6447c_3  
zstd                      1.3.7                h0b5b093_0  
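The listing above contains several packages twice, once from conda and once from pip (numpy, Pillow/pillow, torchvision). A short script can flag these automatically. This is a sketch under the assumption that each line has the name, version and build columns shown above; find_duplicates is a hypothetical helper name. Note that matching by name will not catch torch (pip, 1.1.0) versus pytorch (conda, 1.0.1), which are the same library under two package names, so from the listing alone it is unclear which one Python actually imports.

```python
from collections import defaultdict


def find_duplicates(conda_list_text):
    """Return packages listed more than once, compared case-insensitively."""
    seen = defaultdict(list)
    for line in conda_list_text.splitlines():
        fields = line.split()
        # Skip header comments and anything without name/version/build columns
        if line.startswith("#") or len(fields) < 3:
            continue
        name, version, build = fields[0], fields[1], fields[2]
        source = "pip" if build == "<pip>" else "conda"
        seen[name.lower()].append((version, source))
    return {name: entries for name, entries in seen.items() if len(entries) > 1}


sample = """\
numpy                     1.18.1           py36h4f9e942_0
numpy                     1.17.4                    <pip>
Pillow                    6.2.2                     <pip>
pillow                    7.0.0            py36hb39fc2d_0
torchvision               0.2.1                     <pip>
torchvision               0.2.2                      py_3    pytorch
tqdm                      4.40.1                    <pip>
"""

duplicates = find_duplicates(sample)
for name, entries in duplicates.items():
    print(name, entries)
```

For a real environment, the output of conda list can be saved to a file and passed to find_duplicates as a string.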

To install a specific version of a package: pip install tensorboardX==1.8
Without a version specifier, pip installs the latest release by default.
Alternatively, pip install 'tensorboardX<1.9' installs a version lower than 1.9.
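To confirm which version actually ended up in the environment and whether it satisfies the <1.9 bound, a small check like the one below may help. It is only a sketch: the helper names are made up for illustration, importlib.metadata needs Python 3.8 or newer (on the Python 3.6 environment above, pip show tensorboardX gives the same information), and the parser only understands plain dotted numbers.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.8' or '1.1.0.post3' into a tuple of its leading integer parts."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'post3'
        parts.append(int(piece))
    return tuple(parts)


def is_below(installed, upper_bound):
    """True if the installed version is strictly lower than the bound."""
    return parse_version(installed) < parse_version(upper_bound)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("tensorboardX")
if found is None:
    print("tensorboardX is not installed")
else:
    print("tensorboardX", found, "below 1.9:", is_below(found, "1.9"))
```

Comparing tuples of integers rather than raw strings matters here: as strings, "1.10" would sort before "1.9".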
Training produced two errors:

2020-01-18 16:23:24,753 DBNet.pytorch ERROR: Traceback (most recent call last):
  File "/data_1/Yang/project/2019/project/DBNet.pytorch/DBNet.pytorch-master/base/base_trainer.py", line 77, in __init__
    self.writer.add_graph(self.model, dummy_input)
  File "/data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages/tensorboardX/writer.py", line 774, in add_graph
    self._get_file_writer().add_graph(graph(model, input_to_model, verbose, **kwargs))
  File "/data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 292, in graph
    list_of_nodes, node_stats = parse(graph, args)
  File "/data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 227, in parse
    if node.debugName() == 'self':
AttributeError: 'torch._C.Value' object has no attribute 'debugName'

2020-01-18 16:23:24,753 DBNet.pytorch WARNING: add graph to tensorboard failed
2020-01-18 16:23:24,756 DBNet.pytorch INFO: train dataset has 889 samples,297 in dataloader, validate dataset has 111 samples,111 in dataloader
Traceback (most recent call last):
  File "tools/train.py", line 74, in <module>
    main(config)
  File "tools/train.py", line 58, in main
    trainer.train()
  File "/data_1/Yang/project/2019/project/DBNet.pytorch/DBNet.pytorch-master/base/base_trainer.py", line 103, in train
    self.epoch_result = self._train_epoch(epoch)
  File "/data_1/Yang/project/2019/project/DBNet.pytorch/DBNet.pytorch-master/trainer/trainer.py", line 46, in _train_epoch
    for i, batch in enumerate(self.train_loader):
  File "/data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "/data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "/data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
TypeError: 'NoneType' object is not callable

First, this one:
  File "/data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 227, in parse
    if node.debugName() == 'self':
AttributeError: 'torch._C.Value' object has no attribute 'debugName'

This looked like a tensorboardX version mismatch, and a Baidu search confirmed it: the advice was to use version 1.8. conda list showed that I had 1.9, so I ran:
pip install tensorboardX==1.8
which printed:
Requirement already satisfied: six in /data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages (from tensorboardX==1.8) (1.13.0)
Requirement already satisfied: protobuf>=3.2.0 in /data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages (from tensorboardX==1.8) (3.11.2)
Requirement already satisfied: numpy in /data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages (from tensorboardX==1.8) (1.17.4)
Requirement already satisfied: setuptools in /data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages (from protobuf>=3.2.0->tensorboardX==1.8) (44.0.0.post20200106)
Installing collected packages: tensorboardX
Found existing installation: tensorboardX 1.9
Uninstalling tensorboardX-1.9:
Successfully uninstalled tensorboardX-1.9
Successfully installed tensorboardX-1.8

pip automatically uninstalled 1.9 and installed 1.8.

After retraining, only the second of the two errors remained:

Traceback (most recent call last):
  File "tools/train.py", line 74, in <module>
    main(config)
  File "tools/train.py", line 58, in main
    trainer.train()
  File "/data_1/Yang/project/2019/project/DBNet.pytorch/DBNet.pytorch-master/base/base_trainer.py", line 103, in train
    self.epoch_result = self._train_epoch(epoch)
  File "/data_1/Yang/project/2019/project/DBNet.pytorch/DBNet.pytorch-master/trainer/trainer.py", line 46, in _train_epoch
    for i, batch in enumerate(self.train_loader):
  File "/data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "/data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "/data_1/Yang/software_install/Anaconda1105/envs/dbnet/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
TypeError: 'NoneType' object is not callable

Someone has answered this question on GitHub: https://github.com/WenmuZhou/DBNet.pytorch/issues/4
In DBNet.pytorch-master/data_loader/__init__.py, line 74:
    if 'collate_fn' not in config['loader'] or config['loader']['collate_fn'] is None or len(config['loader']['collate_fn']) == 0:
        # config['loader']['collate_fn'] = None  # original line: None gets passed straight into DataLoader, so change it to the line below
        # (needs `import torch` at the top of the file)
        config['loader']['collate_fn'] = torch.utils.data.dataloader.default_collate
    else:
        config['loader']['collate_fn'] = eval(config['loader']['collate_fn'])()

    _dataset = get_dataset(data_path=data_path, module_name=dataset_name, transform=img_transfroms, dataset_args=dataset_args)
    sampler = None
    if distributed:
        from torch.utils.data.distributed import DistributedSampler
        # 3) use DistributedSampler
        sampler = DistributedSampler(_dataset)
        config['loader']['shuffle'] = False
        config['loader']['pin_memory'] = True
    loader = DataLoader(dataset=_dataset, sampler=sampler, **config['loader'])
    return loader
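The idea behind the fix is that the loader config should never hand an explicit None to DataLoader as collate_fn; it should fall back to a real callable. The sketch below isolates that fallback logic without needing PyTorch. It is not the repository's code: resolve_collate_fn, the registry dict and the stand-in classes are all hypothetical, and a name-to-class registry replaces the eval() call used in the original.

```python
def resolve_collate_fn(loader_cfg, default_collate, registry):
    """Pick the collate function for a loader config.

    loader_cfg      -- dict, may or may not contain a 'collate_fn' name
    default_collate -- callable used when no name is configured
    registry        -- maps configured names to collate classes
    """
    name = loader_cfg.get("collate_fn")
    if not name:  # key missing, None, or empty string
        return default_collate
    return registry[name]()  # instantiate the configured collate class


# Stand-ins so the sketch runs on its own
def fake_default_collate(batch):
    return batch


class PadCollate:
    def __call__(self, batch):
        return ("padded", batch)


registry = {"PadCollate": PadCollate}

chosen_default = resolve_collate_fn({"collate_fn": None}, fake_default_collate, registry)
chosen_custom = resolve_collate_fn({"collate_fn": "PadCollate"}, fake_default_collate, registry)
print(chosen_default([1, 2]), chosen_custom([1, 2]))
```

In the real data loader, torch.utils.data.dataloader.default_collate would be passed as default_collate, as in the patched line above.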

With this change, retraining works!
Next I will get training going and try it on my own data.

Origin www.cnblogs.com/yanghailin/p/12209685.html