A super-detailed guide to training DBNet
Paper link: https://arxiv.org/pdf/1911.08947.pdf
Project link: https://github.com/WenmuZhou/DBNet.pytorch
Project Introduction
DBNet is a segmentation-based text detection method that is well suited to detecting text of various shapes in natural scenes.
-
Network structure: see the architecture figure in the DB paper.
-
Import: After downloading the code, unzip it and import it into PyCharm.
-
Modify part of the code:
Due to typos by the author, a few modifications need to be made to the code:
3.1 Change '--save_resut' to '--save_result'.
3.2 Training will also fail with a No module named 'torchvision.models.utils' error, so change from torchvision.models.utils import load_state_dict_from_url to from torch.hub import load_state_dict_from_url.
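For clarity, the import fix looks like this (a minimal before/after sketch; the checkpoint URL in the usage line is torchvision's standard ResNet-18 weight file, shown purely as an illustration):

# Before (fails on newer torchvision, which removed this module):
# from torchvision.models.utils import load_state_dict_from_url

# After (torch.hub provides the same helper in recent PyTorch releases):
from torch.hub import load_state_dict_from_url

# Illustrative usage, mirroring how a backbone might load pretrained weights:
state_dict = load_state_dict_from_url(
    'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    progress=True,
)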
-
Environment installation
Open the README and find the following command:
pip install -r requirement.txt
Enter it in the terminal and press Enter. However, I found it faster to install the packages one by one: open requirement.txt and run the corresponding command for each package in the terminal.
For example: pip install addict -i https://mirrors.aliyun.com/pypi/simple/
Press Enter and the package installs.
-
Dataset: This article uses the ICDAR 2015 dataset.
Link: https://rrc.cvc.uab.es/?ch=4&com=tasks
Select Task 4.1: Text Localization and download it.
Note that you need to register with your email address first.
Unzip the archives into the corresponding folders and place them in the project as follows:
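The original layout screenshot is not reproduced here; based on the paths used in the data-processing script below, the expected structure is roughly:

DBNet/
└── datasets/
    ├── train/
    │   ├── img/ch4_training_images/
    │   └── gt/ch4_training_localization_transcription_gt/
    └── test/
        ├── img/ch4_test_images/
        └── gt/Challenge4_Test_Task1_GT/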
-
Data processing: The author processes the data under Ubuntu, but I am using Windows, so I use the following script to generate the train/test list files instead:
import os


def get_images(img_path):
    '''
    find image files in data path
    :return: list of files found
    '''
    files = []
    exts = ['jpg', 'png', 'jpeg', 'JPG', 'PNG']
    for parent, dirnames, filenames in os.walk(img_path):
        for filename in filenames:
            for ext in exts:
                if filename.endswith(ext):
                    files.append(os.path.join(parent, filename))
                    break
    print('Find {} images'.format(len(files)))
    return sorted(files)


def get_txts(txt_path):
    '''
    find gt files in data path
    :return: list of files found
    '''
    files = []
    exts = ['txt']
    for parent, dirnames, filenames in os.walk(txt_path):
        for filename in filenames:
            for ext in exts:
                if filename.endswith(ext):
                    files.append(os.path.join(parent, filename))
                    break
    print('Find {} txts'.format(len(files)))
    return sorted(files)


if __name__ == '__main__':
    img_train_path = r'F:\apy\DBNet\datasets\train\img\ch4_training_images'
    img_test_path = r'F:\apy\DBNet\datasets\test\img\ch4_test_images'
    train_files = get_images(img_train_path)
    test_files = get_images(img_test_path)

    txt_train_path = r'F:\apy\DBNet\datasets\train\gt\ch4_training_localization_transcription_gt'
    txt_test_path = r'F:\apy\DBNet\datasets\test\gt\Challenge4_Test_Task1_GT'
    train_txts = get_txts(txt_train_path)
    test_txts = get_txts(txt_test_path)

    n_train = len(train_files)
    n_test = len(test_files)
    # Every image must have exactly one matching ground-truth file.
    assert len(train_files) == len(train_txts) and len(test_files) == len(test_txts)

    # Write tab-separated "image_path<TAB>gt_path" lines for training and testing.
    with open('./datasets/train.txt', 'w') as f:
        for i in range(n_train):
            line = train_files[i] + '\t' + train_txts[i] + '\n'
            f.write(line)
    with open('./datasets/test.txt', 'w') as f:
        for i in range(n_test):
            line = test_files[i] + '\t' + test_txts[i] + '\n'
            f.write(line)
Remember to modify the file paths in the script to match your own directory layout.
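Each line of the generated train.txt/test.txt pairs an image with its ground-truth file, separated by a tab. With the paths above, a line would look something like this (shown purely as an illustration):

F:\apy\DBNet\datasets\train\img\ch4_training_images\img_1.jpg	F:\apy\DBNet\datasets\train\gt\ch4_training_localization_transcription_gt\gt_img_1.txt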
-
Training:
First configure the corresponding parameters in config/icdar2015_resnet18_FPN_DBhead_polyLR.yaml, as sketched below.
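I have not copied the stock config verbatim; as a rough sketch, assuming the repository's config layout, the fields you typically need to point at your data are the list files generated above (the exact nesting and key names may differ in your copy):

dataset:
  train:
    dataset:
      args:
        data_path:
          - ./datasets/train.txt   # list file generated above
  validate:
    dataset:
      args:
        data_path:
          - ./datasets/test.txt    # list file generated above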
Then enter in the terminal:
python tools/train.py --config_file "config/icdar2015_resnet18_FPN_DBhead_polyLR.yaml"
Training takes about three days.
-
Test: Modify the corresponding parameters in tools/predict.py and run it; an example invocation is sketched below.
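If you prefer to pass the parameters on the command line instead of editing the defaults in the file, a hypothetical invocation looks like this (--save_result is the flag whose spelling we fixed earlier; the other argument names and paths are assumptions based on my copy of the repository):

python tools/predict.py --model_path ./output/model_best.pth --input_folder ./datasets/test/img/ch4_test_images --output_folder ./output/result --show --save_result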
The result is as follows: [Figure: detection results visualized on the ICDAR 2015 test images]
That's the complete training and testing process. I hope it helps!
Reference link:
[1]: https://bbs.huaweicloud.com/blogs/345205
[2]: https://github.com/WenmuZhou/DBNet.pytorch
[3]: https://rrc.cvc.uab.es/?ch=4&com=tasks