A detailed record of training and testing DBNet (PyTorch) in PyCharm


Paper link: https://arxiv.org/pdf/1911.08947.pdf
Project link: https://github.com/WenmuZhou/DBNet.pytorch

Project Introduction

DBNet is a segmentation-based text detection method that can robustly detect text of various shapes in natural scenes.

  1. Network structure

  2. Import: after downloading the code, unzip it and open it as a project in PyCharm.

  3. Modify part of the code:
    Because of typos in the author's code, a few changes are needed:
    3.1 Change '--save_resut' to '--save_result'.
    3.2 Training may also fail with "No module named 'torchvision.models.utils'" on newer torchvision versions. Change
    from torchvision.models.utils import load_state_dict_from_url
    to
    from torch.hub import load_state_dict_from_url

  4. Environment installation
    Open the README and find the following command:
    pip install -r requirement.txt
    Enter it in the terminal and press Enter. If the bulk install is slow or fails, it can be faster to install the packages one by one: open requirement.txt and run the corresponding command for each entry, for example:
    pip install addict -i https://mirrors.aliyun.com/pypi/simple/
    Press Enter to install.
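The one-by-one installs can also be scripted. A small dry-run sketch (the two package names and the temp file path are illustrative; use the entries from the repo's real requirement.txt, and drop the echo to actually install):

```shell
# Write an illustrative two-entry requirement list, then print the
# per-package install command for each entry (dry run).
printf 'addict\ntqdm\n' > /tmp/req_demo.txt
while read -r pkg; do
    echo "pip install $pkg -i https://mirrors.aliyun.com/pypi/simple/"
done < /tmp/req_demo.txt
```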

  5. Dataset: this walkthrough uses the ICDAR 2015 dataset.
    Link: https://rrc.cvc.uab.es/?ch=4&com=tasks
    Select Task 4.1: Text Localization and download it (registration with an email address is required).
    Unzip the archives and place them in the project as follows (the folder names match the paths used in the script below):

    datasets/
    ├── train/
    │   ├── img/ch4_training_images
    │   └── gt/ch4_training_localization_transcription_gt
    └── test/
        ├── img/ch4_test_images
        └── gt/Challenge4_Test_Task1_GT

  6. Data processing: the author prepares the data under Ubuntu; since I am on Windows, the following script generates the train/test list files instead:

import os

def get_images(img_path):
    '''
    Find image files under the given path.
    :return: sorted list of image file paths
    '''
    files = []
    exts = ('jpg', 'png', 'jpeg', 'JPG', 'PNG')
    for parent, dirnames, filenames in os.walk(img_path):
        for filename in filenames:
            if filename.endswith(exts):
                files.append(os.path.join(parent, filename))
    print('Found {} images'.format(len(files)))
    return sorted(files)

def get_txts(txt_path):
    '''
    Find ground-truth (gt) txt files under the given path.
    :return: sorted list of gt file paths
    '''
    files = []
    for parent, dirnames, filenames in os.walk(txt_path):
        for filename in filenames:
            if filename.endswith('txt'):
                files.append(os.path.join(parent, filename))
    print('Found {} txts'.format(len(files)))
    return sorted(files)

if __name__ == '__main__':
    img_train_path = r'F:\apy\DBNet\datasets\train\img\ch4_training_images'
    img_test_path = r'F:\apy\DBNet\datasets\test\img\ch4_test_images'
    train_files = get_images(img_train_path)
    test_files = get_images(img_test_path)

    txt_train_path = r'F:\apy\DBNet\datasets\train\gt\ch4_training_localization_transcription_gt'
    txt_test_path = r'F:\apy\DBNet\datasets\test\gt\Challenge4_Test_Task1_GT'
    train_txts = get_txts(txt_train_path)
    test_txts = get_txts(txt_test_path)
    n_train = len(train_files)
    n_test = len(test_files)
    assert len(train_files) == len(train_txts) and len(test_files) == len(test_txts)
    with open('./datasets/train.txt', 'w') as f:
        for i in range(n_train):
            line = train_files[i] + '\t' + train_txts[i] + '\n'
            f.write(line)
    with open('./datasets/test.txt', 'w') as f:
        for i in range(n_test):
            line = test_files[i] + '\t' + test_txts[i] + '\n'
            f.write(line)
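As a quick sanity check, the matching logic of get_images can be exercised on a throwaway directory (the file names below are illustrative):

```python
import os
import tempfile

def get_images(img_path):
    '''Same matching logic as the script above.'''
    exts = ('jpg', 'png', 'jpeg', 'JPG', 'PNG')
    files = []
    for parent, dirnames, filenames in os.walk(img_path):
        for filename in filenames:
            if filename.endswith(exts):
                files.append(os.path.join(parent, filename))
    return sorted(files)

# Build a throwaway directory with two images and one non-image file;
# only the two image files should be returned.
with tempfile.TemporaryDirectory() as d:
    for name in ('img_1.jpg', 'img_2.png', 'notes.txt'):
        open(os.path.join(d, name), 'w').close()
    found = [os.path.basename(p) for p in get_images(d)]

print(found)  # ['img_1.jpg', 'img_2.png']
```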

Then update the corresponding file paths in the code to match your own setup.

  7. Training:
    First configure the corresponding parameters in config/icdar2015_resnet18_FPN_DBhead_polyLR.yaml.
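A hypothetical fragment of that config is sketched below (key names may differ between repo versions); the important part is pointing data_path at the generated list files:

```yaml
dataset:
  train:
    dataset:
      args:
        data_path:
          - ./datasets/train.txt
  validate:
    dataset:
      args:
        data_path:
          - ./datasets/test.txt
```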
    Then enter in the terminal:
    python tools/train.py --config_file "config/icdar2015_resnet18_FPN_DBhead_polyLR.yaml"
    Training took about three days on my machine.

  8. Testing: modify the corresponding parameters in tools/predict.py (typically the trained model path and the input/output folders) and run it to visualize the detection results.
    That is the complete training and testing process. I hope it helps!

Reference link:
[1]: https://bbs.huaweicloud.com/blogs/345205
[2]: https://github.com/WenmuZhou/DBNet.pytorch
[3]: https://rrc.cvc.uab.es/?ch=4&com=tasks

Origin: blog.csdn.net/qq_44961737/article/details/128272399