Day03 - Convolutional neural networks: theory and use

Assignment description

Today's hands-on project is "license plate recognition" based on the convolutional neural network LeNet.

Assignment requirements:

  • 1. Following what was covered in class, build the LeNet model and get it to run end to end. On that basis, try other network configurations.
  • 2. Think about and try out parameter tuning and optimization to improve the accuracy on the test set.

Links to the courseware and the data set can be found in the preface post, the introduction written before the formal course started.

Everything we need is in the day03 folder: characterData.zip is the data set we will use, and CarID.png is the picture used to test the final result.

Sample Code

  1. Import the required packages
import numpy as np
import paddle as paddle
import paddle.fluid as fluid
from PIL import Image
import cv2
import matplotlib.pyplot as plt
import os
from multiprocessing import cpu_count
from paddle.fluid.dygraph import Pool2D,Conv2D
# from paddle.fluid.dygraph import FC
from paddle.fluid.dygraph import Linear
  2. Generate the lists of license plate character images
data_path = '/home/aistudio/data'
character_folders = os.listdir(data_path)
label = 0
LABEL_temp = {}
if(os.path.exists('./train_data.list')):
    os.remove('./train_data.list')
if(os.path.exists('./test_data.list')):
    os.remove('./test_data.list')
for character_folder in character_folders:
    with open('./train_data.list', 'a') as f_train:
        with open('./test_data.list', 'a') as f_test:
            if character_folder == '.DS_Store' or character_folder == '.ipynb_checkpoints' or character_folder == 'data23617':
                continue
            print(character_folder + " " + str(label))
            LABEL_temp[str(label)] = character_folder  # store the mapping from label index to folder name
            character_imgs = os.listdir(os.path.join(data_path, character_folder))
            for i in range(len(character_imgs)):
                if i%10 == 0: 
                    f_test.write(os.path.join(os.path.join(data_path, character_folder), character_imgs[i]) + "\t" + str(label) + '\n')
                else:
                    f_train.write(os.path.join(os.path.join(data_path, character_folder), character_imgs[i]) + "\t" + str(label) + '\n')
    label = label + 1
print('Image lists generated')
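
The split rule used in this step (every tenth image goes to the test list, the rest to the training list) can be looked at in isolation. What follows is only a minimal sketch, not the course code; the helper `split_every_tenth` and the file names are hypothetical.

```python
def split_every_tenth(items):
    """Send items at indices 0, 10, 20, ... to the test split, the rest to training."""
    train, test = [], []
    for i, item in enumerate(items):
        (test if i % 10 == 0 else train).append(item)
    return train, test


# Hypothetical file names standing in for the contents of one character folder.
images = ["img_{:03d}.png".format(i) for i in range(25)]
train_items, test_items = split_every_tenth(images)
print(len(train_items), len(test_items))
```

Since os.listdir returns entries in arbitrary order, sorting the file names before splitting would make the split reproducible between runs.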
  3. Use the image lists generated in the previous step to define readers for the license plate character training set and test set
def data_mapper(sample):
    img, label = sample
    img = paddle.dataset.image.load_image(file=img, is_color=False)
    img = img.flatten().astype('float32') / 255.0
    return img, label
def data_reader(data_list_path):
    def reader():
        with open(data_list_path, 'r') as f:
            lines = f.readlines()
            for line in lines:
                img, label = line.split('\t')
                yield img, int(label)
    return paddle.reader.xmap_readers(data_mapper, reader, cpu_count(), 1024)
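
To see what the mapper does to a single image without touching Paddle, here is a small NumPy-only sketch. It assumes a 20 x 20 grayscale image stored as uint8; the synthetic array and the helper name `normalize_pixels` are made up for illustration.

```python
import numpy as np


def normalize_pixels(img):
    """Flatten a uint8 grayscale image and scale it to float32 values in [0, 1]."""
    return img.flatten().astype('float32') / 255.0


# Synthetic 20x20 image (an assumption standing in for a loaded character image).
fake_img = np.full((20, 20), 255, dtype=np.uint8)
fake_img[0, 0] = 0

flat = normalize_pixels(fake_img)
restored = flat.reshape(1, 20, 20)  # the layout the training loop later feeds to the network
print(flat.shape, flat.dtype, restored.shape)
```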
  4. Create the data providers
# data provider for training
train_reader = paddle.batch(reader=paddle.reader.shuffle(reader=data_reader('./train_data.list'), buf_size=512), batch_size=128)
# data provider for testing
test_reader = paddle.batch(reader=data_reader('./test_data.list'), batch_size=128)
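
Conceptually, a batching reader just groups consecutive samples into lists. The pure-Python sketch below illustrates that idea only; it is not Paddle's implementation, and `batch` and `toy_reader` are hypothetical names.

```python
def batch(reader, batch_size):
    """Group samples from a reader into lists of at most batch_size items."""
    def batched_reader():
        current = []
        for sample in reader():
            current.append(sample)
            if len(current) == batch_size:
                yield current
                current = []
        if current:  # keep the final, smaller batch
            yield current
    return batched_reader


def toy_reader():
    # Yields (fake image, fake label) pairs.
    for i in range(10):
        yield (i, i % 3)


batches = list(batch(toy_reader, 4)())
print([len(b) for b in batches])
```

The last batch can be smaller than the others, which matters later when per-batch accuracies are averaged.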
  5. Define the network (code to be completed)

class MyLeNet(fluid.dygraph.Layer):
    def __init__(self):
        super(MyLeNet,self).__init__()
        self.hidden1_1 = Conv2D()
        self.hidden1_2 = Pool2D()
        self.hidden2_1 = Conv2D()
        self.hidden2_2 = Pool2D()
        self.hidden3 = Conv2D()
        self.hidden4 = Linear()
    
    def forward(self,input):
    
    
    
    
        return y
  6. Train in dynamic graph mode
with fluid.dygraph.guard():
    model=MyLeNet()  # instantiate the model
    model.train()  # training mode
    opt=fluid.optimizer.SGDOptimizer(learning_rate=0.001, parameter_list=model.parameters())  # SGD (stochastic gradient descent) optimizer with learning rate 0.001
    epochs_num=200  # train for 200 epochs
    
    for pass_num in range(epochs_num):
        
        for batch_id,data in enumerate(train_reader()):
            images=np.array([x[0].reshape(1,20,20) for x in data],np.float32)
            labels = np.array([x[1] for x in data]).astype('int64')
            labels = labels[:, np.newaxis]
            image=fluid.dygraph.to_variable(images)
            label=fluid.dygraph.to_variable(labels)
            
            predict=model(image)  # forward pass

            loss=fluid.layers.cross_entropy(predict,label)
            avg_loss=fluid.layers.mean(loss)  # mean loss over the batch

            acc=fluid.layers.accuracy(predict,label)  # batch accuracy
            
            if batch_id!=0 and batch_id%50==0:
                print("train_pass:{},batch_id:{},train_loss:{},train_acc:{}".format(pass_num,batch_id,avg_loss.numpy(),acc.numpy()))
            
            avg_loss.backward()
            opt.minimize(avg_loss)
            model.clear_gradients()            
            
    fluid.save_dygraph(model.state_dict(),'MyLeNet')  # save the model parameters
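
The two quantities printed during training, the mean cross-entropy loss and the batch accuracy, can be reproduced with plain NumPy. This is a sketch of the underlying arithmetic, assuming the network output is already a row of class probabilities (softmax output); it is not the Paddle implementation, and the numbers are made up.

```python
import numpy as np


def cross_entropy(probs, labels):
    """Per-sample negative log-probability of the true class.

    probs: (N, C) rows of class probabilities; labels: (N, 1) integer class ids.
    """
    picked = probs[np.arange(len(labels)), labels[:, 0]]
    return -np.log(picked)


def accuracy(probs, labels):
    """Fraction of samples whose most probable class equals the label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels[:, 0]))


# Made-up probabilities for three samples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4]], dtype=np.float32)
labels = np.array([0, 1, 0])[:, np.newaxis]  # same (N, 1) shape as in the training loop

avg_loss = float(np.mean(cross_entropy(probs, labels)))
print(avg_loss, accuracy(probs, labels))
```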
  7. Validate the model
with fluid.dygraph.guard():
    accs = []
    model=MyLeNet()  # instantiate the model
    model_dict,_=fluid.load_dygraph('MyLeNet')
    model.load_dict(model_dict)  # load the model parameters
    model.eval()  # evaluation mode
    for batch_id,data in enumerate(test_reader()):  # iterate over the test set
        images=np.array([x[0].reshape(1,20,20) for x in data],np.float32)
        labels = np.array([x[1] for x in data]).astype('int64')
        labels = labels[:, np.newaxis]
            
        image=fluid.dygraph.to_variable(images)
        label=fluid.dygraph.to_variable(labels)
            
        predict=model(image)  # forward pass
        acc=fluid.layers.accuracy(predict,label)
        accs.append(acc.numpy()[0])
        avg_acc = np.mean(accs)
    print(avg_acc)
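
The validation loop averages the per-batch accuracies directly. When the last batch is smaller than the others, that plain mean gives its samples extra weight. A size-weighted average avoids this; the sketch below uses hypothetical numbers and a hypothetical helper name.

```python
import numpy as np


def weighted_accuracy(batch_accs, batch_sizes):
    """Overall accuracy from per-batch accuracies, weighting each batch by its size."""
    accs = np.asarray(batch_accs, dtype=np.float64)
    sizes = np.asarray(batch_sizes, dtype=np.float64)
    return float(np.sum(accs * sizes) / np.sum(sizes))


# Hypothetical numbers: two full batches of 128 samples and a final batch of 16.
accs = [0.90, 0.92, 0.50]
sizes = [128, 128, 16]
print(np.mean(accs), weighted_accuracy(accs, sizes))
```

In the loop above, the batch size would be available as len(data) for each batch.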
  8. Process the license plate image: split out each character and save it
license_plate = cv2.imread('CarID.png')
gray_plate = cv2.cvtColor(license_plate, cv2.COLOR_RGB2GRAY)
ret, binary_plate = cv2.threshold(gray_plate, 175, 255, cv2.THRESH_BINARY)
result = []
for col in range(binary_plate.shape[1]):
    result.append(0)
    for row in range(binary_plate.shape[0]):
        result[col] = result[col] + binary_plate[row][col]/255
character_dict = {}
num = 0
i = 0
while i < len(result):
    if result[i] == 0:
        i += 1
    else:
        index = i + 1
        while index < len(result) and result[index] != 0:  # stop at the right edge of the image
            index += 1
        character_dict[num] = [i, index-1]
        num += 1
        i = index

for i in range(8):
    if i==2:
        continue
    padding = (170 - (character_dict[i][1] - character_dict[i][0])) / 2
    ndarray = np.pad(binary_plate[:,character_dict[i][0]:character_dict[i][1]], ((0,0), (int(padding), int(padding))), 'constant', constant_values=(0,0))
    ndarray = cv2.resize(ndarray, (20,20))
    cv2.imwrite('./' + str(i) + '.png', ndarray)
    
def load_image(path):
    img = paddle.dataset.image.load_image(file=path, is_color=False)
    img = img.astype('float32')
    img = img[np.newaxis, ] / 255.0
    return img
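
The segmentation in this step is a column projection: count the white pixels in every column, then treat each run of non-empty columns as one character. Here is a compact NumPy sketch of that idea on a synthetic image; `segment_columns` and the toy plate are assumptions for illustration, not the course code.

```python
import numpy as np


def segment_columns(binary):
    """Return [start, end] column ranges (inclusive) that contain white pixels."""
    col_sums = (binary > 0).sum(axis=0)  # white-pixel count per column
    segments = []
    i = 0
    while i < len(col_sums):
        if col_sums[i] == 0:
            i += 1
            continue
        j = i
        # Extend the run while the next column exists and is non-empty.
        while j + 1 < len(col_sums) and col_sums[j + 1] != 0:
            j += 1
        segments.append([i, j])
        i = j + 1
    return segments


# Synthetic 5x12 binary "plate" with three white blobs.
plate = np.zeros((5, 12), dtype=np.uint8)
plate[1:4, 1:3] = 255
plate[0:5, 5:8] = 255
plate[2:3, 10:12] = 255
print(segment_columns(plate))
```

The bounds check in the inner loop keeps the scan from running past the last column when a character touches the right edge of the image.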
  9. Convert the labels
print('Label:',LABEL_temp)
match = {'A':'A','B':'B','C':'C','D':'D','E':'E','F':'F','G':'G','H':'H','I':'I','J':'J','K':'K','L':'L','M':'M','N':'N',
        'O':'O','P':'P','Q':'Q','R':'R','S':'S','T':'T','U':'U','V':'V','W':'W','X':'X','Y':'Y','Z':'Z',
        'yun':'云','cuan':'川','hei':'黑','zhe':'浙','ning':'宁','jin':'津','gan':'赣','hu':'沪','liao':'辽','jl':'吉','qing':'青','zang':'藏',
        'e1':'鄂','meng':'蒙','gan1':'甘','qiong':'琼','shan':'陕','min':'闽','su':'苏','xin':'新','wan':'皖','jing':'京','xiang':'湘','gui':'贵',
        'yu1':'渝','yu':'豫','ji':'冀','yue':'粤','gui1':'桂','sx':'晋','lu':'鲁',
        '0':'0','1':'1','2':'2','3':'3','4':'4','5':'5','6':'6','7':'7','8':'8','9':'9'}
L = 0
LABEL ={}

for V in LABEL_temp.values():
    LABEL[str(L)] = match[V]
    L += 1
print(LABEL)
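
The conversion above walks through LABEL_temp.values() with a separate counter, which relies on the dictionary keeping insertion order. Mapping by key makes the same conversion independent of that order. A minimal sketch, with a hypothetical helper name and a small hypothetical subset of the tables:

```python
def build_label_map(folder_by_index, display_names):
    """Map each class index (as a string) to the character shown to the user."""
    return {index: display_names[folder] for index, folder in folder_by_index.items()}


# Hypothetical subset of the folder-name table used above.
folder_by_index = {'0': 'A', '1': 'lu', '2': '8'}
display_names = {'A': 'A', 'lu': '鲁', '8': '8'}
print(build_label_map(folder_by_index, display_names))
```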
  10. Run prediction in dynamic graph mode
with fluid.dygraph.guard():
    model=MyLeNet()  # instantiate the model
    model_dict,_=fluid.load_dygraph('MyLeNet')
    model.load_dict(model_dict)  # load the model parameters
    model.eval()  # evaluation mode
    lab=[]
    for i in range(8):
        if i==2:
            continue
        infer_imgs = []
        infer_imgs.append(load_image('./' + str(i) + '.png'))
        infer_imgs = np.array(infer_imgs)
        infer_imgs = fluid.dygraph.to_variable(infer_imgs)
        result=model(infer_imgs)
        lab.append(np.argmax(result.numpy()))
# print(lab)


display(Image.open('CarID.png'))
print('\nLicense plate recognition result: ',end='')
for i in range(len(lab)):
    print(LABEL[str(lab[i])],end='')
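
The last two steps of prediction, taking the argmax of each output row and looking the index up in the label table, can be tried without a trained model. The sketch below uses made-up probabilities and a three-entry label table; `decode` is a hypothetical helper.

```python
import numpy as np


def decode(prob_rows, label_map):
    """Pick the most probable class per row and join the mapped characters."""
    indices = [int(np.argmax(row)) for row in prob_rows]
    return ''.join(label_map[str(i)] for i in indices)


# Made-up label table and network outputs for three characters.
label_map = {'0': 'A', '1': '鲁', '2': '8'}
outputs = np.array([[0.1, 0.8, 0.1],
                    [0.9, 0.05, 0.05],
                    [0.2, 0.1, 0.7]])
print(decode(outputs, label_map))
```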

Completing the assignment

Defining the LeNet network:

We use three convolution layers, two pooling layers, and one linear layer. The activation function of the output layer is still the 'Softmax' classification function.

For the parameters of the convolution and pooling layers, you can refer to this blog post on the meaning of convolutional neural network parameters.

class MyLeNet(fluid.dygraph.Layer):
    def __init__(self):
        super(MyLeNet,self).__init__()
        self.hidden1_1 = Conv2D(1, 28, 5, 1)    # input channels, number of filters, filter size, stride
        self.hidden1_2 = Pool2D(pool_size=2, pool_type='max', pool_stride=1)
        self.hidden2_1 = Conv2D(28, 32, 3, 1)
        self.hidden2_2 = Pool2D(pool_size=2, pool_type='max', pool_stride=1)
        self.hidden3 = Conv2D(32, 32, 3, 1)
        self.hidden4 = Linear(32*10*10, 65, act='softmax')
    
    def forward(self,input):
        x = self.hidden1_1(input)
        x = self.hidden1_2(x)
        x = self.hidden2_1(x)
        x = self.hidden2_2(x)
        x = self.hidden3(x)
        x = fluid.layers.reshape(x, shape=[-1, 32*10*10])
        y = self.hidden4(x)
        return y
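
As a rough check on the size of this network, the trainable parameters can be counted by hand. The sketch below assumes every convolution and linear layer has one bias per output channel or unit, and that pooling layers have no parameters; the helper names are hypothetical.

```python
def conv_params(in_channels, out_channels, kernel):
    """Weights plus one bias per filter for a square convolution kernel."""
    return out_channels * in_channels * kernel * kernel + out_channels


def linear_params(in_features, out_features):
    """Weights plus one bias per output unit."""
    return in_features * out_features + out_features


counts = {
    "hidden1_1": conv_params(1, 28, 5),
    "hidden2_1": conv_params(28, 32, 3),
    "hidden3": conv_params(32, 32, 3),
    "hidden4": linear_params(32 * 10 * 10, 65),
}
total = sum(counts.values())
print(counts, total)
```

Under these assumptions, almost all parameters sit in the final linear layer, because the feature map is still 10 x 10 when it is flattened.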

The input pictures all have the same size, 1 * 20 * 20. In the study group we discussed how the sizes at each layer are calculated:

Layer                                           Output
Input layer (20 * 20 * 1)                       1 * 20 * 20
Convolution layer 1 (28 * 5 * 5, stride = 1)    28 * 16 * 16  (16 = (20 - 5) / 1 + 1)
Pooling layer 1 (2 * 2, stride = 1)             28 * 15 * 15  (15 = (16 - 2) / 1 + 1)
Convolution layer 2 (32 * 3 * 3, stride = 1)    32 * 13 * 13  (13 = (15 - 3) / 1 + 1)
Pooling layer 2 (2 * 2, stride = 1)             32 * 12 * 12  (12 = (13 - 2) / 1 + 1)
Convolution layer 3 (32 * 3 * 3, stride = 1)    32 * 10 * 10  (10 = (12 - 3) / 1 + 1)
Linear layer 4 (32 * 10 * 10 -> 65)             65
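
The size arithmetic in the table follows one formula, output = (input + 2 * padding - kernel) / stride + 1, applied layer by layer. A short sketch that replays it, assuming stride 1 and no padding throughout as in the table; `conv_out` is a hypothetical helper name.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling window over a square input."""
    return (size + 2 * padding - kernel) // stride + 1


# (layer name, window size), in the order used by the network.
layers = [("conv1", 5), ("pool1", 2), ("conv2", 3), ("pool2", 2), ("conv3", 3)]

size = 20
sizes = []
for name, kernel in layers:
    size = conv_out(size, kernel)
    sizes.append(size)
    print(name, size)

flattened = 32 * size * size  # input width of the linear layer
print(flattened)
```

Changing a stride or kernel size in the network changes the final feature map, so the 32*10*10 in the reshape and in the linear layer has to be recomputed the same way.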

If you are on the Baidu AI Studio platform, the code runs as soon as the missing parts are filled in.

If you copy the code to run it locally, one more change is needed. In step 10, replace display(Image.open('CarID.png')) with

im = Image.open('CarID.png')
im.show()

Convolutional neural networks really are more powerful than deep fully connected networks, and LeNet is a classic CNN model. After 20 iterations the accuracy reached 0.84, whereas the DNN from Day02 needed 200 iterations to reach that accuracy. When tested with the CarID.png picture, the result was mostly correct, with only a small error.

Continuing the training, LeNet reached an accuracy of 0.92 after 50 iterations, which is already quite high, and 0.945 after 100 iterations, which is enough to recognize the license plate well. After 200 iterations the accuracy was 0.965. The gain from the extra 100 iterations is small, so tuning other parameters is probably the better next step.

Tested with the license plate picture, the result is completely correct. This LeNet network really is powerful.

Of course, we can also keep adding convolution and pooling layers and train again. A deeper network may reach higher accuracy, but take care to prevent overfitting.

Origin blog.csdn.net/qq_42582489/article/details/105317193