TensorFlow Getting Started --- CAPTCHA Recognition

Surely we have all had this feeling: we have been learning TensorFlow for a long time, yet we still feel unable to put it to use. Today, follow in my footsteps and learn CAPTCHA recognition with TensorFlow.

The code in this article is borrowed from the post below, with some of its issues fixed. Everyone is welcome to study and discuss!

Links: https://zhuanlan.zhihu.com/p/36979787

Well, without further ado, let's get started!

1. Organize and generate your own dataset

For reading data, the official TensorFlow documentation gives three approaches:

  • Feeding: Python code supplies the data at each step as TensorFlow runs.
  • Reading from files: an input pipeline reads the data from files at the beginning of the TensorFlow graph.
  • Preloaded data: a constant or variable in the TensorFlow graph holds all the data (only suitable for small datasets).

When the amount of data is small, it is generally fine to load all of it into memory and then feed it to the network in mini-batches (tip: this method pairs well with a compact yield generator). When the amount of data is large, however, this approach consumes too much memory, and it is better to use the queues TensorFlow provides, i.e. the second method: reading data from files. For certain formats, such as CSV, the documentation has dedicated descriptions; here we learn a more general and efficient method: TensorFlow's own standard format, TFRecords.
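As a rough sketch of the first approach (my addition, not from the original post; train_x and train_y are placeholder names for arrays already loaded in memory), a compact yield-based batch generator looks like this:

import numpy as np

def batch_generator(images, labels, batch_size):
    # yield successive shuffled mini-batches from arrays held in memory
    idx = np.random.permutation(len(images))
    for start in range(0, len(images), batch_size):
        chosen = idx[start:start + batch_size]
        yield images[chosen], labels[chosen]

# usage inside a session, feeding each batch through feed_dict:
# for batch_imgs, batch_labels in batch_generator(train_x, train_y, 32):
#     sess.run(train_op, feed_dict={train_imgs: batch_imgs, train_labels: batch_labels})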

Here we use the TFRecord approach to produce the dataset.

1) Set up the paths: .../dataset/train (test)/0 (1, 2, 3, 4, 5, 6, 7, 8, 9)

Following this layout, create two folders, train and test, under dataset, and then under each of train and test create ten folders named 0 through 9.

You can change the folder layout yourself once you have read the code.
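A small sketch (my addition) that creates this layout using only the standard library:

import os

for split in ("train", "test"):
    for digit in range(10):
        # creates ./dataset/train/0 ... ./dataset/test/9
        os.makedirs(os.path.join("dataset", split, str(digit)), exist_ok=True)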
2) Use the captcha library to generate CAPTCHA images

Create a new file create_images.py and run it:

from captcha.image import ImageCaptcha


def gen_captcha_text_and_image(path, i, digit):
    # NOTE: the rest of the pipeline assumes 120x120 images (see the
    # tf.reshape in tfrecord_to_image.py). The original post changes the
    # default size in the library source; passing width/height here
    # achieves the same thing.
    image_ = ImageCaptcha(width=120, height=120)
    captcha_text = str(i)
    print(captcha_text)
    # write a CAPTCHA whose text is `digit`, saved as <i>.png
    image_.write(digit, path + captcha_text + '.png', format='png')


def train_images():
    # 2000 images of each digit for the training set
    for i in range(2000):
        for j in range(10):
            path = "./dataset/train/"
            gen_captcha_text_and_image(path + str(j) + '/', i, str(j))


def test_images():
    # 1000 images of each digit for the test set
    for i in range(1000):
        for j in range(10):
            path = "./dataset/test/"
            gen_captcha_text_and_image(path + str(j) + '/', i, str(j))


train_images()
test_images()
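(If the captcha library is missing, it can be installed from PyPI with pip install captcha.)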

After this runs, each of the ten folders 0-9 under the train directory contains 2000 images of the corresponding digit, and each of the ten folders 0-9 under the test directory contains 1000.


3) Generate the TFRecord file, pairing each image with its label

Create a new file Image_to_tfrecord.py and run it.

One thing to remember here: os.listdir() does not sort the files in a directory; it reads them in arbitrary order. So you need to strip the file extension and sort the names numerically, otherwise the images in the training set will not match their labels, and the training would be meaningless.
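A quick illustration (my addition, with made-up file names) of why the numeric key matters:

names = ["0.png", "1.png", "10.png", "2.png"]
print(sorted(names))                             # ['0.png', '1.png', '10.png', '2.png'] - lexicographic
print(sorted(names, key=lambda x: int(x[:-4])))  # ['0.png', '1.png', '2.png', '10.png'] - numeric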

import os
import tensorflow as tf
from PIL import Image

cwd = "./dataset/train/"
classes = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]

writer = tf.python_io.TFRecordWriter("0_to_9_train.tfrecords")  # the file to generate

for index, name in enumerate(classes):
    classes_path = cwd + name + '/'

    # The following lines sort the directory listing numerically; os.listdir()
    # does not sort, so without this the labels would not line up with the
    # images and training would go wrong
    rootpath = os.listdir(classes_path)
    rootpath.sort(key=lambda x: int(x[:-4]))
    for img_name in rootpath:

        print("img_name", img_name)
        img_path = classes_path + img_name  # full path of this image

        print("img_path:", img_path)
        img = Image.open(img_path)

        img_raw = img.tobytes()  # convert the image to raw bytes
        example = tf.train.Example(features=tf.train.Features(feature={
            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
            'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
        }))  # an Example wraps the label together with the image data

        writer.write(example.SerializeToString())  # serialize to a string and write

writer.close()
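Note that the training script below also reads a file called 0_to_9_test.tfrecords, which the code above does not produce. Assuming you keep the same layout, running the script a second time with these two lines changed should generate it:

cwd = "./dataset/test/"
writer = tf.python_io.TFRecordWriter("0_to_9_test.tfrecords")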

4) Read the images and labels back from the TFRecord

Create a new file tfrecord_to_image.py. There is no need to run it; it waits to be called by the main script. There is nothing tricky here.

import tensorflow as tf


def read_and_decode(filename):  # read the tfrecords file
    filename_queue = tf.train.string_input_producer([filename], shuffle=True)  # create a filename queue
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)  # returns the file name and the file contents
    features = tf.parse_single_example(serialized_example,
                                       features={
                                           'label': tf.FixedLenFeature([], tf.int64),
                                           'img_raw': tf.FixedLenFeature([], tf.string),
                                       })  # extract the image data and the label

    img = tf.decode_raw(features['img_raw'], tf.uint8)
    img = tf.reshape(img, [120, 120, 3])
    img = tf.cast(img, tf.float32)  # emit the img tensor into the pipeline
    label = tf.cast(features['label'], tf.int32)  # emit the label tensor into the pipeline

    return img, label
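As a quick sanity check (my addition, not in the original post), you can pull a single decoded example out of the pipeline like this:

import tensorflow as tf
import tfrecord_to_image

img, label = tfrecord_to_image.read_and_decode("0_to_9_train.tfrecords")

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    one_img, one_label = sess.run([img, label])
    print(one_img.shape, one_label)  # expect (120, 120, 3) and a digit from 0 to 9
    coord.request_stop()
    coord.join(threads)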

At this point our preparation is complete.

2. Train a simplified version of the AlexNet network

We will not explain the CNN part of the network here; this article is more about walking through the process of building and training a model.

Note this piece of code:

img_batch, label_batch = tf.train.shuffle_batch([img, label],
                                                batch_size=batch_size, capacity=20000,  # we have 20,000 images here (2,000 per digit), so choose 20000, otherwise some digits would be missing
                                                min_after_dequeue=1)
  • min_after_dequeue - the minimum number of elements left in the queue after a dequeue
  • capacity - the maximum number of elements the queue may hold

Special note: our training set here has twenty thousand images. To make sure every digit gets read into the queue, you need to set capacity to 20,000 or more, or part of the dataset will be missing.
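Rough arithmetic on what that setting costs (my estimate, not from the original post): read_and_decode casts each image to float32 before it enters the queue, so a full queue can hold up to 20000 × 120 × 120 × 3 × 4 bytes ≈ 3.5 GB. That is a large part of why a weak machine struggles, as discussed at the end of this article. With that in mind, the full training script is as follows: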

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import time
import tfrecord_to_image
# https://www.cnblogs.com/cvtoEyes/p/8981994.html
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# https://github.com/yhlleo/tensorflow.cifar10/blob/master/cifar10_input.py


batch_size = 10


def weight_variable(shape):
    return tf.Variable(tf.truncated_normal(shape=shape, stddev=0.1))


def bias_variable(shape):
    # wrap in tf.Variable so the biases are trainable (the original
    # returned a bare constant, which the optimizer cannot update)
    return tf.Variable(tf.constant(0.1, shape=shape))


def conv_2d(input, w):
    return tf.nn.relu(tf.nn.conv2d(input, w, strides=[1, 1, 1, 1], padding='VALID'))


def max_pool(input):
    return tf.nn.max_pool(input, ksize=[1, 5, 5, 1], strides=[1, 2, 2, 1], padding='VALID')


# fully connected layer
def full_connect(input, output_depth):
    input_depth = input.get_shape().as_list()[-1]

    w = weight_variable([input_depth, output_depth])
    b = bias_variable([output_depth])

    fc = tf.nn.bias_add(tf.matmul(input, w), b)
    return tf.nn.relu(fc)


# final fully connected layer (no ReLU: it produces the logits)
def full_connect_final(input, output_depth):
    input_depth = input.get_shape().as_list()[-1]

    fc = tf.nn.bias_add(tf.matmul(input, weight_variable([input_depth, output_depth])), bias_variable([output_depth]))
    return fc

# placeholders for a batch of input images and their labels

train_imgs = tf.placeholder(tf.float32, [batch_size, 120, 120, 3])
train_labels = tf.placeholder(tf.int32, [batch_size])



# layer 1: convolution
con_w1 = weight_variable([11, 11, 3, 64])
net = conv_2d(train_imgs, con_w1)  # -> (batch, 110, 110, 64)
print("conv layer 1:", net)
# layer 2: pooling
net = max_pool(net)  # -> (batch, 53, 53, 64)
print("pool layer 2:", net)
# layer 3: convolution
con_w2 = weight_variable([11, 11, 64, 64])
net = conv_2d(net, con_w2)  # -> (batch, 43, 43, 64)
print("conv layer 3:", net)


# layer 4: convolution
con_w3 = weight_variable([11, 11, 64, 64])
net = conv_2d(net, con_w3)  # -> (batch, 33, 33, 64)
print("conv layer 4:", net)

# layer 5: pooling
net = max_pool(net)  # -> (batch, 15, 15, 64)
print("pool layer 5:", net)

net = tf.reshape(net, [-1, 15*15*64])

# layer 6: fully connected
net = full_connect(net, 384)

# layer 7: fully connected
net = full_connect(net, 192)

# layer 8: fully connected (outputs the 10-class logits)
net = full_connect_final(net, 10)

print("net shape:", net.shape)


train_loss = tf.losses.sparse_softmax_cross_entropy(labels=train_labels, logits=net)
lr = 0.0001
opt = tf.train.AdamOptimizer(lr)
train_op = opt.minimize(train_loss)

predict = tf.argmax(net, axis=-1, output_type=tf.int32)

train_acc = tf.reduce_mean(tf.cast(tf.equal(predict, train_labels), tf.float32))


img, label = tfrecord_to_image.read_and_decode("0_to_9_train.tfrecords")
img_test, label_test = tfrecord_to_image.read_and_decode("0_to_9_test.tfrecords")

# shuffle_batch randomly shuffles the inputs
img_batch, label_batch = tf.train.shuffle_batch([img, label],
                                                batch_size=batch_size, capacity=20000,  # we have 20,000 images here (2,000 per digit), so choose 20000, otherwise some digits would be missing
                                                min_after_dequeue=1)
print("img.shape", img_batch.shape)
print("label.shape", label_batch.shape)
# img_test, label_test = tf.train.shuffle_batch([img_test, label_test],
#                                                 batch_size=batch_size, capacity=6000,
#                                                 min_after_dequeue=1000)

init = tf.global_variables_initializer()

sess = tf.InteractiveSession()

# gragh_writer = tf.summary.FileWriter('.', sess.graph)

sess.run(init)

coord = tf.train.Coordinator()  # create a coordinator to manage the threads
threads = tf.train.start_queue_runners(coord=coord)  # start the QueueRunners; at this point the filename queue has been filled


for i in range(2000):
    batch0, batch1 = sess.run([img_batch, label_batch])
    sess.run(train_op, feed_dict={train_imgs: batch0, train_labels: batch1})
    if i % 10 == 0:
        get_train_loss = sess.run(train_loss, feed_dict={train_imgs: batch0, train_labels: batch1})
        train_accuy = sess.run(train_acc, feed_dict={train_imgs: batch0, train_labels: batch1})
        print("train_loss", get_train_loss)
        print("STEP %d Accuracy %g" % (i, train_accuy))
        print(sess.run(predict, feed_dict={train_imgs: batch0, train_labels: batch1}))
        print(batch1)
        fig, axes = plt.subplots(ncols=5, nrows=2)
        for ind, (image, label) in enumerate(zip(batch0, batch1)):
            image = image / 255.
            row = ind // 5
            col = ind % 5
            axes[row][col].imshow(image, cmap='gray')  # cmap is ignored for RGB images
            axes[row][col].axis('off')
            axes[row][col].set_title('%d' % label)
        plt.show()

coord.request_stop()  # cleanly stop the queue-runner threads when done
coord.join(threads)

Display: the loss, accuracy, and predictions are printed, and a window shows a grid of the batch images with their labels.

To check whether the reading succeeded, I plotted a portion of the batch here. If you want to do real training, please cut out the following code:

fig, axes = plt.subplots(ncols=5, nrows=2)
for ind, (image, label) in enumerate(zip(batch0, batch1)):
    image = image / 255.
    row = ind // 5
    col = ind % 5
    axes[row][col].imshow(image, cmap='gray')  # cmap is ignored for RGB images
    axes[row][col].axis('off')
    axes[row][col].set_title('%d' % label)
plt.show()

Because my computer has the CPU version of TensorFlow installed, and the generated CAPTCHA images have shape (120, 120, 3), my machine cannot cope when batch_size is large; the program itself, however, runs without problems.

If, like me, your computer is on the weaker side, here are a few solutions:
1. Streamline the AlexNet model further by removing some of the convolutional layers.
2. Use a smaller batch_size.
3. Generate smaller CAPTCHA images. To do this, hover over ImageCaptcha() and go to its definition;

the width and height can be modified there, but they must satisfy width == height.
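Alternatively (my suggestion, not the original post's approach), ImageCaptcha accepts width and height arguments directly, so you can avoid editing the library source:

from captcha.image import ImageCaptcha

# e.g. 100x100; remember to change the tf.reshape in tfrecord_to_image.py to match
image_ = ImageCaptcha(width=100, height=100)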
I hope we all make progress together!
