Learn simple handwriting recognition in 10 minutes with about 30 lines of Python!

To download the code files directly, follow our official account and check the history messages!

Which is better: handwritten notes or electronic notes?

Graduation season has just ended, and with the class of 2018 freshmen arriving soon, our editors have been compiling a set of university study guides. Right at the start we ran into a challenge: for taking notes, which is better in the end, handwritten notes or electronic notes?

Clever readers are probably laughing at me already: isn't there such a thing as electronic handwritten notes? Well, of course; how could I not have thought of that!

Besides saving paper, we usually also want our electronic notes to support organizing and searching. If electronic handwritten notes cannot do handwriting-recognition search, then they really only save paper. So I personally tried the handwritten-notes app GoodNotes. It can indeed search handwriting, but only if you write neatly and don't join your strokes; those of us with wild, messy handwriting, myself included, have to restrain ourselves to make it work.


So how is handwriting recognition actually achieved? In this issue we'll show you how to implement this seemingly advanced technology with simple programming. Drawing on some online tutorials, we will demonstrate an MNIST handwriting recognition example using TensorFlow.

First, here is a mind map of this article's content; for anything not covered in detail, please study it on your own:

The MNIST data set comes from the National Institute of Standards and Technology (NIST). The training set consists of digits handwritten by 250 different people, of whom 50% are high school students and 50% are staff of the Census Bureau. The test set contains handwritten digit data in the same proportions.

Let me briefly introduce TensorFlow.

TensorFlow is a computation framework that Google officially open-sourced on November 9, 2015. It supports the various algorithms of deep learning very well, but its applications are not limited to deep learning: it is a general-purpose computation framework, developed by the Google Brain team led by Jeff Dean as an improvement on DistBelief, Google's first-generation internal deep learning system.

We will call the TensorFlow framework from Python 3.

It can be installed as follows (for example, with pip: pip install tensorflow):

A shortcut to get started: linear regression

Let's look at the simplest machine learning model: an example of linear regression.

The least squares method in the narrow sense is a parameter-solving method with a closed-form solution under a linearity assumption; its result is the global optimum.

Gradient descent makes broader (unconstrained) assumptions; it is a parameter optimization method that proceeds step by step through iterative updates, and its result is a local optimum.

We implement the optimization by calling TensorFlow's gradient descent optimizer, tf.train.GradientDescentOptimizer.

Let's look at the example code. It is only a little over 30 lines, and the logic is quite clear.

You will eventually get a value close to 2; for example, my run produced 1.9183811.
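The original code screenshot is not reproduced here, so as a stand-in, here is a minimal pure-Python sketch of the same gradient descent idea (plain Python rather than TensorFlow, with made-up data generated around a true slope of 2):

```python
import random

# Fit y = w * x to noisy data whose true slope is 2, using gradient descent.
random.seed(0)
xs = [random.uniform(0, 1) for _ in range(100)]
ys = [2.0 * x + random.gauss(0, 0.01) for x in xs]

w = 0.0    # initial guess for the slope
lr = 0.5   # learning rate
for _ in range(200):
    # gradient of the mean squared error 0.5*(w*x - y)^2 with respect to w
    grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

print(w)  # converges to a value close to 2
```

Each iteration nudges the slope against the gradient of the loss, which is exactly what tf.train.GradientDescentOptimizer automates for arbitrary models.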

Linear model: logistic regression

Linear regression alone isn't satisfying enough, so let's jump straight in and start on handwriting recognition.

We use the MNIST data of Professor Yann LeCun, one of the three giants of deep learning, as our example. As shown in the figure above, each MNIST sample is a 28x28 image, labeled with the digit it should be.

Let's first see how the data is transformed, step by step, from an image into our prediction:

The data we obtain is stored in the program as matrices, as follows:

teX is a 10000x784 matrix and teY a 10000x10 matrix: 10000 is the number of examples, and 784 is the 28x28 pixels. Since there are 10 different digits, the other dimension of teY is 10, and the value in each position indicates whether the example is the corresponding digit. teX and teY form the test set; similarly, trX and trY are the training set.
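To make that label layout concrete, here is a tiny pure-Python sketch of the one-hot encoding described above (the helper name is ours, not from the original code):

```python
# Each digit label 0-9 becomes a 10-dimensional row with a single 1.0
# in the position of that digit; a 10000-example set stacks 10000 such rows.
def one_hot(digit, num_classes=10):
    row = [0.0] * num_classes
    row[digit] = 1.0
    return row

print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```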

The sections that follow differ only in how the model is constructed; you can refer to the data-transformation figure above to understand them.

First, without further ado, let's just use a linear model to do the classification.

Counting comments and blank lines, about 30 lines in total is all we need to solve a problem as hard as handwriting recognition! Here is the code:

After 100 rounds of training, our accuracy is 92.36%.
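The heart of this linear classifier is a single matrix multiplication producing 10 class scores, which softmax then turns into probabilities. As a pure-Python sketch of that scoring step (an illustration of the idea, not the original TensorFlow code):

```python
import math

# Softmax turns arbitrary class scores into probabilities that sum to 1;
# the predicted digit is the class with the highest probability.
def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)  # probabilities summing to 1, largest for the largest score
```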

A no-frills shallow neural network

Having used the simplest linear model, let's switch to a classic neural network to implement the same function.

Again without further ado, we build a single hidden layer and use the most traditional sigmoid function as the activation function. Mathematically, sigmoid(x) = 1 / (1 + e^(-x)).

Its core logic is still matrix multiplication; there are no tricks here.

    def model(X, w_h, w_o):
        h = tf.nn.sigmoid(tf.matmul(X, w_h))  # hidden layer: sigmoid activation
        return tf.matmul(h, w_o)              # output layer: linear class scores

The complete code is below, still only 40-odd lines, not long:

On the first round my accuracy was only 69.11%, but by the second round it rose to 82.29%. The final result after 100 rounds was 95.41%, better than logistic regression!

Note the two core lines of our model: it is nothing more than a plain, fully connected hidden layer, with no technique involved at all. The improvement comes entirely from the modeling power of the neural network.
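As a numeric illustration of what those two lines compute for a single example, here is the same hidden-layer arithmetic in plain Python with tiny made-up weights (the weight matrices here are transposed relative to the TensorFlow code, and the numbers are invented, not a trained model):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(mat, vec):
    # multiply a matrix (list of rows) by a vector
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]

# tiny made-up weights: 2 inputs -> 3 hidden units -> 1 output
w_h = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
w_o = [[0.7, -0.1, 0.2]]
x = [1.0, 2.0]

h = [sigmoid(v) for v in matvec(w_h, x)]  # hidden layer: sigmoid of each score
y = matvec(w_o, h)                        # output layer: linear scores
print(y)
```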

The deep learning era's answer - ReLU and Dropout

We replace the sigmoid function with the ReLU function.

The rectified linear unit (ReLU) is an activation function commonly used in artificial neural networks, usually referring to the nonlinear functions represented by the ramp function and its variants.
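In code, ReLU is a one-liner; this pure-Python sketch shows how it clips negative activations to zero while passing positive ones through unchanged:

```python
# ReLU is simply max(0, x).
def relu(x):
    return x if x > 0 else 0.0

print([relu(v) for v in [-2.0, -0.5, 0.0, 1.5]])  # [0.0, 0.0, 0.0, 1.5]
```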

Of course, Dropout should also be used: it can fairly effectively reduce overfitting, achieving a regularization effect to some extent. So, still with plain fully connected hidden layers, let's write a somewhat more modern model:

    def model(X, w_h, w_h2, w_o, p_keep_input, p_keep_hidden):
        X = tf.nn.dropout(X, p_keep_input)    # drop some input pixels
        h = tf.nn.relu(tf.matmul(X, w_h))     # first hidden layer, ReLU
        h = tf.nn.dropout(h, p_keep_hidden)   # dropout on hidden units
        h2 = tf.nn.relu(tf.matmul(h, w_h2))   # second hidden layer, ReLU
        h2 = tf.nn.dropout(h2, p_keep_hidden)
        return tf.matmul(h2, w_o)             # linear output scores
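As a pure-Python illustration of what inverted dropout does (a sketch of the idea, not TensorFlow's implementation):

```python
import random

def dropout(values, keep_prob, rng):
    # Inverted dropout: keep each value with probability keep_prob and
    # scale survivors by 1/keep_prob so the expected activation is unchanged.
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]

rng = random.Random(0)
out = dropout([1.0, 2.0, 3.0, 4.0], keep_prob=0.5, rng=rng)
print(out)  # roughly half the values zeroed, the survivors doubled
```

At test time dropout is disabled (keep_prob = 1.0), and thanks to the 1/keep_prob scaling no extra correction is needed.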

Apart from the two tricks of ReLU and dropout, we still have nothing but plain fully connected hidden layers; the expressive power hasn't grown much, and this can hardly be called deep learning.

From the results, accuracy passed 96% on the second round and then hovered around 98.4%. ReLU and Dropout alone raised the accuracy from 95% to over 98%.

Enter the convolutional neural network

Next comes the real deep learning workhorse: the CNN, or convolutional neural network. This model is indeed somewhat more complex than the no-frills ones above, involving convolutional layers and pooling layers.
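Before looking at the run, here is a tiny pure-Python sketch of the two new building blocks, 2D convolution and 2x2 max pooling, on a toy 4x4 image (an illustration only, not the TensorFlow model):

```python
# Slide a small kernel over the image, taking a weighted sum at each position.
def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

# Downsample by keeping the maximum of each non-overlapping 2x2 block.
def max_pool_2x2(image):
    return [[max(image[i][j], image[i][j+1], image[i+1][j], image[i+1][j+1])
             for j in range(0, len(image[0]) - 1, 2)]
            for i in range(0, len(image) - 1, 2)]

img = [[1, 2, 0, 1],
       [0, 1, 3, 1],
       [1, 0, 1, 2],
       [2, 1, 0, 1]]
edge = [[1, -1]]                 # tiny horizontal-difference kernel
print(conv2d(img, edge))         # feature map of horizontal differences
print(max_pool_2x2(img))         # [[2, 3], [2, 2]]
```

A CNN stacks such convolution and pooling steps, learning the kernel values during training instead of hand-coding them.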

Let's look at this run's numbers:

    0 0.95703125
    1 0.9921875
    2 0.9921875
    3 0.98046875
    4 0.97265625
    5 0.98828125
    6 0.99609375

By round 6 it already scored a high of 99.6%, a big improvement over the 98.4% of the ReLU-and-Dropout network, and the closer you get to 100%, the harder each further gain becomes.

At round 16, it even achieved 100% accuracy:

    7 0.99609375
    8 0.99609375
    9 0.98828125
    10 0.98828125
    11 0.9921875
    12 0.98046875
    13 0.99609375
    14 0.9921875
    15 0.99609375
    16 1.0

With TensorFlow and machine learning tools, just a few dozen lines of code solved a problem at the level of handwriting recognition, and with accuracy this high.

Model results at a glance

Having discussed so many models, let's compare the accuracies we saw above:

- Linear model (logistic regression): 92.36%
- Shallow sigmoid network: 95.41%
- ReLU + Dropout network: about 98.4%
- Convolutional neural network: 99.6% and up

The model in action

Let's try the model on handwritten images of our own. The handwritten images are as follows:

The images are processed as follows:

    import numpy as np
    from PIL import Image

    img = Image.open(r'image_file_path').convert('L')  # open as grayscale

    # Resize to 28x28 if needed
    if img.size[0] != 28 or img.size[1] != 28:
        img = img.resize((28, 28))

    # Collect the pixel values into a one-dimensional array
    arr = []
    for i in range(28):
        for j in range(28):
            # In MNIST, 0 represents white (background) and 1.0 black
            pixel = 1.0 - float(img.getpixel((j, i))) / 255.0
            # pixel = 255.0 - float(img.getpixel((j, i)))  # if using 0-255 colors
            arr.append(pixel)

    arr1 = np.array(arr).reshape((1, 28, 28, 1))  # arr1 is the model's input image data

We can see that the imported image is stored in the program as a matrix, whose numbers represent the individual pixels.

The outputs are [2] and [3] respectively: both predictions succeeded! This shows that the trained model's digit recognition ability is quite strong.

I heard that Apple recently filed a new patent for real-time handwriting recognition technology.

Good news for electronic handwritten notes!

If even Apple is working on this technology, then being able to pull off a little of it ourselves is pretty exciting, isn't it?


Origin www.cnblogs.com/dengfaheng/p/10959153.html