Simple TensorFlow Classification Tutorial



This article covers two topics: a simple classifier and TensorFlow. First, we will write functions to generate three kinds of simulated data. The first set is linearly separable, the second consists of two interlocking crescent-shaped (moon) clusters, and the third is Saturn-ring data. Each set contains two classes, and we will build a separate model to classify each set.


All the code for this article is in ML-tutorial (https://github.com/Aspirinkb/ML-tutorial).


The linearly separable data are as follows:

[Figure: linear data]


The data for the crescent shape is as follows:

[Figure: moon data]


The ring data is as follows:

[Figure: saturn data]


Obviously, the first set of data can be separated by a straight line (for higher-dimensional data, a hyperplane), so for it we will build a SoftMax regression model. The second set cannot be split by a straight line or a hyperplane. Similarly, the best a straight line can achieve on the third set is about 50% accuracy. For the second and third sets we use simple neural network models to learn hypersurfaces that separate the classes. TensorFlow usage is introduced along the way, while building the three models, to lower the barrier to learning.


Generate mock data


The method for generating a simulated data set is very simple: use equations such as sines and circles to produce regular data, then add some random perturbation to simulate noise. All functions use only NumPy. For example, the generate_Saturn_data() method first uses np.linspace() to generate the angle sweep, then fixes the circle's center, and then generates two ranges of radii used to produce the inner-disk data and the outer-ring data, shaped like Saturn and its ring. The code is as follows:


import numpy as np

PI = np.pi

def generate_Saturn_data(N=100):
    theta = np.linspace(0, 2*PI, N) + PI*(np.random.rand(N))/100
    a = 0.5
    b = 0.5
    # Outer ring: radius around 0.4 with a little jitter.
    r1 = 0.4 + 2*(np.random.rand(N)-0.5)/10
    x1 = a + r1*np.cos(theta) + (np.random.rand(N)-0.5)/50
    y1 = b + r1*np.sin(theta) + (np.random.rand(N)-0.5)/50
    # Inner disk: radius anywhere in [0, 0.2).
    r2 = 0.2*np.random.rand(N)
    x2 = a + r2*np.cos(theta) + (np.random.rand(N)-0.5)/50
    y2 = b + r2*np.sin(theta) + (np.random.rand(N)-0.5)/50

    return x1, y1, x2, y2


In the code, x1 and y1 make up the samples of the first class, and x2 and y2 the samples of the second class. Accordingly, (x1, y1) samples are labeled 0 and (x2, y2) samples are labeled 1, that is, the class with id 0 and the class with id 1. The 0s and 1s simply stand in for two real-world classes, such as cars and pedestrians. Note that the parameter N is the number of samples per class.


The gen_data() method is responsible for assembling the simulated data generated above into a training set and a test set. Each sample's label uses the one-hot encoding format that TensorFlow works with: samples of the first class, (x1, y1), are labeled (1, 0), while samples of the second class, (x2, y2), are labeled (0, 1). To suit stochastic gradient descent, gen_data() also shuffles the data, keeping the number of samples per class in mind.
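The repository contains the actual implementation; what follows is only a minimal sketch of what gen_data() does, and the split ratio and exact signature are assumptions:

import numpy as np

def gen_data(x1, y1, x2, y2, train_ratio=0.8):
    # Stack the coordinates into (N, 2) sample matrices, one row per sample.
    samples = np.vstack([np.stack([x1, y1], axis=1),
                         np.stack([x2, y2], axis=1)]).astype(np.float32)
    # One-hot labels: class 0 -> (1, 0), class 1 -> (0, 1).
    n1, n2 = len(x1), len(x2)
    labels = np.vstack([np.tile([1.0, 0.0], (n1, 1)),
                        np.tile([0.0, 1.0], (n2, 1))]).astype(np.float32)
    # Shuffle so that mini-batches mix both classes (needed for SGD).
    indices = np.random.permutation(n1 + n2)
    samples, labels = samples[indices], labels[indices]
    # Split into training and test sets.
    split = int((n1 + n2) * train_ratio)
    return {'train_set': (samples[:split], labels[:split]),
            'test_set': (samples[split:], labels[split:])}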


Linear model


For linearly separable data, only the SoftMax regression model is needed.


import tensorflow as tf

tf.reset_default_graph()
x = tf.placeholder(dtype=tf.float32, shape=(None, 2), name='samples')
y = tf.placeholder(dtype=tf.float32, shape=(None, 2), name='labels')

W = tf.Variable(tf.zeros(shape=(2,2)), name='weight')
b = tf.Variable(tf.zeros(shape=(2)), name='bias')

pred = tf.nn.softmax(tf.matmul(x, W) + b, name='pred')


In the code above, TensorFlow's graph is reset first to prevent conflicting graphs in the environment. It is also possible to instantiate a new graph and bind it to the subsequent session, but here we stick with the default graph.
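For reference, a minimal sketch of the explicit-graph alternative (the placeholder here is illustrative):

graph = tf.Graph()
with graph.as_default():
    # Ops defined here are attached to `graph`, not to the default graph.
    samples_ph = tf.placeholder(dtype=tf.float32, shape=(None, 2), name='samples')

# Bind the graph to the session explicitly.
with tf.Session(graph=graph) as sess:
    pass  # run ops from `graph` here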


The model's inputs are x and y, both of type tf.placeholder. They act as placeholders that are bound to concrete inputs only at training time (or at inference time, when only x is needed). x holds the input samples; since each sample has two values, x has shape [None, 2]. Likewise, there are only two classes, so the one-hot labels y also form a Tensor of shape [None, 2].


Next come W and b, the model's parameters, which are adjusted continually during training. Parameters of this kind are of type tf.Variable in TensorFlow, and an initialization method must be given; here both are initialized to all zeros. Be careful: initializing the weights to 0.0 is risky, because the model can easily get stuck at a suboptimal saddle point and become impossible to optimize. The typical symptom is that no matter how many epochs you train, the loss drops only at the very beginning and then stops falling; accuracy stays low and no longer improves, because the model is stuck. As shown below:


Accuracy curve:

[Figure: accuracy1]


Loss curve:

[Figure: cost1]


Usually the most effective yet most easily overlooked remedy at this point is to change the initialization method, for example to random normal initialization with tf.random_normal(). Of course, changing the activation function, the loss function, the weight-update strategy, the learning rate, and so on may help as well. For every model in this article, try modifying a few things yourself and see whether you can reach higher accuracy or faster learning. Don't be afraid of mistakes; experiment freely, and you will gain the kind of knowledge that cannot be taught, only earned through practice. The accuracy and loss curves below show the effect after changing the initialization method.
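Concretely, the fix amounts to replacing the all-zeros weight initialization; this is a sketch, and the stddev value here is an assumption:

# Random normal initialization lets training escape the flat all-zeros start.
W = tf.Variable(tf.random_normal(shape=(2, 2), mean=0.0, stddev=0.1), name='weight')
b = tf.Variable(tf.zeros(shape=(2,)), name='bias')  # a zero bias is usually fine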


[Figure: accuracy2]


[Figure: cost2]


After defining the model structure, we still need a loss function and an optimization method. The loss function directly defines the model's learning objective; choosing it well helps both learning speed and accuracy, and that depends largely on our understanding of the problem and our grasp of the learning process. Since this is just a simple set of linear data, mean squared error and stochastic gradient descent suffice. Note that our first problem has a single global optimum, so as long as training runs long enough we can always reach very high (even 100%) accuracy. Two terms in the code below need explaining: epoch and step. Because we perform stochastic gradient descent, we iterate repeatedly, splitting the training set into many steps and feeding one batch at a time into the model to compute the error and gradients and update the weights. One epoch is complete exactly when every sample in the training set has been iterated over once.


# Assumed hyperparameter values; the originals are defined elsewhere in the repo.
learning_rate = 0.01
epochs = 1000
batch_size = 32

# for train
cost = tf.reduce_mean(tf.square(y - pred))
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train = optimizer.minimize(cost)
# for test
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
init_op = tf.global_variables_initializer()

saver = tf.train.Saver()
with tf.Session() as sess:
    tf.summary.scalar('cost', cost)
    tf.summary.histogram('weight', W)
    tf.summary.scalar('accuracy', accuracy)
    merged = tf.summary.merge_all()
    train_writer = tf.summary.FileWriter('./log/linear_model/train', sess.graph)
    test_writer = tf.summary.FileWriter('./log/linear_model/test', sess.graph)
    sess.run(init_op)
    # data_linear is assumed to come from gen_data() on the linear data set.
    x_train, y_train = data_linear['train_set']
    x_test, y_test = data_linear['test_set']
    num_samples = len(x_train)
    for epoch in range(epochs):
        steps = int(num_samples / batch_size)
        # Reshuffle every epoch so the batches differ between epochs.
        indices = np.random.permutation(num_samples)
        x_train_ = x_train[indices]
        y_train_ = y_train[indices]
        for step in range(steps):
            start = step*batch_size
            end = start + batch_size
            x_ = x_train_[start:end, :]
            y_ = y_train_[start:end, :]
            summary, _, c = sess.run([merged, train, cost], feed_dict={x: x_, y: y_})
            train_writer.add_summary(summary)
        if epoch % 100 == 99:
            summary, acc = sess.run([merged, accuracy], feed_dict={x: x_test, y: y_test})
            test_writer.add_summary(summary, epoch)
            print("Epoch:{:5d}, Accuracy:{:.2f}".format(epoch, acc))
    print('W:', W.eval())
    print('b:', b.eval())
    train_writer.close()
    test_writer.close()
    print("Training Finished!")
    save_path = saver.save(sess, './log/linear_model/linear_model.ckpt')

    print('model saved in path: ', save_path)


To visualize the learning process with TensorBoard, for example to monitor how accuracy and loss evolve, TensorFlow and TensorBoard provide developers with plenty of tooling. The usage is as follows (a minimal sketch follows this list):

1. Instantiate tf.summary.FileWriter()

2. Add the parameters to monitor to the summary collection: tf.summary.scalar for scalars, tf.summary.histogram for tensors

3. Merge all monitored nodes into the graph and establish the dependencies: merged = tf.summary.merge_all()

4. Call the FileWriter's add_summary()

5. Start TensorBoard in a terminal: tensorboard --logdir=...
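Put together, the five steps look roughly like this; a self-contained sketch in which the log path and the dummy loss variable are purely illustrative:

import tensorflow as tf

loss = tf.Variable(1.0, name='loss')       # stand-in for a real loss tensor
tf.summary.scalar('loss', loss)            # step 2: register a scalar to monitor
merged = tf.summary.merge_all()            # step 3: merge all summary nodes

with tf.Session() as sess:
    writer = tf.summary.FileWriter('./log/demo', sess.graph)   # step 1
    sess.run(tf.global_variables_initializer())
    for i in range(10):
        summary = sess.run(merged)
        writer.add_summary(summary, i)     # step 4: write with a step index
    writer.close()

# Step 5, in a terminal:
#   tensorboard --logdir=./log/demo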


Finally, the curves above can be viewed in the browser.


After training, we visualize the model's decision result (a plotting sketch follows the figure):

[Figure: linear]
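Such a plot can be produced roughly as follows. This is a sketch that assumes matplotlib and reuses sess, pred, x, x_test, and y_test from the training code above; the repo's actual plotting code may differ:

import matplotlib.pyplot as plt
import numpy as np

# Evaluate the model on a dense grid and colour each point by its argmax class.
xx, yy = np.meshgrid(np.linspace(0, 1, 200), np.linspace(0, 1, 200))
grid = np.stack([xx.ravel(), yy.ravel()], axis=1).astype(np.float32)
probs = sess.run(pred, feed_dict={x: grid})   # run inside the training session
classes = np.argmax(probs, axis=1).reshape(xx.shape)
plt.contourf(xx, yy, classes, alpha=0.3)      # shaded decision regions
plt.scatter(x_test[:, 0], x_test[:, 1], c=np.argmax(y_test, axis=1), s=10)
plt.show()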


The learned values of the parameters W and b are shown below. (If you run it yourself the results will differ, because both the data and the initialization are random, but the overall shape of the samples does not change much, so the final parameters will be similar.)


[Figure: WB]


As you can see, what the model ultimately learned is in fact a straight line.
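To make that concrete: the predicted class flips where the two logits are equal, so the boundary satisfies (W[0,0]-W[0,1])*x0 + (W[1,0]-W[1,1])*x1 + (b[0]-b[1]) = 0, which is a straight line. A sketch, where W_val and b_val are assumed to hold the trained NumPy values of W and b:

# Solve the equal-logits condition for x1 to get slope and intercept.
w = W_val[:, 0] - W_val[:, 1]   # W_val, b_val: hypothetical fetched values
c = b_val[0] - b_val[1]
slope = -w[0] / w[1]
intercept = -c / w[1]
print('boundary: x1 = {:.3f} * x0 + {:.3f}'.format(slope, intercept))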


Polynomial-like model


Next is a simple neural-network classifier with two hidden layers: the first hidden layer has 32 neurons with a tanh activation, and the second has 8 neurons, also with tanh.


This time the loss function is cross-entropy.


x = tf.placeholder(dtype=tf.float32, shape=(None, 2), name='samples')
y = tf.placeholder(dtype=tf.float32, shape=(None, 2), name='labels')

W1 = tf.Variable(tf.random_normal(shape=(2,32), mean=0.0, stddev=1), name='weight1')
b1 = tf.Variable(tf.zeros(shape=[32]), name='bias1')
W2 = tf.Variable(tf.random_normal(shape=(32,8)), name='weight2')
b2 = tf.Variable(tf.zeros(shape=[8]), name='bias2')
W3 = tf.Variable(tf.random_normal(shape=(8,2)), name='weight3')
b3 = tf.Variable(tf.zeros(shape=[2]), name='bias3')
z = tf.matmul(x, W1) + b1
layer1 = tf.tanh(z, name='layer1')
z = tf.matmul(layer1, W2) + b2
layer2 = tf.tanh(z, name='layer2')
out = tf.matmul(layer2, W3) + b3
pred = tf.nn.softmax(out, name='pred')

# for train
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=out))
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train = optimizer.minimize(cost)

# for test
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))


Finally, the classification effect of the model is roughly as follows:


[Figure: moon]


Circle-like model


The third set is the ring data. To obtain a roughly circular classification boundary we deepen the network further: a neural-network classifier with three hidden layers (of 3, 6, and 9 neurons).


x = tf.placeholder(dtype=tf.float32, shape=(None, 2), name='samples')
y = tf.placeholder(dtype=tf.float32, shape=(None, 2), name='labels')

W1 = tf.Variable(tf.random_normal(shape=(2,3), mean=0.0, stddev=1), name='weight1')
b1 = tf.Variable(tf.zeros(shape=(3)), name='bias1')
W2 = tf.Variable(tf.random_normal(shape=(3,6)), name='weight2')
b2 = tf.Variable(tf.zeros(shape=(6)), name='bias2')
W3 = tf.Variable(tf.random_normal(shape=(6,9)), name='weight3')
b3 = tf.Variable(tf.zeros(shape=(9)), name='bias3')
W4 = tf.Variable(tf.random_normal(shape=(9,2)), name='weight4')
b4 = tf.Variable(tf.zeros(shape=(2)), name='bias4')
z = tf.matmul(x, W1) + b1
# layer1 = tf.nn.relu(z, name='layer1')   # an alternative activation to try
layer1 = tf.tanh(z, name='layer1')
z = tf.matmul(layer1, W2) + b2
layer2 = tf.tanh(z, name='layer2')
z = tf.matmul(layer2, W3) + b3
layer3 = tf.tanh(z, name='layer3')
out = tf.matmul(layer3, W4) + b4
pred = tf.nn.softmax(out, name='pred')

# for train
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=out))
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train = optimizer.minimize(cost)

# for test
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))


Finally, the classification result is as follows:


[Figure: saturn2]


Of course, the modeling process is rarely this smooth; we often need to keep adjusting the model's structure and hyperparameter settings based on how training progresses. For example, I once accidentally learned the following result...


[Figure: saturn1]


Don't lose heart; keep the courage to move forward. This is deep learning.




Original link: https://www.jianshu.com/p/52e5cdd44f9c

