1 Introduction
Machine learning has many classic algorithms. Among them, deep learning algorithms based on deep neural networks are the most sought after, mainly because AlphaGo, which defeated Lee Sedol, is built on neural networks, that is, on deep learning. This article first introduces the basic neuron and the simple perceptron, then extends to multilayer neural networks, multilayer feedforward neural networks, and other common neural networks, and then introduces deep learning based on deep neural networks. It moves from the simple to the more advanced, and finally uses Python to write a multilayer feedforward neural network.
2 Neuron model
The neuron is the basic unit of the biological brain.
In the M-P neuron model, a neuron receives input signals from n other neurons. These signals are passed along weighted connections. The total weighted input received by the neuron is compared with the neuron's threshold, and the result is processed by an "activation function" to produce the neuron's output.
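The computation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a sigmoid activation; the function names `sigmoid` and `neuron_output` and all numeric values are hypothetical and chosen only for the example.

```python
import numpy as np

def sigmoid(z):
    """Squash any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, theta, activation=sigmoid):
    """Weighted sum of the inputs, minus the threshold, passed through the activation."""
    return activation(np.dot(w, x) - theta)

# Illustrative values only (assumptions, not taken from the article).
x = np.array([1.0, 0.0, 1.0])    # signals from n = 3 other neurons
w = np.array([0.5, -0.2, 0.5])   # connection weights
theta = 1.0                      # threshold of this neuron

# The weighted input (1.0) equals the threshold, so the sigmoid sees 0.
y = neuron_output(x, w, theta)
```

Raising the threshold pushes the output toward 0, and lowering it pushes the output toward 1, which is how the threshold controls how easily the neuron "fires".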
3 Perceptron and multilayer networks
A perceptron consists of two layers of neurons: an input layer that receives external input signals and passes them on, and an output layer made up of M-P neurons.
To solve problems that are not linearly separable, consider using multiple layers of functional neurons. Each neuron is fully connected to the neurons of the next layer; there are no connections between neurons in the same layer, and no connections that skip a layer. Such a neural network structure is generally referred to as a "multilayer feedforward neural network."
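XOR is the classic example of a problem that is not linearly separable. The sketch below shows a feedforward network with one hidden layer computing it. The weights and thresholds are hand-picked for illustration (not learned), a step activation is assumed, and the name `xor_network` is hypothetical.

```python
import numpy as np

def step(z):
    """Threshold activation: 1 where z >= 0, otherwise 0."""
    return np.where(z >= 0, 1, 0)

# Hand-picked parameters (assumptions for illustration, not learned).
W_hidden = np.array([[1.0, 1.0],    # hidden neuron 0 behaves like OR
                     [1.0, 1.0]])   # hidden neuron 1 behaves like AND
theta_hidden = np.array([0.5, 1.5])
w_out = np.array([1.0, -1.0])       # output fires for "OR but not AND"
theta_out = 0.5

def xor_network(x):
    """Forward pass: input layer -> hidden layer -> single output neuron."""
    hidden = step(W_hidden @ x - theta_hidden)
    return int(step(w_out @ hidden - theta_out))

results = [xor_network(np.array(p)) for p in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

No single neuron with a step activation can produce this output pattern, because no straight line separates the two classes; the hidden layer is what makes it possible.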
A neural network learns by using the training data to adjust the "connection weights" between neurons and the threshold of each functional neuron. In other words, what a neural network "learns" is contained in its connection weights and thresholds.
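As a concrete case of adjusting weights and thresholds from data, here is a minimal sketch of the classic perceptron learning rule applied to the logical AND function. The update rule (add `eta * error * x` to the weights, subtract `eta * error` from the threshold) is the standard textbook rule rather than something spelled out in this article, and the names `predict` and `train_perceptron` are hypothetical.

```python
import numpy as np

def predict(w, theta, x):
    """Fire (1) when the weighted input reaches the threshold."""
    return 1 if np.dot(w, x) - theta >= 0 else 0

def train_perceptron(samples, labels, eta=0.5, epochs=20):
    """Adjust weights and threshold after every misclassified example."""
    w = np.zeros(samples.shape[1])
    theta = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(w, theta, x)   # -1, 0, or +1
            w += eta * error * x               # move weights toward the target
            theta -= eta * error               # threshold moves the opposite way
    return w, theta

# Truth table of logical AND, which is linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1])

w, theta = train_perceptron(X, y_and)
predictions = [predict(w, theta, x) for x in X]
```

After training, everything the perceptron "knows" about AND is stored in `w` and `theta`.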
4 Error backpropagation algorithm
The error backpropagation (BP) algorithm trains the network by reducing its mean squared error on the training examples.
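A common way to write this error for the $k$-th training example is $E_k = \frac{1}{2}\sum_j (\hat{y}_j^k - y_j^k)^2$, where $\hat{y}^k$ is the network output and $y^k$ is the target. This form and its symbols are an assumption based on the standard BP derivation. A minimal sketch, with hypothetical function names and made-up values:

```python
import numpy as np

def squared_error(y_hat, y):
    """E_k = 1/2 * sum_j (y_hat_j - y_j)^2 for one training example."""
    diff = np.asarray(y_hat, dtype=float) - np.asarray(y, dtype=float)
    return 0.5 * np.sum(diff ** 2)

def output_gradient(y_hat, y):
    """Derivative of E_k with respect to each network output y_hat_j."""
    return np.asarray(y_hat, dtype=float) - np.asarray(y, dtype=float)

# Illustrative network output and target (assumed values).
y_hat = [0.8, 0.1, 0.3]
y = [1.0, 0.0, 0.0]

e_k = squared_error(y_hat, y)
grad = output_gradient(y_hat, y)
```

The factor 1/2 is there so that the derivative with respect to each output is simply the difference between output and target, which is the quantity backpropagation passes backward through the layers.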
Precisely because of their strong representational power, BP neural networks often overfit: the training error keeps decreasing while the test error may rise. Two strategies are commonly used to mitigate overfitting in BP networks. The first is "early stopping": the data is split into a training set and a validation set. The training set is used to compute gradients and update the connection weights and thresholds, while the validation set is used to estimate the error; training stops when the training error decreases but the validation error rises. The second is "regularization": the basic idea is to add to the error objective function a term that describes the complexity of the network.
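Both strategies can be sketched briefly. The example below is only an illustration: the `patience` parameter, the weighting factor `lam`, the use of the sum of squared weights as the complexity term, and both function names are assumptions, not details given in the text.

```python
import numpy as np

def early_stopping_epoch(val_errors, patience=3):
    """Return the index of the epoch with the lowest validation error,
    scanning until the error has failed to improve for `patience` epochs."""
    best_epoch, best_error, waited = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_epoch, best_error, waited = epoch, err, 0
        else:
            waited += 1
            if waited >= patience:
                break   # validation error keeps rising: stop training
    return best_epoch

def regularized_error(example_errors, weights, lam=0.9):
    """Trade off the mean training error against the sum of squared weights."""
    return lam * np.mean(example_errors) + (1.0 - lam) * np.sum(np.square(weights))

# Made-up validation errors, one per epoch.
val_errors = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.30]
stop_at = early_stopping_epoch(val_errors)

# Made-up per-example errors and connection weights.
penalized = regularized_error([0.07, 0.03], np.array([1.0, -2.0]))
```

In this run the scan stops before it ever reaches the last value, which shows the trade-off in early stopping: a small `patience` saves training time but can miss a later improvement.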
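5 Global minimum and local minima
If $E$ denotes the neural network's error on the training set, then $E$ is clearly a function of the connection weights $\omega$ and the thresholds $\theta$. Training a neural network can therefore be viewed as a parameter optimization process: searching the parameter space for the parameters that make $E$ smallest.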
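The difference between a local and a global minimum shows up even with a single parameter. The sketch below runs plain gradient descent on a made-up error curve with two minima; the curve, the learning rate, and the function names are all assumptions chosen for illustration.

```python
def error(w):
    """A toy one-parameter error surface with two minima (illustrative)."""
    return w**4 - 3 * w**2 + w

def d_error(w):
    """Derivative of the toy error with respect to w."""
    return 4 * w**3 - 6 * w + 1

def gradient_descent(w, learning_rate=0.01, steps=2000):
    """Repeatedly step against the gradient, starting from w."""
    for _ in range(steps):
        w -= learning_rate * d_error(w)
    return w

# Same algorithm, two different starting points.
w_left = gradient_descent(-2.0)
w_right = gradient_descent(2.0)
```

Starting from the left, the search settles in the deeper minimum; starting from the right, it gets stuck in the shallower one. This is why the starting point of training matters.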
6 Other common neural networks
- RBF network
- ART network
- SOM network
- Cascade-correlation network
- Elman network
- Boltzmann machine
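7 Deep learning
A typical deep learning model is a very deep neural network. For a neural network model, a simple way to increase capacity is to add hidden layers: with more hidden layers, there are correspondingly more parameters such as connection weights and thresholds. Model complexity can also be increased simply by increasing the number of neurons in the hidden layers.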
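To see how parameter counts grow, the following sketch counts connection weights and thresholds in a fully connected feedforward network. The helper `count_parameters` and the layer sizes are hypothetical, chosen only for the example.

```python
def count_parameters(layer_sizes):
    """Count connection weights and thresholds in a fully connected
    feedforward network; layer_sizes lists neurons per layer, input first."""
    weights = sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
    thresholds = sum(layer_sizes[1:])   # one threshold per non-input neuron
    return weights + thresholds

shallow = count_parameters([784, 100, 10])        # one hidden layer
deeper = count_parameters([784, 100, 100, 10])    # one extra hidden layer
wider = count_parameters([784, 200, 10])          # twice as many hidden neurons
```

Both adding a layer and widening a layer increase the number of parameters, so both raise the capacity of the model.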
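Deep learning stacks many hidden layers, with each layer processing the output of the layer before it. This can be seen as processing the input signal layer by layer, transforming the initial input representation, which is only loosely related to the output target, into a representation that is more closely related to it. Tasks that were hard to accomplish with a mapping based only on the last layer's output thereby become feasible. In other words, multilayer processing gradually converts the initial "low-level" feature representation into a "high-level" one, after which a "simple model" can complete complex classification tasks. Deep learning can therefore be understood as "feature learning" or "representation learning".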
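The layer-by-layer processing can be sketched as a loop in which each layer transforms the previous layer's output. The weights below are random and untrained, so the intermediate representations carry no meaning here; the layer sizes and the name `forward_representations` are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))

def forward_representations(x, weight_matrices):
    """Return the representation produced by each layer, in order."""
    representations = []
    current = x
    for W in weight_matrices:
        current = sigmoid(W @ current)   # each layer processes the previous output
        representations.append(current)
    return representations

rng = np.random.default_rng(0)
layer_sizes = [8, 6, 4, 2]               # input, two hidden layers, output
weight_matrices = [rng.normal(0.0, 1.0, size=(n_out, n_in))
                   for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

x = rng.random(8)
reps = forward_representations(x, weight_matrices)
shapes = [r.shape for r in reps]
```

In a trained network, the last entries of `reps` would be the "high-level" features on which a simple final model operates.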
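8 Building a neural network with Python
Let us sketch the rough shape of the neural network class. We know it should have at least three functions:
- Initialization: set the number of input, hidden, and output layer nodes
- Training: refine the weights after learning from the given training examples
- Query: given an input, return the answer from the output nodes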
import numpy

# neural network class
class neuralNetwork:

    # initialize the network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # number of nodes in the input, hidden, and output layers
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

        # link weight matrices: wih (input -> hidden) and who (hidden -> output)
        self.wih = numpy.random.normal(0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
        self.who = numpy.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))

        # learning rate
        self.lr = learningrate

        # activation function is the sigmoid function
        self.activation_function = lambda x: 1 / (1 + numpy.exp(-x))

    # train the network
    def train(self, inputs_list, targets_list):
        # convert inputs and targets lists to 2d column arrays
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        # output layer error is the (target - actual)
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights, recombined at hidden nodes
        hidden_errors = numpy.dot(self.who.T, output_errors)

        # update the weights for the links between the hidden and output layers
        self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs))
        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), numpy.transpose(inputs))

    # query the network
    def query(self, inputs_list):
        # convert inputs list to 2d column array
        inputs = numpy.array(inputs_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        return final_outputs
The MNIST handwritten digit dataset
# load the MNIST training data CSV file into a list of lines
with open("mnist_dataset/mnist_train.csv", 'r') as training_data_file:
    training_data_list = training_data_file.readlines()
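Training on the dataset
import numpy

# network dimensions: 784 inputs (28 x 28 pixels) and 10 outputs (digits 0-9)
input_nodes = 784
output_nodes = 10
# hidden_nodes and learning_rate are assumed values; the article does not state them
hidden_nodes = 200
learning_rate = 0.1

# create the network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

# epochs is the number of times the whole training data set is used
epochs = 5
for e in range(epochs):
    # go through all records in the training data set
    for record in training_data_list:
        # split the record by the ',' commas
        all_values = record.split(',')
        # scale and shift the inputs
        # input values must avoid 0, and output values must avoid 1
        inputs = (numpy.asarray(all_values[1:], dtype=float) / 255.0 * 0.99) + 0.01
        # create the target output values (all 0.01, except the desired label which is 0.99)
        targets = numpy.zeros(output_nodes) + 0.01
        # all_values[0] is the target label for this record
        targets[int(all_values[0])] = 0.99
        n.train(inputs, targets)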
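The scaling and target encoding used in the training loop can be checked in isolation. The sketch below wraps them in a hypothetical helper, `prepare_record`, and applies it to a made-up three-pixel record instead of a real 784-pixel MNIST line.

```python
import numpy as np

def prepare_record(record, output_nodes=10):
    """Turn one CSV line 'label,p0,p1,...' into (inputs, targets)."""
    all_values = record.strip().split(',')
    # map pixel values 0..255 into the range 0.01..1.0 so no input is exactly 0
    inputs = (np.asarray(all_values[1:], dtype=float) / 255.0 * 0.99) + 0.01
    # all targets are 0.01 except the labelled digit, which is 0.99
    targets = np.zeros(output_nodes) + 0.01
    targets[int(all_values[0])] = 0.99
    return inputs, targets

# Made-up record: label 3 followed by three pixel values.
inputs, targets = prepare_record("3,0,255,128\n")
```

Keeping the targets at 0.01 and 0.99 rather than 0 and 1 matters because the sigmoid output can only approach 0 and 1, never reach them.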
Testing on the dataset
import numpy

# load the MNIST test data CSV file into a list of lines
with open("mnist_dataset/mnist_test.csv", 'r') as test_data_file:
    test_data_list = test_data_file.readlines()

# scorecard for how well the network performs, initially empty
scorecard = []

# go through all the records in the test data set
for record in test_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # correct answer is first value
    correct_label = int(all_values[0])
    # scale and shift the inputs
    inputs = (numpy.asarray(all_values[1:], dtype=float) / 255.0 * 0.99) + 0.01
    # query the network
    outputs = n.query(inputs)
    # the index of the highest value corresponds to the label
    label = numpy.argmax(outputs)
    # append 1 if the network's answer matches the correct answer, otherwise 0
    if label == correct_label:
        scorecard.append(1)
    else:
        scorecard.append(0)

# calculate the performance score, the fraction of correct answers
scorecard_array = numpy.asarray(scorecard)
print("performance = ", scorecard_array.sum() / scorecard_array.size)