Gradient descent (SSE)
Backpropagation
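A minimal sketch of one gradient-descent step on the SSE error, with the backpropagated error term, for a single sigmoid unit; x, y, w, and learnrate are illustrative values, not project code:

```python
import numpy as np

# One gradient-descent step on the SSE error E = 1/2 * sum((y - y_hat)**2)
# for a single sigmoid unit. x, y, w, learnrate are made-up example values.
x = np.array([1.0, 2.0, 3.0])
y = 0.5
w = np.array([0.5, -0.5, 0.3])
learnrate = 0.5

sigmoid = lambda h: 1 / (1 + np.exp(-h))

h = np.dot(x, w)                          # input to the unit
y_hat = sigmoid(h)                        # prediction
error = y - y_hat                         # from dE/dy_hat
error_term = error * y_hat * (1 - y_hat)  # backpropagated: multiplied by sigmoid'(h)
w += learnrate * error_term * x           # gradient-descent weight update
```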
Regularization
Large weights make the activation too steep, so the model becomes overly confident and easily overfits.
The fix is to add a penalty term on large weights to the error function, with lambda as the penalty coefficient (a sketch of both penalties follows the L1/L2 notes below).
L1 and L2 regularization
L1 regularization:
- Induces sparsity: it shrinks the weights and drives the smaller ones to exactly 0
- Can therefore perform feature selection, zeroing out the weights of unimportant features
L2 regularization:
- Not sparse: it keeps all weights small without zeroing them (it prefers weight vectors with a smaller sum of squares)
- Usually gives better results when training models
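A minimal sketch of the two penalties, assuming an SSE base loss; the function and argument names are illustrative:

```python
import numpy as np

# Illustrative only: SSE error plus an L1 or L2 penalty on the weights,
# with `lam` as the penalty coefficient lambda.
def loss_l1(y, y_hat, w, lam):
    return 0.5 * np.sum((y - y_hat) ** 2) + lam * np.sum(np.abs(w))

def loss_l2(y, y_hat, w, lam):
    return 0.5 * np.sum((y - y_hat) ** 2) + lam * np.sum(w ** 2)
```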
Dropout
Dropout takes a single parameter: the probability of randomly switching each node off during each training pass. This keeps nodes with large weights from dominating training while nodes with small weights never get trained effectively.
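A minimal sketch of the idea, assuming the common "inverted dropout" formulation (the note above parameterizes by the drop probability; this sketch uses the keep probability):

```python
import numpy as np

# Illustrative inverted dropout: each node stays active with probability
# keep_prob; dropped nodes are zeroed, and survivors are rescaled so the
# expected activation is unchanged.
def dropout(activations, keep_prob=0.8):
    mask = np.random.rand(*activations.shape) < keep_prob
    return activations * mask / keep_prob
```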
Local minima
- Restart gradient descent from a new random initialization (random restarts); see the sketch below
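A minimal sketch of random restarts with a toy stand-in for training; `train_and_score` and its toy loss are assumptions for illustration:

```python
import numpy as np

# Illustrative random restarts: train from several random initializations
# and keep the run that ends with the lowest loss.
def train_and_score(seed):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=3)  # random initialization
    loss = np.sum(w ** 2)   # toy stand-in for the loss after training
    return w, loss

best_w, best_loss = min((train_and_score(s) for s in range(10)), key=lambda t: t[1])
```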
Vanishing gradient
When the weights are large, the sigmoid saturates and its gradient is approximately 0, so the weight updates vanish. One fix is to switch to another activation function: 1. ReLU; 2. tanh.
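A quick numeric check of the saturation effect (illustrative values):

```python
import numpy as np

# At a large pre-activation the sigmoid saturates while ReLU does not.
h = 10.0
sig = 1 / (1 + np.exp(-h))
print(sig * (1 - sig))        # sigmoid'(10) ~ 4.5e-05: the gradient vanishes
print(1.0 if h > 0 else 0.0)  # ReLU'(10) = 1: the gradient passes through
```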
Batching and stochastic gradient descent
When the dataset is large, split it into multiple batches and use stochastic (mini-batch) gradient descent: many slightly imprecise steps work better than a single precise step.
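A minimal sketch of splitting a dataset into shuffled mini-batches; the function name and batch size are illustrative:

```python
import numpy as np

# Each yielded batch drives one (noisy) gradient-descent step.
def iterate_minibatches(features, targets, batch_size=128):
    idx = np.random.permutation(len(features))
    for start in range(0, len(features), batch_size):
        batch = idx[start:start + batch_size]
        yield features[batch], targets[batch]
```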
Momentum
Momentum is a constant β in (0, 1) that weights the previous steps: the step from n iterations ago is weighted by β^n, so the most recent steps matter most and older steps fade away.
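A minimal sketch of the update on a toy quadratic loss; beta, learnrate, and the loss are illustrative:

```python
import numpy as np

# Momentum as a running step: unrolling the recurrence gives the current
# gradient weight 1, the previous step weight beta, the one before beta**2, ...
beta, learnrate = 0.9, 0.1
w = np.array([1.0, -2.0])
step = np.zeros_like(w)
for _ in range(100):
    grad = 2 * w               # gradient of the toy loss sum(w**2)
    step = grad + beta * step  # recent steps count more, old ones fade
    w -= learnrate * step
```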
Hands-on project: predicting bike-sharing ridership
```python
import numpy as np

class NeuralNetwork(object):
    def __init__(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
        # Set number of nodes in input, hidden and output layers.
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes

        # Initialize weights from a normal distribution scaled by 1/sqrt(n).
        self.weights_input_to_hidden = np.random.normal(0.0, self.input_nodes**-0.5,
                                                        (self.input_nodes, self.hidden_nodes))
        self.weights_hidden_to_output = np.random.normal(0.0, self.hidden_nodes**-0.5,
                                                         (self.hidden_nodes, self.output_nodes))
        self.lr = learning_rate

        # Sigmoid activation for the hidden layer.
        self.activation_function = lambda x: 1 / (1 + np.exp(-x))

    def train(self, features, targets):
        ''' Train the network on a batch of features and targets.

            Arguments
            ---------
            features: 2D array, each row is one data record, each column is a feature
            targets: 1D array of target values
        '''
        n_records = features.shape[0]
        delta_weights_i_h = np.zeros(self.weights_input_to_hidden.shape)
        delta_weights_h_o = np.zeros(self.weights_hidden_to_output.shape)
        for X, y in zip(features, targets):
            ### Forward pass ###
            hidden_inputs = np.dot(X, self.weights_input_to_hidden)   # signals into hidden layer
            hidden_outputs = self.activation_function(hidden_inputs)  # signals from hidden layer

            final_inputs = np.dot(hidden_outputs, self.weights_hidden_to_output)  # signals into final output layer
            final_outputs = final_inputs  # identity activation on the output layer (regression)

            ### Backward pass ###
            # Output layer error is the difference between desired target and actual output.
            error = y - final_outputs
            # The hidden layer's contribution to the error.
            hidden_error = np.dot(self.weights_hidden_to_output, error)

            # Backpropagated error terms: the output activation is the identity,
            # so its derivative is 1; the sigmoid's derivative is f(h) * (1 - f(h)).
            output_error_term = error * 1
            hidden_error_term = hidden_error * hidden_outputs * (1 - hidden_outputs)

            # Weight step (input to hidden)
            delta_weights_i_h += hidden_error_term * X[:, None]
            # Weight step (hidden to output)
            delta_weights_h_o += output_error_term * hidden_outputs[:, None]

        # Gradient descent step, averaging the accumulated steps over the batch.
        self.weights_hidden_to_output += self.lr * delta_weights_h_o / n_records  # update hidden-to-output weights
        self.weights_input_to_hidden += self.lr * delta_weights_i_h / n_records   # update input-to-hidden weights
```
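The excerpt ends with train; the project template also defines a run method for inference. The sketch below is an assumption, reconstructed to mirror the forward pass in train rather than copied from the project, and belongs inside the class:

```python
    # Assumed inference method (not in the excerpt above): it repeats the
    # forward pass from train() without the backward pass.
    def run(self, features):
        ''' Run a forward pass through the network with the input features. '''
        hidden_inputs = np.dot(features, self.weights_input_to_hidden)
        hidden_outputs = self.activation_function(hidden_inputs)
        final_inputs = np.dot(hidden_outputs, self.weights_hidden_to_output)
        return final_inputs  # identity activation on the output layer
```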