Introduction to Deep Learning (1)

Gradient Descent (SSE)

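The figures in this section (not reproduced here) stepped through gradient descent on the sum of squared errors, E = (1/2) Σ (y − ŷ)². A minimal NumPy sketch of one weight update for a single sigmoid unit (the function and variable names are illustrative, not from the original post):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sse_gradient_step(w, x, y, lr=0.5):
    # Prediction of a single sigmoid unit.
    y_hat = sigmoid(np.dot(x, w))
    # For E = (1/2) * (y - y_hat)**2, the error term is
    # (y - y_hat) * f'(h), with f'(h) = y_hat * (1 - y_hat).
    error_term = (y - y_hat) * y_hat * (1 - y_hat)
    # Step against the gradient: dE/dw = -error_term * x.
    return w + lr * error_term * x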

Backpropagation

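The figure here (not reproduced) illustrated backpropagation. In the notation of the project code at the end of this post, the error is pushed back through the network with the chain rule:

$$\delta^{o} = (y - \hat{y})\, f'(h^{o}), \qquad \delta^{h} = \left(W^{h \to o}\, \delta^{o}\right) f'(h^{h}), \qquad \Delta w_{ij} = \eta\, \delta_{j}\, a_{i}$$

where f' is the activation's derivative (for the sigmoid, f'(h) = f(h)(1 − f(h))), η is the learning rate, and a_i is the input feeding the weight being updated.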

Regularization

When the weights are large, the activation function becomes steep and the model grows overly certain of its predictions, which makes it prone to overfitting.
The fix is to add a penalty term on large weights to the error function; lambda is the coefficient of the penalty term.

L1 and L2 Regularization

L1 regularization:

  1. Produces sparsity: it shrinks the weights and drives the smaller ones all the way to 0.
  2. Performs feature selection: the weights of unimportant features end up at 0.

L2 regularization:

  1. Not sparse: it keeps all of the weights small (it prefers weight vectors with a smaller sum of squares).
  2. Usually trains models with better results; the penalized error functions for both are written out below.
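For reference, both penalties are added to the original error E_0 with coefficient λ:

$$E_{L1} = E_0 + \lambda \sum_i |w_i|, \qquad E_{L2} = E_0 + \lambda \sum_i w_i^2$$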

Dropout

Dropout takes one parameter: the probability of randomly switching each node off during each training pass. It prevents nodes with large weights from having an outsized influence on training while nodes with small weights never get trained effectively; see the sketch below.
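A minimal NumPy sketch of inverted dropout (the rescaling by keep_prob is an implementation detail not mentioned above; it keeps the expected activation unchanged):

import numpy as np

def dropout(activations, keep_prob=0.8):
    # Keep each node with probability keep_prob, i.e. switch it off
    # with probability 1 - keep_prob.
    mask = np.random.rand(*activations.shape) < keep_prob
    return activations * mask / keep_prob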

Local Minima

  1. Restart from a new random initialization and keep the best run (random restarts); a sketch follows.
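
A sketch of the idea, assuming a hypothetical train_model(seed) that trains from a fresh random initialization and returns the model with its validation loss:

def random_restarts(train_model, n_restarts=5):
    best_model, best_loss = None, float('inf')
    for seed in range(n_restarts):
        # Each run starts from new random weights.
        model, loss = train_model(seed)
        if loss < best_loss:
            best_model, best_loss = model, loss
    return best_model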

Vanishing Gradients

When the weights (and hence the sigmoid's inputs) are large in magnitude, the sigmoid's gradient is approximately 0, so gradients vanish as they are multiplied back through the layers. The fix is to switch to another activation function: 1. ReLU; 2. tanh.
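A quick NumPy comparison of the three derivatives shows why switching helps (illustrative only):

import numpy as np

def sigmoid_prime(x):
    s = 1 / (1 + np.exp(-x))
    return s * (1 - s)            # peaks at 0.25, ~0 for large |x|

def tanh_prime(x):
    return 1 - np.tanh(x) ** 2    # peaks at 1.0, but still saturates

def relu_prime(x):
    return (x > 0).astype(float)  # exactly 1 for every positive input

x = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print(sigmoid_prime(x))  # [~0, 0.197, 0.25, 0.197, ~0]
print(tanh_prime(x))     # [~0, 0.420, 1.00, 0.420, ~0]
print(relu_prime(x))     # [0, 0, 0, 1, 1]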

Batch and Stochastic Gradient Descent

When the dataset is large, split it into batches and take one gradient step per batch (stochastic, or mini-batch, gradient descent): many imprecise steps work better than a single precise step over the whole dataset. A sketch follows.
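A sketch of the batching loop, assuming a model with a batch-wise train method like the NeuralNetwork class below:

import numpy as np

def train_in_batches(model, features, targets, batch_size=128, epochs=10):
    n_records = features.shape[0]
    for _ in range(epochs):
        # Shuffle once per epoch so every batch is a random sample.
        order = np.random.permutation(n_records)
        for start in range(0, n_records, batch_size):
            batch = order[start:start + batch_size]
            model.train(features[batch], targets[batch])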

Momentum

Momentum is a constant β in (0, 1) that weights the history of steps: each update is a weighted sum of previous steps in which the step taken k iterations ago carries weight β^k, so recent steps count the most and older ones fade out.
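A minimal sketch of the update, where velocity accumulates the β-discounted history of steps (grad and the hyperparameters are placeholders):

def momentum_step(w, velocity, grad, lr=0.01, beta=0.9):
    # Unrolling this recurrence shows that the step from k iterations
    # ago enters the current update with weight beta**k.
    velocity = beta * velocity + grad(w)
    return w - lr * velocity, velocity

With beta = 0 this reduces to plain gradient descent.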


Hands-on Project: Predicting Bike-Sharing Usage


import numpy as np

class NeuralNetwork(object):
    def __init__(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
        # Set number of nodes in input, hidden and output layers.
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes

        # Initialize weights
        self.weights_input_to_hidden = np.random.normal(0.0, self.input_nodes**-0.5, 
                                       (self.input_nodes, self.hidden_nodes))

        self.weights_hidden_to_output = np.random.normal(0.0, self.hidden_nodes**-0.5, 
                                       (self.hidden_nodes, self.output_nodes))
        self.lr = learning_rate
        
        # Sigmoid activation function, defined with a lambda expression.
        self.activation_function = lambda x: 1 / (1 + np.exp(-x))
                    
    
    def train(self, features, targets):
        ''' Train the network on batch of features and targets. 
        
            Arguments
            ---------
            
            features: 2D array, each row is one data record, each column is a feature
            targets: 1D array of target values
        
        '''
        n_records = features.shape[0]
        delta_weights_i_h = np.zeros(self.weights_input_to_hidden.shape)
        delta_weights_h_o = np.zeros(self.weights_hidden_to_output.shape)
        for X, y in zip(features, targets):
            ### Forward pass ###
            # Hidden layer
            hidden_inputs = np.dot(X, self.weights_input_to_hidden)   # signals into hidden layer
            hidden_outputs = self.activation_function(hidden_inputs)  # signals from hidden layer

            # Output layer: the activation is the identity f(x) = x,
            # since the network predicts a continuous value.
            final_inputs = np.dot(hidden_outputs, self.weights_hidden_to_output)  # signals into final output layer
            final_outputs = final_inputs  # signals from final output layer
            
            ### Backward pass ###
            # Output layer error: difference between desired target and actual output.
            error = y - final_outputs

            # The hidden layer's contribution to the error.
            hidden_error = np.dot(self.weights_hidden_to_output, error)

            # Backpropagated error terms. The output activation is the identity,
            # so its derivative is 1; the sigmoid's derivative is f(h) * (1 - f(h)).
            output_error_term = error * 1.0
            hidden_error_term = hidden_error * hidden_outputs * (1 - hidden_outputs)

            # Weight step (input to hidden)
            delta_weights_i_h += hidden_error_term * X[:,None]
            # Weight step (hidden to output)
            delta_weights_h_o += output_error_term * hidden_outputs[:,None]

        # Update the weights with the gradient descent step, averaged over the batch.
        self.weights_hidden_to_output += self.lr * delta_weights_h_o / n_records  # hidden-to-output weights
        self.weights_input_to_hidden += self.lr * delta_weights_i_h / n_records   # input-to-hidden weights
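
The excerpt stops before the project's prediction method. A run method consistent with the forward pass in train above might look like this (an assumption; the original post does not show it):

    def run(self, features):
        ''' Run a forward pass through the network with input features.

            Arguments
            ---------
            features: 1D array of feature values
        '''
        hidden_inputs = np.dot(features, self.weights_input_to_hidden)
        hidden_outputs = self.activation_function(hidden_inputs)
        # Identity activation on the output layer, matching train().
        final_inputs = np.dot(hidden_outputs, self.weights_hidden_to_output)
        return final_inputs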

Reposted from blog.csdn.net/wezard95/article/details/106101711