[Machine Learning] P15 TensorFlow Neural Networks: Understanding the Logic and Establishing a Framework Outline


A review of the logical flow of training a neural network:


.1 Create neurons

For each neuron:
$f_{\vec{w},b}(\vec{x}) = \vec{w} \cdot \vec{x} + b$

For example, the first neuron in layer 1:
$\vec{a}^{[1]}_1 = g(f_{\vec{w}^{[1]}_1, b^{[1]}_1}(\vec{x}))$
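
A quick NumPy sketch of that computation (the weights, bias, and input below are made-up values, and sigmoid is used for the activation g, as in the rest of this post):

import numpy as np

# one neuron of layer 1: linear combination of the inputs, then the activation g
w_1_1 = np.array([0.2, -0.4, 0.1])   # illustrative weights \vec{w}^{[1]}_1
b_1_1 = 0.3                          # illustrative bias b^{[1]}_1
x = np.array([1.0, 2.0, 3.0])        # illustrative input \vec{x}
a_1_1 = 1 / (1 + np.exp(-(np.dot(w_1_1, x) + b_1_1)))   # g(z) with g = sigmoid, gives 0.5 here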

Related blog post links:

[Machine Learning] P11 Neural Network
[Machine Learning] P12 Forward Propagation
[Machine Learning] P14 Tensorflow User Guide Dense Sequential Tensorflow Implementation


.2 Calculation of loss value

In a neural network, the loss value measures the difference between the predicted value at the output layer and the actual value for a single training sample:
$loss = -y \log\left(f_{\vec{w},b}(\vec{x})\right) - (1-y) \log\left(1 - f_{\vec{w},b}(\vec{x})\right)$

The cost is the average of the loss values over all $m$ training samples:
$J(\vec{w},b) = \frac{1}{m} \sum_{i=1}^{m} loss^{(i)}$
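
Written out in NumPy, the cost is just that average over the per-sample losses. A minimal vectorized sketch, assuming X is an (m, n) feature matrix, y an (m,) vector of 0/1 labels, w an (n,) weight vector, and b a scalar:

import numpy as np

def cost_binary_crossentropy(X, y, w, b):
	# predicted probabilities f_wb(x) for every sample at once
	f = 1 / (1 + np.exp(-(X @ w + b)))
	# average the per-sample losses over all m training samples
	return np.mean(-y * np.log(f) - (1 - y) * np.log(1 - f))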

Related blog post links:

[Machine Learning] P6 Logistic Regression Loss Function and Gradient Descent


.3 Gradient descent training model

Gradient descent iteratively optimizes the model parameters $(\vec{w}, b)$ so that the cost $J(\vec{w},b)$ is minimized, i.e. the predictions are as accurate as possible:

Update $\vec{w}$:
$w_j = w_j - \alpha \frac{\partial J(\vec{w},b)}{\partial w_j}$

Update $b$:
$b = b - \alpha \frac{\partial J(\vec{w},b)}{\partial b}$

where:
$\frac{\partial J(\vec{w},b)}{\partial w_j} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{\vec{w},b}(\vec{x}^{[i]}) - y^{[i]}\right) x^{[i]}_j$
$\frac{\partial J(\vec{w},b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{\vec{w},b}(\vec{x}^{[i]}) - y^{[i]}\right)$
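
Putting the update rules and the partial derivatives together, one gradient-descent step looks roughly like this in NumPy (a sketch for the logistic-regression case, assuming X, y, w, b are already defined and alpha is the learning rate):

import numpy as np

def gradient_step(X, y, w, b, alpha):
	m = X.shape[0]
	f = 1 / (1 + np.exp(-(X @ w + b)))   # f_wb(x^[i]) for all m samples
	err = f - y                          # (f_wb(x^[i]) - y^[i])
	dj_dw = (X.T @ err) / m              # ∂J/∂w_j for every j
	dj_db = np.sum(err) / m              # ∂J/∂b
	return w - alpha * dj_dw, b - alpha * dj_db   # simultaneous update of w and b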

In a neural network the parameters are actually updated by backpropagation, which will be described in detail in a subsequent blog post. The link is as follows:
xxxxxxxx


Python implementation

Related blog post: [Machine Learning] P9 Implementing a Logistic Regression Case from Beginning to End

.1 Create neurons

import numpy as np

def sigmoid(z):
	f_x = 1 / (1 + np.exp(-z))
	return f_x

# a single neuron: linear combination of the inputs, then the sigmoid activation
# (w, x and b are assumed to be defined: weight vector, input vector, bias)
z = np.dot(w, x) + b
f_x = sigmoid(z)
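
A quick usage check with made-up values (these numbers are illustrative, not from the original post):

w = np.array([0.5, -0.3])
x = np.array([1.0, 2.0])
b = 0.1
print(sigmoid(np.dot(w, x) + b))   # 0.5, since the linear combination is exactly 0 here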

.2 Calculation of loss value

def compute_cost(X, y, w, b):
	m = X.shape[0]
	cost = 0.
	for i in range(m):
		f_x_i = sigmoid(np.dot(w, X[i]) + b)
		loss = -y[i] * np.log(f_x_i) - (1 - y[i]) * np.log(1 - f_x_i)   # loss of sample i
		cost += loss
	cost = cost / m

	return cost
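
A small usage sketch on a made-up two-sample dataset (values are illustrative):

X = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([0, 1])
w = np.zeros(2)
b = 0.
print(compute_cost(X, y, w, b))   # ln(2) ≈ 0.693 with all-zero parameters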

.3 Gradient descent training model

def compute_gradient(X, y, w, b):
	m = X.shape[0]
	dj_dw = np.zeros(w.shape)
	dj_db = 0.

	for i in range(m):
		f_wb = sigmoid(np.dot(w, X[i]) + b)
		err = f_wb - y[i]        # prediction error of sample i

		dj_db += err
		dj_dw += err * X[i]

	dj_dw = dj_dw / m
	dj_db = dj_db / m

	return dj_dw, dj_db

import math

def gradient_descent(X, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters):

    m = len(X)

    J_history = []		# cost recorded after each iteration
    w_history = []		# parameters saved at a few checkpoints

    for i in range(num_iters):
        # compute_gradient returns (dj_dw, dj_db), so unpack in that order
        dj_dw, dj_db = gradient_function(X, y, w_in, b_in)

        w_in = w_in - alpha * dj_dw     # alpha is the learning rate
        b_in = b_in - alpha * dj_db

        if i < 100000:      # cap the history to avoid excessive memory use
            cost = cost_function(X, y, w_in, b_in)
            J_history.append(cost)

        if i % math.ceil(num_iters / 10) == 0 or i == (num_iters - 1):
            w_history.append(w_in)
            print(f"Iteration {i:4}: Cost {float(J_history[-1]):8.2f}")

    return w_in, b_in, J_history, w_history
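
A minimal usage sketch with a made-up dataset (the data, initial parameters, learning rate, and iteration count below are illustrative assumptions, not from the original post):

X_train = np.array([[0.5, 1.5], [1.0, 1.0], [1.5, 0.5], [3.0, 0.5], [2.0, 2.0], [1.0, 2.5]])
y_train = np.array([0, 0, 0, 1, 1, 1])
w_init = np.zeros(X_train.shape[1])
b_init = 0.

w_out, b_out, J_hist, w_hist = gradient_descent(
    X_train, y_train, w_init, b_init,
    compute_cost, compute_gradient,
    alpha=0.1, num_iters=1000)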

TensorFlow implementation

.1 Create neurons

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# three dense (fully connected) layers: 25 -> 15 -> 1 units, all with sigmoid activation
model = Sequential([
	Dense(units=25, activation="sigmoid"),
	Dense(units=15, activation="sigmoid"),
	Dense(units=1, activation="sigmoid")
])
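
The layers above do not declare an input dimension; Keras infers it from the first batch of data. To inspect the architecture before training, the model can be built explicitly with an assumed number of input features (400 here is purely illustrative):

model.build(input_shape=(None, 400))
model.summary()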

.2 Calculation of loss value

from tensorflow.keras.losses import BinaryCrossentropy

model.compile(
	loss=BinaryCrossentropy()
)
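
The call above only sets the loss, so Keras falls back to its default optimizer. A common choice is to specify one explicitly, e.g. Adam (the learning rate below is an illustrative value, not from the original post):

from tensorflow.keras.optimizers import Adam

model.compile(
	loss=BinaryCrossentropy(),
	optimizer=Adam(learning_rate=0.001)
)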

.3 Gradient descent training model

model.fit(X,y,epochs=100)
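
After fitting, the trained model outputs probabilities for new samples via model.predict; thresholding at 0.5 (a common but adjustable choice) turns them into class labels. X_new below is a placeholder for new feature rows:

p = model.predict(X_new)          # predicted probabilities, shape (m, 1)
yhat = (p >= 0.5).astype(int)     # 0/1 class labels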
