TensorFlow 2.0 study notes 2.4: Loss functions

Loss function:

The loss function (loss) measures the gap between the result y computed by forward propagation and the known standard answer y_.

The optimization goal of the neural network is to find a set of parameters such that the computed result y is as close as possible to the known standard answer y_, that is, such that the loss value between them is smallest.
There are three mainstream ways to compute the loss:
mean squared error (MSE)
custom loss
cross entropy (CE)

1. Mean squared error (MSE)

MSE is the mean of the squared differences between the forward-propagation result y and the known standard answer y_.
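As a minimal illustration (the numbers below are made up), the mean squared error can be computed directly with tf.reduce_mean and tf.square:

import tensorflow as tf

y_ = tf.constant([1.0, 2.0, 3.0])  # made-up standard answers
y = tf.constant([1.1, 1.9, 3.2])   # made-up predictions

loss_mse = tf.reduce_mean(tf.square(y_ - y))  # mean of the squared differences
print(loss_mse.numpy())  # approximately 0.02 = (0.1^2 + 0.1^2 + 0.2^2) / 3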
Example 1:
The code is shown in separate pieces below.

import tensorflow as tf
import numpy as np

SEED = 23455

rdm = np.random.RandomState(seed=SEED)  # random number generator producing values in [0, 1)
x = rdm.rand(32, 2)  # input features x: 32 rows and 2 columns, i.e. 32 pairs of random numbers x1 and x2 in [0, 1)
print(x)
y_ = [[x1 + x2 + (rdm.rand() / 10.0 - 0.05)] for (x1, x2) in x]  # standard answer y_ = x1 + x2 plus noise in [-0.05, 0.05)
print(y_)
x = tf.cast(x, dtype=tf.float32)  # convert x to float32
print(x)
w1 = tf.Variable(tf.random.normal([2, 1], stddev=1, seed=1))  # randomly initialize parameter w1 as a 2x1 matrix with standard deviation 1
print(w1)

First generate the random data and convert it to a tensor.
Output (screenshots in the original post): the printed x, y_, the cast x, and the initial w1.

epoch = 15000  # number of training iterations (not shown in this excerpt; 15000 is assumed)
lr = 0.002  # learning rate (not shown in this excerpt; 0.002 is assumed)

for epoch in range(epoch):
    with tf.GradientTape() as tape:
        y = tf.matmul(x, w1)  # forward-propagation result y: matrix x times matrix w1
        loss_mse = tf.reduce_mean(tf.square(y_ - y))  # mean squared error loss loss_mse

    grads = tape.gradient(loss_mse, w1)  # gradient of the loss with respect to the trainable parameter w1
    w1.assign_sub(lr * grads)  # update parameter w1

    if epoch % 500 == 0:  # print the current w1 every 500 iterations
        print("After %d training steps,w1 is " % (epoch))
        print(w1.numpy(), "\n")
print("Final w1 is: ", w1.numpy())

with tf.GradientTape() as tape records the forward computation using TensorFlow's automatic-differentiation API; tape.gradient(loss_mse, w1) then returns the gradient of the loss with respect to the trainable parameter w1.
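A minimal sketch of the tape mechanics (the variable and its value are made up):

import tensorflow as tf

w = tf.Variable(tf.constant(3.0))
with tf.GradientTape() as tape:  # the tape records operations involving w
    loss = tf.square(w)          # loss = w^2
grad = tape.gradient(loss, w)    # d(loss)/dw = 2w
print(grad.numpy())              # 6.0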
The training log (screenshot in the original post) shows w1 converging to approximately [1, 1], i.e. the fit is y ≈ x1 + x2.

2. Custom loss function:

The custom loss (illustrated in the original post) is piecewise: when the prediction y is larger than the standard answer y_, each extra unit loses COST; when y is smaller than y_, each missing unit loses PROFIT.

import tensorflow as tf
import numpy as np

SEED = 23455
COST = 1
PROFIT = 99

rdm = np.random.RandomState(SEED)
x = rdm.rand(32, 2)
y_ = [[x1 + x2 + (rdm.rand() / 10.0 - 0.05)] for (x1, x2) in x]  # noise: [0,1)/10 = [0,0.1); [0,0.1) - 0.05 = [-0.05,0.05)
x = tf.cast(x, dtype=tf.float32)

w1 = tf.Variable(tf.random.normal([2, 1], stddev=1, seed=1))

epoch = 10000
lr = 0.002

for epoch in range(epoch):
    with tf.GradientTape() as tape:
        y = tf.matmul(x, w1)
        loss = tf.reduce_sum(tf.where(tf.greater(y, y_), (y - y_) * COST, (y_ - y) * PROFIT))  # over-prediction loses COST per unit; under-prediction loses PROFIT per unit

    grads = tape.gradient(loss, w1)
    w1.assign_sub(lr * grads)

    if epoch % 500 == 0:
        print("After %d training steps,w1 is " % (epoch))
        print(w1.numpy(), "\n")
print("Final w1 is: ", w1.numpy())

# Custom loss function
# Yogurt cost: 1 yuan; yogurt profit: 99 yuan
# The cost is low and the profit is high, so we prefer to over-predict; the fitted coefficients come out greater than 1, biasing the prediction upward
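The key line is tf.where(tf.greater(y, y_), (y - y_) * COST, (y_ - y) * PROFIT): element-wise, it charges COST when the prediction is too high and PROFIT when it is too low. A small sketch with made-up numbers:

import tensorflow as tf

COST, PROFIT = 1.0, 99.0
y = tf.constant([1.2, 0.8])   # made-up predictions
y_ = tf.constant([1.0, 1.0])  # made-up standard answers

# first element over-predicts by 0.2 -> 0.2 * COST = 0.2
# second element under-predicts by 0.2 -> 0.2 * PROFIT = 19.8
loss = tf.reduce_sum(tf.where(tf.greater(y, y_), (y - y_) * COST, (y_ - y) * PROFIT))
print(loss.numpy())  # approximately 20.0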

The fitted function is sales y = 1.16*x1 + 1.12*x2; both coefficients are greater than 1 and larger than those obtained with mean squared error as the loss function, so the model deliberately predicts on the high side.

3. Cross entropy loss function:

Cross entropy measures the distance between two probability distributions: the larger the cross entropy, the farther apart the two distributions are; the smaller the cross entropy, the closer they are. It is computed as H(y_, y) = -Σ y_ · ln y.
For example, the standard answer's probability distribution y_ = (1, 0) has two elements, indicating a two-class problem. The first element is 1, meaning the probability of the first class is 100%; the second element is 0, meaning the probability of the second class is 0.
The neural network predicts two candidate probability distributions, y1 and y2, and the cross entropy with y_ tells which prediction is closer to the standard answer.
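As a short example (the concrete predictions y1 = (0.6, 0.4) and y2 = (0.8, 0.2) are assumed here, not taken from the original screenshots), the cross entropy of each candidate against y_ = (1, 0) can be computed with tf.losses.categorical_crossentropy:

import tensorflow as tf

loss_ce1 = tf.losses.categorical_crossentropy([1, 0], [0.6, 0.4])
loss_ce2 = tf.losses.categorical_crossentropy([1, 0], [0.8, 0.2])
print("loss_ce1:", loss_ce1)  # -ln(0.6) ≈ 0.511
print("loss_ce2:", loss_ce2)  # -ln(0.8) ≈ 0.223

Since loss_ce2 is smaller, the prediction (0.8, 0.2) is closer to the standard answer (1, 0).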
When solving classification problems, the network output is usually passed through the softmax function first so that it conforms to a probability distribution, and the cross-entropy loss is computed afterwards.
(Related post: understand the softmax function in one minute (super simple).)
TensorFlow provides a function that computes the softmax probability distribution and the cross entropy in one step:
tf.nn.softmax_cross_entropy_with_logits(y_, y)
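The screenshot from the original post is not reproduced here; the following sketch (with made-up one-hot labels y_ and raw outputs y) shows the comparison it illustrated, i.e. that the one-step call gives approximately the same result as applying softmax and then the cross entropy:

import tensorflow as tf
import numpy as np

y_ = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.float32)    # made-up one-hot labels
y = np.array([[12, 3, 2], [3, 10, 1], [1, 2, 5]], dtype=np.float32)   # made-up raw network outputs (logits)

# two steps: softmax first, then cross entropy
y_pro = tf.nn.softmax(y)
loss_ce1 = tf.losses.categorical_crossentropy(y_, y_pro)

# one step: softmax and cross entropy combined
loss_ce2 = tf.nn.softmax_cross_entropy_with_logits(y_, y)

print("step-by-step result:\n", loss_ce1.numpy())
print("combined result:\n", loss_ce2.numpy())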
As this example shows, the single line computing loss_ce2 can replace the two lines computing y_pro and loss_ce1: the probability distribution and the cross entropy are calculated in one step.

Origin blog.csdn.net/weixin_44145452/article/details/112995057