TensorFlow cross entropy functions: cross_entropy
Note: the logits argument of every TensorFlow cross entropy function below must be the raw input to softmax or sigmoid, never their output, because each function applies the sigmoid or softmax internally
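For instance, a minimal sketch (with illustrative values) of feeding raw scores versus already-activated probabilities:
import tensorflow as tf
import numpy as np

labels = np.array([[1.0, 0.0, 0.0]])
logits = np.array([[2.0, -1.0, 0.5]])  # raw, pre-softmax scores

right = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)  # correct
probs = tf.nn.softmax(logits)
wrong = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=probs)   # wrong: softmax applied twice

sess = tf.Session()
print(sess.run(right), sess.run(wrong))  # the two losses differ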
tf.nn.sigmoid_cross_entropy_with_logits(_sentinel=None,labels=None, logits=None, name=None)
arguments:
_sentinel: an internal parameter used only to force keyword arguments; do not pass it
logits: a tensor of type float32 or float64;
shape: [batch_size, num_classes]; a single sample has shape [num_classes]
labels: a tensor with the same type (float) and shape as logits
name: The name of the operation, optional
output:
loss, shape: [batch_size, num_classes]
Note:
It first passes the input logits through the sigmoid function and then computes the cross entropy, using a numerically stable formulation so that the result does not overflow
It is suitable when the categories are independent but not mutually exclusive: for example, a picture can contain both a dog and an elephant
The output is not a single number but the loss of each sample in a batch, so it is generally used with tf.reduce_mean(loss)
Calculation formula (with x = logits, z = labels):
loss = -z * log(sigmoid(x)) - (1 - z) * log(1 - sigmoid(x))
which TensorFlow evaluates in the numerically stable form max(x, 0) - x * z + log(1 + exp(-abs(x)))
Python program:
import tensorflow as tf
import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

# 5 samples, 3 classes; one sample can belong to several classes at the same time
y = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 0]])
logits = np.array([[12, 3, 2], [3, 10, 1], [1, 2, 5], [4, 6.5, 1.2], [3, 6, 1]])
y_pred = sigmoid(logits)
E1 = -y * np.log(y_pred) - (1 - y) * np.log(1 - y_pred)
print(E1)  # the result computed from the formula

sess = tf.Session()
y = np.array(y).astype(np.float64)  # labels must be a float type, here float64
E2 = sess.run(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))
print(E2)
The outputs E1 and E2 are identical
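To obtain the single scalar training loss mentioned in the note above, the per-element losses are usually averaged. A minimal continuation of the example, reusing sess, y, and logits:
mean_loss = sess.run(tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits)))
print(mean_loss)  # a single scalar, averaged over all samples and classes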
tf.nn.softmax_cross_entropy_with_logits(_sentinel=None, labels=None, logits=None, dim=-1, name=None)
arguments:
_sentinel: an internal parameter used only to force keyword arguments; do not pass it
logits: a tensor of type float32 or float64;
shape: [batch_size, num_classes]
labels: a tensor with the same type and shape as logits; each row is a valid probability distribution (sum(labels) = 1), typically one-hot (exactly one value in the vector is 1.0, the others 0.0)
name: The name of the operation, optional
output:
loss, shape: [batch_size]
Note:
It first passes the input logits through the softmax function and then computes the cross entropy, using a numerically stable formulation so that the result does not overflow
It is suitable when the categories are independent and mutually exclusive: a picture belongs to exactly one category, and cannot contain a dog and an elephant at the same time
The output is not a single number but the loss of each sample in a batch, so it is generally used with tf.reduce_mean(loss)
Calculation formula:
loss_i = -sum_j labels[i][j] * log(softmax(logits)[i][j])
Python program:
import tensorflow as tf
import numpy as np

def softmax(x):
    sum_raw = np.sum(np.exp(x), axis=-1)
    x1 = np.ones(np.shape(x))
    for i in range(np.shape(x)[0]):
        x1[i] = np.exp(x[i]) / sum_raw[i]
    return x1

y = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]])  # exactly one 1 per row
logits = np.array([[12, 3, 2], [3, 10, 1], [1, 2, 5], [4, 6.5, 1.2], [3, 6, 1]])
y_pred = softmax(logits)
E1 = -np.sum(y * np.log(y_pred), -1)
print(E1)

sess = tf.Session()
y = np.array(y).astype(np.float64)
E2 = sess.run(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
print(E2)
The outputs E1 and E2 are identical
tf.nn.sparse_softmax_cross_entropy_with_logits(_sentinel=None,labels=None,logits=None, name=None)
arguments:
_sentinel: an internal parameter used only to force keyword arguments; do not pass it
logits: a tensor of type float32 or float64;
shape: [batch_size, num_classes]
labels: shape is [batch_size]; labels[i] is an index in {0, 1, 2, ..., num_classes-1} of type int32 or int64
name: The name of the operation, optional
output:
loss, shape: [batch_size]
Note:
It first passes the input logits through the softmax function and then computes the cross entropy, using a numerically stable formulation so that the result does not overflow.
It is suitable when the categories are independent and mutually exclusive: a picture belongs to exactly one category, and cannot contain a dog and an elephant at the same time.
The output is not a single number but the loss of each sample in a batch, so it is generally used with tf.reduce_mean(loss).
Calculation formula:
Same as tf.nn.softmax_cross_entropy_with_logits(), except that labels are class indices instead of the one-hot vectors expected by tf.nn.softmax_cross_entropy_with_logits(); see the sketch below.
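A minimal sketch of this equivalence, reusing the 5-sample, 3-class data from the softmax example above (the index labels here are illustrative):
import tensorflow as tf
import numpy as np

logits = np.array([[12, 3, 2], [3, 10, 1], [1, 2, 5], [4, 6.5, 1.2], [3, 6, 1]])
labels_idx = np.array([0, 1, 2, 0, 1]).astype(np.int64)  # class indices, shape [batch_size]
labels_onehot = np.eye(3)[labels_idx]                    # the equivalent one-hot form

sess = tf.Session()
E1 = sess.run(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels_idx, logits=logits))
E2 = sess.run(tf.nn.softmax_cross_entropy_with_logits(labels=labels_onehot, logits=logits))
print(E1)  # shape [batch_size]
print(E2)  # identical values to E1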
tf.nn.weighted_cross_entropy_with_logits(labels,logits, pos_weight, name=None)
Computes sigmoid cross entropy as in sigmoid_cross_entropy_with_logits(), but with a weight pos_weight applied to positive samples
arguments:
logits: a tensor of type float32 or float64;
shape: [batch_size, num_classes]; a single sample has shape [num_classes]
labels: a tensor with the same type (float) and shape as logits
pos_weight: a coefficient that scales the loss contribution of positive samples
name: The name of the operation, optional
output:
loss, shape: [batch_size, num_classes]
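Calculation formula (with x = logits, z = labels, q = pos_weight):
loss = -q * z * log(sigmoid(x)) - (1 - z) * log(1 - sigmoid(x))
A minimal sketch following the pattern of the sigmoid example above; the pos_weight value is illustrative:
import tensorflow as tf
import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

y = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 0]]).astype(np.float64)
logits = np.array([[12, 3, 2], [3, 10, 1], [1, 2, 5], [4, 6.5, 1.2], [3, 6, 1]])
pos_weight = 2.0  # illustrative: weight positive samples twice as heavily

y_pred = sigmoid(logits)
E1 = -pos_weight * y * np.log(y_pred) - (1 - y) * np.log(1 - y_pred)
print(E1)  # the result computed from the formula

sess = tf.Session()
# positional args: the first parameter is named targets in older TF versions, labels in newer ones
E2 = sess.run(tf.nn.weighted_cross_entropy_with_logits(y, logits, pos_weight))
print(E2)  # identical values to E1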