2020-12-19 nn.CrossEntropyLoss()

Understanding nn.CrossEntropyLoss() by example:

Its specific role in PICA:

The following can be read as the loss for one row of the K*K PUI:

loss(x, cluster_index) = -x[cluster_index] + log(exp(x[0]) + exp(x[1]) + ... + exp(x[K-1]))

Here x is one row of the K*K PUI; cluster_index picks the corresponding element in that row; and the denominator of the underlying softmax is the sum over all elements of the row.
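As a minimal sketch of that formula (the names row_ce, pui_row, and cluster_index are illustrative, not taken from PICA's code), the per-row loss can be computed directly:

import torch

def row_ce(pui_row, cluster_index):
    # -x[class] + log(sum_j exp(x[j])): cross-entropy for one row of logits
    return -pui_row[cluster_index] + torch.logsumexp(pui_row, dim=0)

row = torch.tensor([-0.7715, -0.6205, -0.2562])  # one row; K = 3 for illustration
print(row_ce(row, 0))  # tensor(1.3447)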

CrossEntropyLoss()(input, target) — instantiate the criterion, then call it with the input logits and the target class indices.

1.

input:
import torch
import torch.nn as nn

entropy = nn.CrossEntropyLoss()
input = torch.tensor([[-0.7715, -0.6205, -0.2562],
                      [-0.7715, -0.6205, -0.2562],
                      [-0.7715, -0.6205, -0.2562]])
target = torch.tensor([0, 0, 0])
output = entropy(input, target)
print(output)
output: tensor(1.3447)

Each entry of target is the index of the element in the corresponding row of input that is treated as the true class; that element's value enters the loss.

(1)
-x[0] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.7715 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 1.3447
(2)
-x[0] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.7715 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 1.3447
(3)
-x[0] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.7715 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 1.3447

loss = [(1) + (2) + (3)] / 3 = 1.3447
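A quick check of the same number outside the module API (a sketch; F.cross_entropy is the functional form of the same loss):

import torch
import torch.nn.functional as F

x = torch.tensor([-0.7715, -0.6205, -0.2562])
manual = -x[0] + torch.log(torch.exp(x).sum())  # -x[0] + log(sum of exps)
print(manual)  # tensor(1.3447)
print(F.cross_entropy(x.repeat(3, 1), torch.tensor([0, 0, 0])))  # tensor(1.3447)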

2.

input:
entropy = nn.CrossEntropyLoss()
input = torch.tensor([[-0.7715, -0.6205, -0.2562],
                      [-0.7715, -0.6205, -0.2562],
                      [-0.7715, -0.6205, -0.2562]])
target = torch.tensor([1, 1, 1])
output = entropy(input, target)
print(output)
output: tensor(1.1937)

(1)
-x[1] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.6205 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 1.1937
(2)
-x[1] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.6205 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 1.1937
(3)
-x[1] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.6205 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 1.1937

loss = [(1) + (2) + (3)] / 3 = 1.1937
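Equivalently, each per-row value is just the negative log-softmax entry at the target index; a small sketch:

import torch
import torch.nn.functional as F

x = torch.tensor([[-0.7715, -0.6205, -0.2562]])
log_probs = F.log_softmax(x, dim=1)  # log of the softmax over the row
print(-log_probs[0, 1])  # tensor(1.1937): the loss when the target is 1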

3.

input:
entropy = nn.CrossEntropyLoss()
input = torch.tensor([[-0.7715, -0.6205, -0.2562],
                      [-0.7715, -0.6205, -0.2562],
                      [-0.7715, -0.6205, -0.2562]])
target = torch.tensor([2, 2, 2])
output = entropy(input, target)
print(output)
output: tensor(0.8294)

(1)
-x[2] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.2562 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 0.8294
(2)
-x[2] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.2562 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 0.8294
(3)
-x[2] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.2562 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 0.8294

loss = [(1) + (2) + (3)] / 3 = 0.8294
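The same value also falls out of the decomposition nn.CrossEntropyLoss applies internally, nn.LogSoftmax followed by nn.NLLLoss (a sketch):

import torch
import torch.nn as nn

x = torch.tensor([[-0.7715, -0.6205, -0.2562]] * 3)
target = torch.tensor([2, 2, 2])
log_softmax = nn.LogSoftmax(dim=1)
nll = nn.NLLLoss()  # picks -log_prob at the target index and averages
print(nll(log_softmax(x), target))  # tensor(0.8294)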

4.

input:
entropy = nn.CrossEntropyLoss()
input = torch.tensor([[-0.7715, -0.6205, -0.2562],
                      [-0.7715, -0.6205, -0.2562],
                      [-0.7715, -0.6205, -0.2562]])
target = torch.tensor([0, 1, 2])   # or: target = torch.arange(3)
output = entropy(input, target)
print(output)
output: tensor(1.1226)

(1)
-x[0] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.7715 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 1.3447
(2)
-x[1] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.6205 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 1.1937
(3)
-x[2] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) =
0.2562 + log(exp(-0.7715) + exp(-0.6205) + exp(-0.2562)) = 0.8294

loss = [(1) + (2) + (3)] / 3 = 1.1226
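Passing reduction='none' makes the averaging explicit: the three per-row losses from examples 1-3 come back individually, and their mean reproduces the 1.1226 above (a sketch):

import torch
import torch.nn as nn

x = torch.tensor([[-0.7715, -0.6205, -0.2562]] * 3)
target = torch.arange(3)  # [0, 1, 2]
per_row = nn.CrossEntropyLoss(reduction='none')(x, target)
print(per_row)         # tensor([1.3447, 1.1937, 0.8294])
print(per_row.mean())  # tensor(1.1226)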

Original post: blog.csdn.net/weixin_41807182/article/details/111412005