【DeepLearning】【PyTorch ()】PyTorch Loss Functions

This article summarizes the loss functions in PyTorch. It is organized mainly around the class-based loss functions in torch.nn.modules.loss and the function-based loss functions in torch.nn.functional.

0. Notes

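Note that many loss functions take two boolean parameters, size_average and reduce, which need some explanation. A loss function usually operates on a whole batch at once, so the unreduced loss has one entry per sample, typically a vector of shape (batch_size,).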

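If reduce = False, size_average is ignored and the loss is returned in its unreduced vector form;
If reduce = True, the loss is returned as a scalar:
If size_average = True, loss.mean() is returned;
If size_average = False, loss.sum() is returned.
For this reason, the explanations below generally set both parameters to False, which makes the original definition of each loss function easier to follow.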
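
The rules above can be checked with the newer reduction argument. This is a minimal sketch, assuming the usual correspondence: reduce=False matches reduction='none', reduce=True with size_average=True matches 'mean', and reduce=True with size_average=False matches 'sum'. The tensors are random values chosen only for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 3)  # illustrative input
y = torch.randn(4, 3)  # illustrative target

# reduction='none' keeps every loss element (no reduction applied)
per_element = F.l1_loss(x, y, reduction="none")

# reduction='mean' and reduction='sum' collapse the elements to a scalar
mean_loss = F.l1_loss(x, y, reduction="mean")
sum_loss = F.l1_loss(x, y, reduction="sum")

print(per_element.shape, mean_loss.shape, sum_loss.shape)
```

The two scalar results should match per_element.mean() and per_element.sum() respectively.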

1. L1Loss

Class wrapper:

CLASS torch.nn.L1Loss(size_average=None, reduce=None, reduction='mean')

Computes the mean absolute error (MAE) between the input x and the target y.
$\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = \left| x_n - y_n \right|,$
Here, $N$ is the batch size. If reduce is set to True, then
$\ell(x, y) = \begin{cases} \operatorname{mean}(L), & \text{if size\_average} = \text{True;}\\ \operatorname{sum}(L), & \text{if size\_average} = \text{False.} \end{cases}$
$x$ and $y$ are tensors of arbitrary shape with the same dimensions, each containing $n$ elements.
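
As a small worked example of the definition above, the following sketch computes $l_n = |x_n - y_n|$ by hand and compares it with nn.L1Loss. The tensor values are made up for illustration.

```python
import torch
import torch.nn as nn

x = torch.tensor([[1.0, -2.0], [0.5, 4.0]])
y = torch.tensor([[0.0, 0.0], [1.0, 1.0]])

# Element-wise absolute differences: [[1.0, 2.0], [0.5, 3.0]]
l = (x - y).abs()

manual_mean = l.mean()            # (1 + 2 + 0.5 + 3) / 4 = 1.625
builtin_mean = nn.L1Loss()(x, y)  # default reduction='mean'

print(manual_mean.item(), builtin_mean.item())
```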

1.1 Parameters

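size_average (bool, optional)
Deprecated (see reduction). By default, the losses are averaged over each loss element in the batch. Note that for some losses there are multiple elements per sample. If size_average is set to False, the losses are instead summed over each minibatch. Ignored when reduce is False. Default: True
(size_average determines whether the loss is averaged or summed. The default is to average.)
reduce (bool, optional)
Deprecated (see reduction). By default, the losses are averaged or summed over the loss elements in each minibatch, depending on size_average. When reduce is False, a loss value per batch element is returned instead and size_average is ignored. Default: True
(reduce determines whether a single loss is computed for the whole batch or one loss per sample in the batch. The default is a single loss for the whole batch.)
reduction (string, optional)
Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction is applied. 'mean': the sum of the output is divided by the number of elements in the output. 'sum': the output is summed. Note: size_average and reduce are being deprecated, and in the meantime, specifying either of those two arguments overrides reduction. Default: 'mean'
(reduction plays the same role as size_average + reduce. 'none' returns the loss of each sample in the minibatch. 'mean' returns one loss for the whole minibatch by averaging the losses of all samples. 'sum' returns one loss for the whole minibatch by summing the losses of all samples.)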

1.2 Shape

Input: $(N, *)$, where $*$ means any number of additional dimensions
Target: $(N, *)$, same shape as the input
Output: scalar. If reduce is set to False, then $(N, *)$, same shape as the input
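
The shapes listed above can be inspected directly. The sketch below uses reduction='none' as the modern stand-in for reduce=False (an assumption about the correspondence, not something stated in the parameter list), and also shows one possible way to get a per-sample vector of shape (N,) by averaging over the non-batch dimension.

```python
import torch
import torch.nn as nn

x = torch.randn(3, 5)  # Input:  (N, *) with N = 3
y = torch.randn(3, 5)  # Target: (N, *)

out_mean = nn.L1Loss(reduction="mean")(x, y)  # scalar (0-dim tensor)
out_none = nn.L1Loss(reduction="none")(x, y)  # same shape as the input

# One loss value per sample: average over every dimension except the batch one
per_sample = out_none.mean(dim=1)

print(out_mean.shape, out_none.shape, per_sample.shape)
```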

1.3 Example

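>>> import torch
>>> import torch.nn as nn
>>> loss = nn.L1Loss()  # default reduction='mean'
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.randn(3, 5)
>>> output = loss(input, target)  # scalar loss
>>> output.backward()  # populates input.grad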

Function wrapper

torch.nn.functional.l1_loss(input, target, size_average=None, reduce=None, reduction='mean') → Tensor
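
A minimal sketch of the functional form, assuming the default reduction='mean'. It also checks the gradient: for the mean absolute error, the derivative with respect to each $x_n$ is $\operatorname{sign}(x_n - y_n) / N$. The values are illustrative and chosen so that no difference is exactly zero.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)
y = torch.tensor([0.0, 0.0, 5.0])

loss = F.l1_loss(x, y)  # mean of |x - y| = (1 + 2 + 2) / 3
loss.backward()

# Analytic gradient of the mean absolute error
expected_grad = torch.sign(x.detach() - y) / x.numel()

print(loss.item(), x.grad, expected_grad)
```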

2. MSELoss

Class wrapper

CLASS torch.nn.MSELoss(size_average=None, reduce=None, reduction='mean')

$\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = \left( x_n - y_n \right)^2,$
where $N$ is the batch size. If reduce is True, then:
$\ell(x, y) = \begin{cases} \operatorname{mean}(L), & \text{if}\; \text{size\_average} = \text{True},\\ \operatorname{sum}(L), & \text{if}\; \text{size\_average} = \text{False}. \end{cases}$
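
The squared-error definition can be verified the same way. In this sketch the tensors are made-up values, and reduction='mean' / reduction='sum' stand in for size_average=True / size_average=False.

```python
import torch
import torch.nn as nn

x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([1.0, 0.0, 5.0])

# Element-wise squared differences: [0.0, 4.0, 4.0]
l = (x - y) ** 2

mse_mean = nn.MSELoss(reduction="mean")(x, y)  # 8 / 3
mse_sum = nn.MSELoss(reduction="sum")(x, y)    # 8

print(l, mse_mean.item(), mse_sum.item())
```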

Function wrapper

torch.nn.functional.mse_loss(input, target, size_average=None, reduce=None, reduction='mean') → Tensor

3. CrossEntropyLoss

Class wrapper

CLASS torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')

input has to be a Tensor of size either $(minibatch, C)$ or $(minibatch, C, d_1, d_2, ..., d_K)$ with $K \geq 2$ for the K-dimensional case (described later).

$\text{loss}(x, class) = -\log\left(\frac{\exp(x[class])}{\sum_j \exp(x[j])}\right) = -x[class] + \log\left(\sum_j \exp(x[j])\right)$

$\text{loss}(x, class) = weight[class] \left(-x[class] + \log\left(\sum_j \exp(x[j])\right)\right)$
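
The two formulas above can be reproduced by hand and compared with the built-in function. This is a sketch with illustrative logits, targets, and class weights (all hypothetical values). It uses torch.logsumexp for the $\log \sum_j \exp(x[j])$ term, and the last line relies on the behavior that, with weights and reduction='mean', the result is normalized by the sum of the weights of the target classes.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)               # (minibatch, C)
target = torch.tensor([0, 2, 1, 2])      # class index per sample
weight = torch.tensor([1.0, 2.0, 0.5])   # hypothetical per-class weights

# x[class] for each sample
picked = logits.gather(1, target.unsqueeze(1)).squeeze(1)

# loss = -x[class] + log(sum_j exp(x[j]))
unweighted = -picked + torch.logsumexp(logits, dim=1)

# loss = weight[class] * (-x[class] + log(sum_j exp(x[j])))
weighted = weight[target] * unweighted

builtin_unweighted = F.cross_entropy(logits, target, reduction="none")
builtin_weighted = F.cross_entropy(logits, target, weight=weight, reduction="none")

# Weighted mean: divide by the sum of the weights actually used
builtin_weighted_mean = F.cross_entropy(logits, target, weight=weight)
manual_weighted_mean = weighted.sum() / weight[target].sum()
```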

Function wrapper

torch.nn.functional.cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')
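
A short sketch of the functional form in the K-dimensional case with ignore_index, assuming K = 2 and made-up shapes and targets. Positions whose target equals ignore_index contribute zero loss and are excluded from the mean.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(2, 3, 2, 2)  # (minibatch, C, d_1, d_2) with C = 3
target = torch.tensor([[[0, 1], [-100, 2]],
                       [[2, -100], [1, 0]]])  # (minibatch, d_1, d_2)

# Unreduced loss: one value per target position, 0 where the target is ignored
loss_none = F.cross_entropy(logits, target, ignore_index=-100, reduction="none")

# Default reduction='mean' averages over the non-ignored positions only
loss_mean = F.cross_entropy(logits, target, ignore_index=-100)
valid = (target != -100).sum()

print(loss_none.shape, loss_mean.item(), (loss_none.sum() / valid).item())
```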