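Linear Model
Preliminaries
Purpose
Studying the advanced built-in models is a good way to improve your own modeling skills. The experts can make this model blossom into something elaborate, and I want to be someone who can do that too!
The problem a linear model solves
$Y = AX + b$
The goal is to learn approximations of A and b. (Please remember this formula, and trust that you can hold on to it for at least five minutes!)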
1 Full source code
class Linear(Module):
    __constants__ = ['in_features', 'out_features']
    in_features: int
    out_features: int
    weight: Tensor

    def __init__(self, in_features: int, out_features: int, bias: bool = True) -> None:
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self) -> None:
        init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        if self.bias is not None:
            fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
            bound = 1 / math.sqrt(fan_in)
            init.uniform_(self.bias, -bound, bound)

    def forward(self, input: Tensor) -> Tensor:
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self) -> str:
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )
####### torch.nn.functional.py
def linear(input, weight, bias=None):
    tens_ops = (input, weight)
    if not torch.jit.is_scripting():
        if any([type(t) is not Tensor for t in tens_ops]) and has_torch_function(tens_ops):
            return handle_torch_function(linear, tens_ops, input, weight, bias=bias)
    if input.dim() == 2 and bias is not None:
        # fused op is marginally faster
        ret = torch.addmm(bias, input, weight.t())
    else:
        output = input.matmul(weight.t())
        if bias is not None:
            output += bias
        ret = output
    return ret
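2 Background knowledge
From what we know about nn.Module, the most important parts of a model are the implementations of its __init__() and forward() methods:
Variables
- in_features: int, the dimension of the input variable X
- out_features: int, the dimension of the output variable Y
- weight: Tensor, the linear transformation (matrix) A, stored with shape (out_features, in_features)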
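To make these shapes concrete, here is a minimal sketch, assuming PyTorch is installed. The sizes 4 and 3 and the variable names are illustration values chosen here, not taken from the source above.

```python
import torch
from torch import nn

# Hypothetical sizes chosen only for illustration.
layer = nn.Linear(in_features=4, out_features=3)

# weight is stored as (out_features, in_features); bias as (out_features,).
weight_shape = tuple(layer.weight.shape)
bias_shape = tuple(layer.bias.shape)

# Both are nn.Parameter instances, so the module registers them as learnable.
param_names = [name for name, _ in layer.named_parameters()]

# With bias=False the attribute still exists, but it is None.
no_bias = nn.Linear(4, 3, bias=False)

print(weight_shape, bias_shape, param_names, no_bias.bias)
```

Note that the weight is stored "transposed" relative to the input: one row per output feature.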
3 The forward() method
def forward(self, input: Tensor) -> Tensor:
    return F.linear(input, self.weight, self.bias)
This method performs the linear model's computation. It calls the linear function from torch.nn.functional (functional.py):
The linear() function
def linear(input, weight, bias=None):
    tens_ops = (input, weight)
    if not torch.jit.is_scripting():
        if any([type(t) is not Tensor for t in tens_ops]) and has_torch_function(tens_ops):
            return handle_torch_function(linear, tens_ops, input, weight, bias=bias)
    if input.dim() == 2 and bias is not None:
        # fused op is marginally faster
        ret = torch.addmm(bias, input, weight.t())
    else:
        output = input.matmul(weight.t())
        if bias is not None:
            output += bias
        ret = output
    return ret
I trust you still remember it:
$Y = AX + b$
input
The X in the model
weight
The A in the model
bias
The b in the model
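One detail worth spelling out: the formula is written for a single column vector, while the code works on a batch whose rows are samples. A sketch of the relationship follows, where the batch size $N$, the sample index $i$, and the labels "in" and "out" (short for in_features and out_features) are symbols introduced here for illustration:

```latex
\begin{aligned}
y_i &= A x_i + b,
  & x_i &\in \mathbb{R}^{\text{in}},\;
    A \in \mathbb{R}^{\text{out} \times \text{in}},\;
    b \in \mathbb{R}^{\text{out}} \\
Y &= X A^{\top} + b,
  & X &\in \mathbb{R}^{N \times \text{in}},\;
    Y \in \mathbb{R}^{N \times \text{out}}
\end{aligned}
```

Row $i$ of $Y$ is $(A x_i + b)^{\top}$, with $b$ broadcast across the rows. This is why the code multiplies input by weight.t() rather than by weight itself.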
Core code
if input.dim() == 2 and bias is not None:
    ret = torch.addmm(bias, input, weight.t())
else:
    output = input.matmul(weight.t())
    if bias is not None:
        output += bias
    ret = output
return ret
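First, when input is two-dimensional and a bias is present, the formula can be computed in a single fused call, torch.addmm(bias, input, weight.t()), which returns bias + input @ weight.t() and is marginally faster. Running only the else branch would also give the correct result.
Then comes the else branch:
input.matmul(weight.t()) computes the product $XA^{\top}$, which is the batched form of $AX$ with one sample per row of $X$. The code then checks the bias: if it is None, the product is returned as is; otherwise $XA^{\top} + b$ is computed and returned. Between them, the two branches cover the whole model.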
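The claim that both branches give the same result can be checked directly. The following is a small sketch, assuming PyTorch is installed; the tensor sizes and variable names are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # reproducible random values

# Hypothetical sizes: batch of 5 samples, 4 input features, 3 output features.
x = torch.randn(5, 4)
weight = torch.randn(3, 4)  # (out_features, in_features), like Linear.weight
bias = torch.randn(3)

# Fused path: bias + x @ weight.T in a single call.
fused = torch.addmm(bias, x, weight.t())

# General path: matrix product first, then add the bias.
general = x.matmul(weight.t()) + bias

# Compare both against each other and against the library function.
same_2d = torch.allclose(fused, general, atol=1e-5)
same_as_F = torch.allclose(F.linear(x, weight, bias), general, atol=1e-5)

# A 3-D input fails the dim() == 2 check, so only the general path applies;
# matmul broadcasts over the leading dimension.
x3 = torch.randn(2, 5, 4)
out3_shape = tuple((x3.matmul(weight.t()) + bias).shape)

print(same_2d, same_as_F, out3_shape)
```

The comparison uses a small tolerance because the fused and unfused paths may round floating-point sums slightly differently.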
Initialization method
def __init__(self, in_features: int, out_features: int, bias: bool = True) -> None:
    super(Linear, self).__init__()
    self.in_features = in_features
    self.out_features = out_features
    self.weight = Parameter(torch.Tensor(out_features, in_features))
    if bias:
        self.bias = Parameter(torch.Tensor(out_features))
    else:
        self.register_parameter('bias', None)
    self.reset_parameters()
- Variable assignment
self.in_features = in_features
self.out_features = out_features
self.weight = Parameter(torch.Tensor(out_features, in_features))
if bias:
    self.bias = Parameter(torch.Tensor(out_features))
We only analyze this line:
self.weight = Parameter(torch.Tensor(out_features, in_features))
It creates an $\text{out\_features} \times \text{in\_features}$ matrix (a two-dimensional Tensor) and wraps it in the Parameter type (a subclass of Tensor) so that it becomes a learnable parameter. torch.Tensor(...) only allocates the memory and leaves the values uninitialized, which is why reset_parameters() is called afterwards.
- self.reset_parameters()
Definition:
def reset_parameters(self) -> None:
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
    if self.bias is not None:
        fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
        bound = 1 / math.sqrt(fan_in)
        init.uniform_(self.bias, -bound, bound)
It calls kaiming_uniform_, which initializes the parameters from a Kaiming uniform distribution (calling it again later resets them):
def kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu'):
    fan = _calculate_correct_fan(tensor, mode)
    gain = calculate_gain(nonlinearity, a)
    std = gain / math.sqrt(fan)
    bound = math.sqrt(3.0) * std  # Calculate uniform bounds from standard deviation
    with torch.no_grad():
        return tensor.uniform_(-bound, bound)
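It may help to work the numbers through for the a=math.sqrt(5) choice. The sketch below uses only the standard math module and re-derives the bounds by hand. weight_bound and bias_bound are hypothetical helper names introduced here, and the gain formula sqrt(2 / (1 + a^2)) for leaky_relu is an assumption about calculate_gain, whose source is not quoted in this post.

```python
import math


def weight_bound(fan_in: int, a: float = math.sqrt(5)) -> float:
    """Uniform bound for the weight, following the kaiming_uniform_ steps."""
    gain = math.sqrt(2.0 / (1.0 + a ** 2))  # leaky_relu gain (assumed formula)
    std = gain / math.sqrt(fan_in)
    return math.sqrt(3.0) * std


def bias_bound(fan_in: int) -> float:
    """Uniform bound for the bias, as computed in reset_parameters."""
    return 1.0 / math.sqrt(fan_in)


# With a = sqrt(5): gain = sqrt(1/3), so sqrt(3) * gain = 1 and both
# bounds reduce to 1 / sqrt(fan_in).
fan_in = 16  # hypothetical in_features
print(weight_bound(fan_in), bias_bound(fan_in))
```

Under that assumption, the weight and the bias are both drawn uniformly from the same interval, from -1/sqrt(in_features) to 1/sqrt(in_features).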