The gods are silent-personal CSDN blog post directory
(Versions I use: Python 3.8.13; PyTorch 1.11.0)
To compute the size (L2 norm) of a tensor, the intuitive choice is torch.linalg.norm — but it turns out that tensors of different shapes give different answers (c is included for comparison; the main question is why a and b differ):
import torch
from torch import linalg as LA

a = torch.arange(8).float()   # 1-D vector: [0., 1., ..., 7.]
b = a.reshape(4, 2)           # 2-D matrix, shape (4, 2)
c = a.reshape(-1, 1)          # 2-D column matrix, shape (8, 1), for comparison
print(LA.norm(a, ord=2))
print(LA.norm(b, ord=2))
print(LA.norm(c, ord=2))
output:
tensor(11.8322)
tensor(11.8079)
tensor(11.8322)
I asked ChatGPT, which said it might be floating-point precision, but in the end it couldn't explain why (when will ChatGPT finally improve enough to replace me, so I don't have to hunt down answers to this kind of problem myself?). So I checked the official API documentation: torch.linalg.norm — PyTorch 2.0 documentation
torch.linalg.norm(A, ord=None, dim=None, keepdim=False, *, out=None, dtype=None) → Tensor
The ord parameter specifies which norm to compute. We want the L2 norm here, so we pass ord=2:
For a one-dimensional vector, the algorithm is clearly the familiar L2 norm, $\sqrt{\sum_i |x_i|^2}$; but for a two-dimensional matrix, ord=2 computes the largest singular value instead.
In fact, the 2-norm of a matrix equals its largest singular value. I don't know how to derive this and haven't studied it formally; you can refer to this blog post 1. In short, from that post we know that the largest singular value of $A$ is the square root of the largest eigenvalue of $A^TA$.
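This identity is easy to check numerically. A small sketch (not from the original post) using torch.linalg.svdvals and torch.linalg.eigvalsh on an arbitrary random matrix:

```python
import torch

torch.manual_seed(0)
A = torch.randn(5, 3)

# Largest singular value of A
sigma_max = torch.linalg.svdvals(A).max()

# Square root of the largest eigenvalue of A^T A
# (A^T A is symmetric, so eigvalsh is appropriate)
lam_max = torch.linalg.eigvalsh(A.T @ A).max()

# Both should match the matrix 2-norm, up to floating-point error
assert torch.allclose(sigma_max, lam_max.sqrt(), atol=1e-4)
assert torch.allclose(sigma_max, torch.linalg.norm(A, ord=2), atol=1e-4)
```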
But the code gives different results, so the natural guess is that these two calculation methods produce different values. Let's verify by computing each by hand:
import torch
from torch import linalg as LA
import math

a = torch.arange(8).float()
print(LA.norm(a, ord=2))
# Vector L2 norm by hand (the 0**2 term is omitted since it contributes nothing)
print(math.sqrt(1**2 + 2**2 + 3**2 + 4**2 + 5**2 + 6**2 + 7**2))

b = a.reshape(4, 2)
print(LA.norm(b, ord=2))
# Matrix 2-norm by hand: square root of the largest eigenvalue of b^T b.
# torch.eig is deprecated (see the warning in the output below); kept here
# to match the original run on PyTorch 1.11.
(evals, evecs) = torch.eig(torch.mm(b.T, b), eigenvectors=True)
print(torch.max(evals))
print(math.sqrt(torch.max(evals)))

c = a.reshape(-1, 1)
(evals, evecs) = torch.eig(torch.mm(c.T, c), eigenvectors=True)
print(torch.max(evals))
print(math.sqrt(torch.max(evals)))
output:
tensor(11.8322)
11.832159566199232
tensor(11.8079)
this_file.py:12: UserWarning: torch.eig is deprecated in favor of torch.linalg.eig and will be removed in a future PyTorch release.
torch.linalg.eig returns complex tensors of dtype cfloat or cdouble rather than real tensors mimicking complex tensors.
L, _ = torch.eig(A)
should be replaced with
L_complex = torch.linalg.eigvals(A)
and
L, V = torch.eig(A, eigenvectors=True)
should be replaced with
L_complex, V_complex = torch.linalg.eig(A) (Triggered internally at /opt/conda/conda-bld/pytorch_1646755853042/work/aten/src/ATen/native/BatchLinearAlgebra.cpp:2910.)
(evals,evecs) = torch.eig(torch.mm((b.T),b),eigenvectors=True)
tensor(139.4262)
11.807888200473563
tensor(140.)
11.832159566199232
So the difference is not really floating-point precision: the two calls compute genuinely different norms. For b, the eigenvalues of $B^TB$ are about 139.43 and 0.57; they sum to 140, and the L2 norm of the flattened tensor is $\sqrt{140}\approx 11.8322$ (the Frobenius norm), while the matrix 2-norm keeps only the largest eigenvalue, $\sqrt{139.43}\approx 11.8079$. For the column vector c, $C^TC$ is the 1×1 matrix $[140]$, so the two coincide.
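To see concretely where the two numbers come from (a small numerical check, not part of the original run): the squared Frobenius norm of $B$ equals the sum of all eigenvalues of $B^TB$, while the matrix 2-norm keeps only the largest one.

```python
import torch

a = torch.arange(8).float()
b = a.reshape(4, 2)

# Eigenvalues of b^T b (symmetric, so eigvalsh is appropriate)
evals = torch.linalg.eigvalsh(b.T @ b)
print(evals)                 # two eigenvalues, roughly 0.57 and 139.43
print(evals.sum())           # 140 = 0^2 + 1^2 + ... + 7^2
print(evals.max().sqrt())    # ~11.8079: the matrix 2-norm (spectral norm)
print(evals.sum().sqrt())    # ~11.8322: the Frobenius / flattened-L2 norm
```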
Solution: if you just want the standard element-wise L2 norm, flatten the tensor into a one-dimensional vector first.
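Concretely, a few equivalent ways to get the element-wise L2 norm of a 2-D tensor (a sketch; torch.linalg.vector_norm should be available from roughly PyTorch 1.10 onward — check your version):

```python
import torch
from torch import linalg as LA

a = torch.arange(8).float()
b = a.reshape(4, 2)

print(LA.norm(b.flatten(), ord=2))  # flatten first, as suggested above
print(LA.norm(b, ord='fro'))        # Frobenius norm: same value for a matrix
print(LA.norm(b))                   # ord=None defaults to Frobenius for 2-D input
print(LA.vector_norm(b, ord=2))     # vector_norm flattens all dims by default
```

All four lines print tensor(11.8322).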
Other online materials referenced while writing this article:
- Matrix multiplication in PyTorch: the functions mul, mm, mv, the @ operator and the * operator
- The maximum and minimum values of the eigenvalues - Zhihu
- Linear algebra in PyTorch - Jianshu