torch.linalg.norm gives different results on tensors of different dimensions: exploring the reasons and a solution

(Versions I use: Python 3.8.13; PyTorch 1.11.0)

The L2 norm measures the size of a tensor. The obvious choice is torch.linalg.norm, but it turns out that tensors of different dimensions give different answers
(c is only there for comparison; I mainly care about the difference between a and b):

import torch
from torch import linalg as LA

a = torch.arange(8).float()   # 1-D vector of length 8
b = a.reshape(4, 2)           # 2-D matrix, shape (4, 2)
c = a.reshape(-1, 1)          # 2-D matrix, shape (8, 1)
print(LA.norm(a, ord=2))
print(LA.norm(b, ord=2))
print(LA.norm(c, ord=2))

output:

tensor(11.8322)
tensor(11.8079)
tensor(11.8322)

I asked ChatGPT, which said it might be due to floating-point precision, but it never explained why (when will ChatGPT hurry up and replace me, so I don't have to search for answers to this kind of problem myself?). So I checked the official API documentation: torch.linalg.norm — PyTorch 2.0 documentation

torch.linalg.norm(A, ord=None, dim=None, keepdim=False, *, out=None, dtype=None) → Tensor

[Screenshot of the torch.linalg.norm parameter documentation]

The ord parameter specifies which norm to compute. Since we want the L2 norm here, we pass ord=2:
[Screenshots of the documentation table describing how ord is interpreted for matrices vs. vectors]

For a one-dimensional vector, the computation is indeed the familiar L2 norm $\sqrt{\sum_i |x_i|^2}$; but for a two-dimensional matrix, what is computed is the largest singular value.
In fact, the 2-norm of a matrix equals its largest singular value. I don't know how to derive this and haven't studied it, but you can refer to this blog post 1. In short, that post tells us that the largest singular value of $A$ is the square root of the largest eigenvalue of $A^TA$.
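To see this directly, here is a quick cross-check of my own (not from the original post): torch.linalg.svdvals returns the singular values, and its largest entry matches what LA.norm(b, ord=2) reported above.

import torch
from torch import linalg as LA

b = torch.arange(8).float().reshape(4, 2)

svals = LA.svdvals(b)        # singular values of b, in descending order
print(svals)                 # roughly tensor([11.8079, 0.7576])
print(LA.norm(b, ord=2))     # tensor(11.8079), i.e. the largest singular value
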
But the results produced by the code differ, so the natural guess is that these two calculation methods give different answers. Let's verify by hand:

import torch
from torch import linalg as LA

import math

a = torch.arange(8).float()
print(LA.norm(a, ord=2))                                # vector 2-norm
print(math.sqrt(1**2+2**2+3**2+4**2+5**2+6**2+7**2))    # by hand (0**2 contributes nothing)

b = a.reshape(4, 2)
print(LA.norm(b, ord=2))                                # matrix 2-norm = largest singular value
# largest singular value = sqrt of the largest eigenvalue of b^T b
# (torch.eig is deprecated; see the warning in the output below)
(evals,evecs) = torch.eig(torch.mm((b.T),b),eigenvectors=True)
print(torch.max(evals))
print(math.sqrt(torch.max(evals)))

c = a.reshape(-1, 1)
(evals,evecs) = torch.eig(torch.mm((c.T),c),eigenvectors=True)
print(torch.max(evals))
print(math.sqrt(torch.max(evals)))

output:

tensor(11.8322)
11.832159566199232
tensor(11.8079)
this_file.py:12: UserWarning: torch.eig is deprecated in favor of torch.linalg.eig and will be removed in a future PyTorch release.
torch.linalg.eig returns complex tensors of dtype cfloat or cdouble rather than real tensors mimicking complex tensors.
L, _ = torch.eig(A)
should be replaced with
L_complex = torch.linalg.eigvals(A)
and
L, V = torch.eig(A, eigenvectors=True)
should be replaced with
L_complex, V_complex = torch.linalg.eig(A) (Triggered internally at  /opt/conda/conda-bld/pytorch_1646755853042/work/aten/src/ATen/native/BatchLinearAlgebra.cpp:2910.)
  (evals,evecs) = torch.eig(torch.mm((b.T),b),eigenvectors=True)
tensor(139.4262)
11.807888200473563
tensor(140.)
11.832159566199232

This makes the source of the difference clear: for the matrix b, LA.norm(b, ord=2) returns the largest singular value (≈ 11.8079), while the flattened vector a gives the ordinary vector 2-norm (≈ 11.8322). The two definitions simply compute different things; it is not a floating-point precision issue.
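Incidentally, torch.eig is deprecated (hence the warning in the output above). A rough modern equivalent of the same check, adapted by me using torch.linalg.eigvals as the warning suggests (not the original code): eigvals returns a complex tensor, so take the real part before the square root.

import torch
from torch import linalg as LA

b = torch.arange(8).float().reshape(4, 2)

evals = LA.eigvals(b.T @ b)   # complex tensor; b.T @ b is symmetric, so the imaginary parts are 0
lam_max = evals.real.max()
print(lam_max)                # roughly tensor(139.4262)
print(lam_max.sqrt())         # roughly tensor(11.8079), matching LA.norm(b, ord=2)

Since b.T @ b is real and symmetric, torch.linalg.eigvalsh would also work and returns real eigenvalues directly.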

Solution: if all you want is the standard element-wise L2 norm, flatten the tensor to a one-dimensional vector before calling norm.
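A minimal sketch of this fix (my own code, not from the original post). Flattening first gives the element-wise L2 norm; equivalently, for a matrix the default (Frobenius) norm is the same value, and torch.linalg.vector_norm (available in recent PyTorch versions) flattens by default:

import torch
from torch import linalg as LA

a = torch.arange(8).float()
b = a.reshape(4, 2)

print(LA.norm(b.flatten(), ord=2))  # tensor(11.8322), same as LA.norm(a, ord=2)
print(LA.norm(b))                   # default ord = Frobenius norm for matrices: tensor(11.8322)
print(LA.vector_norm(b))            # treats the input as flattened: tensor(11.8322)

Any of these gives the same value regardless of the tensor's shape.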

Other online materials referenced while writing this article:

  1. Matrix multiplication in PyTorch: the functions mul, mm, mv, and the @ and * operators
  2. The maximum and minimum values of eigenvalues - Zhihu
  3. Linear algebra in PyTorch - Jianshu

  1. Why the 2-norm of a matrix equals its largest singular value - EpsAvlc's blog ↩︎

Original article: blog.csdn.net/PolarisRisingWar/article/details/131044187