Deep learning review: new knowledge points

1. x * y is element-wise multiplication, also called the Hadamard product; x ** y is likewise element-wise (element-wise exponentiation).
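A minimal sketch of both operations (the values here are my own, chosen for illustration):

    import torch
    x = torch.tensor([1.0, 2.0, 3.0])
    y = torch.tensor([4.0, 5.0, 6.0])
    print(x * y)   # element-wise (Hadamard) product: tensor([ 4., 10., 18.])
    print(x ** y)  # element-wise exponentiation: tensor([  1.,  32., 729.])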

2. torch.cat with dim=0 stacks along rows (the result has many more rows); with dim=1 it stacks along columns (many more columns).
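A quick shape check (shapes chosen for illustration):

    import torch
    A = torch.zeros(2, 3)
    B = torch.ones(2, 3)
    print(torch.cat((A, B), dim=0).shape)  # torch.Size([4, 3])  -> more rows
    print(torch.cat((A, B), dim=1).shape)  # torch.Size([2, 6])  -> more columns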

3. print(z.sum())  # the sum of all values, reduced to a single scalar; print(z.numel())  # the total number of elements

        The mean z.mean() equals z.sum() / z.numel()
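A small check of the identity (example tensor is my own):

    import torch
    z = torch.arange(6.0).reshape(2, 3)   # tensor([[0., 1., 2.], [3., 4., 5.]])
    print(z.sum())                        # tensor(15.)
    print(z.numel())                      # 6
    print(z.mean(), z.sum() / z.numel())  # both tensor(2.5000)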

4. The matrix norm (Frobenius norm): the matrix is treated as if it were pulled out into a vector, and the length of that vector is computed.
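A sketch checking that the matrix norm matches the length of the flattened vector (values chosen for illustration):

    import torch
    A = torch.ones(4, 9)
    print(torch.norm(A))              # Frobenius norm: tensor(6.)
    print(torch.norm(A.reshape(-1)))  # length of the flattened vector, same value: tensor(6.)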

5. Matrix transpose: a matrix is symmetric when it equals its own transpose, A == A.T.
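For example, with a hand-picked symmetric matrix:

    import torch
    B = torch.tensor([[1, 2, 3],
                      [2, 0, 4],
                      [3, 4, 5]])
    print(B == B.T)  # all True, so B is symmetric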

6. sum() drops the dimension it reduces over; to keep it, pass the parameter keepdim=True.
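The shape difference in a minimal sketch:

    import torch
    A = torch.ones(2, 3)
    print(A.sum(dim=1).shape)                # torch.Size([2])     -> dimension 1 is dropped
    print(A.sum(dim=1, keepdim=True).shape)  # torch.Size([2, 1])  -> dimension 1 is kept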

7. A.cumsum, the cumulative sum, is interesting: it accumulates along a dimension without dropping it.
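For example (my own small matrix):

    import torch
    A = torch.arange(6).reshape(2, 3)   # tensor([[0, 1, 2], [3, 4, 5]])
    print(A.cumsum(dim=0))              # tensor([[0, 1, 2], [3, 5, 7]])  -> running sum down the rows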

8. The inner product torch.dot multiplies the vectors element by element and then sums, returning a scalar (see the sketch after point 10).

9. torch.mv is matrix-vector multiplication.

10. In torch a one-dimensional tensor is printed as a row vector; a true column vector has to be represented as a matrix (shape (n, 1)).
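A sketch tying points 8-10 together (shapes and values chosen for illustration):

    import torch
    x = torch.arange(4.0)      # 1-D tensor, printed as a row: tensor([0., 1., 2., 3.])
    y = torch.ones(4)
    print(torch.dot(x, y))     # element-wise product summed to a scalar: tensor(6.)
    A = torch.ones(3, 4)
    print(torch.mv(A, x))      # matrix-vector product, shape (3,): tensor([6., 6., 6.])
    col = x.reshape(-1, 1)     # a column vector has to be a (4, 1) matrix
    print(col.shape)           # torch.Size([4, 1])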

11. Subderivative: where a function is not differentiable at a point, the subderivative there can be any value between the left and right derivatives.
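For example, f(x) = |x| is not differentiable at x = 0; the left derivative is -1 and the right derivative is 1, so the subderivative there is any value in between:

    \partial\,|x|\,\big|_{x=0} = \{\, a \in \mathbb{R} : -1 \le a \le 1 \,\}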

12. <x, w> denotes the inner product of x and w.
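Written out (a standard identity, not from the original note):

    \langle x, w \rangle = \sum_i x_i w_i = x^\top w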

13.

x = torch.arange(4.0)

y1 = 2 * torch.dot(x, x)  # y1 is a scalar: x dotted with itself, then times 2, giving 28

y2 = x * x  # y2 is x multiplied by itself element-wise, the vector [0, 1, 4, 9]

y2 is computed for each element of the vector, so it is not a scalar. Each element of y2 depends only on the corresponding element of x, so the partial derivative of y2.sum() with respect to each x[i] is the same as if the other elements were ignored; summing first therefore lets backward() be called on a scalar.
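Putting point 13 together as a runnable sketch (requires_grad is added here so backward() works; not part of the original lines):

    import torch
    x = torch.arange(4.0, requires_grad=True)
    y1 = 2 * torch.dot(x, x)   # scalar: 2 * (0 + 1 + 4 + 9) = 28
    y1.backward()
    print(x.grad)              # dy1/dx = 4x: tensor([ 0.,  4.,  8., 12.])
    x.grad.zero_()             # clear the gradient before the next backward pass
    y2 = x * x                 # vector [0, 1, 4, 9], not a scalar
    y2.sum().backward()        # sum first so backward() receives a scalar
    print(x.grad)              # d(sum(x*x))/dx = 2x: tensor([0., 2., 4., 6.])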

14. With the print parameter end='\r', nothing appears and you have to press Enter once. This is because '\r' behaves differently on Linux than on Windows 10: without a newline the output is not flushed.
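One common workaround (my own addition, not from the original note) is to flush explicitly so the carriage return keeps overwriting the same line:

    import time
    for i in range(5):
        print(f"step {i}", end='\r', flush=True)  # flush so the text shows up without a newline
        time.sleep(0.2)
    print()  # final newline so the shell prompt does not overwrite the last line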

15. Detach from the computation graph: y can be detached to return a new variable u with the same value as y, but u discards any information about how y was computed in the graph. In other words, gradients do not flow backwards through u to x.

u = y.detach()  # the gradient history of y is gone

 so u is treated as a constant: if z = u * x, backpropagating z gives x.grad == u.
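A minimal sketch of the behaviour described in point 15 (variable names mirror the note):

    import torch
    x = torch.arange(4.0, requires_grad=True)
    y = x * x
    u = y.detach()          # same values as y, but cut off from the graph
    z = u * x
    z.sum().backward()
    print(x.grad == u)      # tensor([True, True, True, True]): u was treated as a constant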

 16. I still cannot quite follow the gradient calculation through Python control flow.
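A sketch of the kind of control flow meant here (my own illustration, not the original code): even with a loop and a branch, the function below is just its input times some constant, so autograd recovers that constant as the gradient.

    import torch

    def f(a):
        # the computation depends on a loop and a branch, but autograd still records every step
        b = a * 2
        while b.norm() < 1000:
            b = b * 2
        if b.sum() > 0:
            c = b
        else:
            c = 100 * b
        return c

    a = torch.randn(size=(), requires_grad=True)
    d = f(a)
    d.backward()
    print(a.grad == d / a)  # f is linear in a along the path taken, so the gradient is d / a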
