Table of contents
1. Common operations
2. Matters needing attention
3. Others
1. Common operations
The "nadrray class" and "tensor class" in python are both the embodiment of arrays in linear algebra. When learning, you can start from these aspects: common library functions, class object methods, class object attributes, operations between class objects, and acquisition of class object elements (slicing)
1.1: torch library function
pytorch library | numpy library
torch.tensor()  | np.array()
torch.ones()    | np.ones()
torch.zeros()   | np.zeros()
torch.arange()  | np.arange()
torch.randn()   | np.random.randn()
torch.exp()     | np.exp()
Note: these are some commonly used functions, and the dtype can be specified, e.g. dtype=torch.float32; float32 is the data type most commonly used in deep learning
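A minimal sketch of the creation functions from the table above, with an explicit dtype (the shapes and values here are arbitrary examples):

```python
import numpy as np
import torch

# Create tensors/arrays with an explicit dtype, as noted above.
t = torch.zeros(2, 3, dtype=torch.float32)  # 2x3 tensor of zeros, float32
a = np.arange(6, dtype=np.float32)          # [0., 1., 2., 3., 4., 5.]
r = torch.randn(2, 2)                       # samples from a standard normal

print(t.dtype)   # torch.float32
print(a.shape)   # (6,)
```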
1.2 torch class object method
Suppose x is a tensor object. Common methods are: x.reshape(), x.numel() (the number of elements in x), len(x) (the size of the first axis), x.mean(), x.sum(), and x.T (transpose)
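A quick sketch exercising the methods just listed (the example values are illustrative):

```python
import torch

x = torch.arange(6, dtype=torch.float32)  # tensor([0., 1., 2., 3., 4., 5.])
y = x.reshape(2, 3)       # reshape into 2 rows x 3 columns

print(x.numel())          # 6  -- number of elements
print(len(y))             # 2  -- size of the first axis
print(x.sum())            # tensor(15.)
print(x.mean())           # tensor(2.5000)
print(y.T.shape)          # torch.Size([3, 2]) -- transpose
```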
1.3 torch class object attributes
x.shape x.dtype (note: x.dim() is called as a method and returns the number of axes)
1.4: Operations between tensors
Addition (+), subtraction (-), element-wise multiplication (*), division (/), comparison (==; !=), dot product torch.dot(), matrix-vector product torch.mv(), matrix-matrix product torch.mm()
1.5 Acquisition of class object elements (slicing)
slice (with square brackets) x[ ]
1.6: Convert to numpy object
x.numpy()
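A short sketch of the conversion to NumPy. One detail worth knowing: for CPU tensors, x.numpy() and torch.from_numpy() share memory with the original, so in-place changes are visible on both sides.

```python
import torch

x = torch.ones(3)
n = x.numpy()             # ndarray sharing memory with x (CPU tensors)
t = torch.from_numpy(n)   # back to a tensor, also sharing memory

x += 1                    # in-place update is visible through n and t
print(n)                  # [2. 2. 2.]
```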
Summary: tensors are the main interface for storing and manipulating data in deep learning. The tensor interface provides a variety of functionality, including basic mathematical operations, broadcasting, indexing and slicing, memory saving, and conversion to other Python objects.
2. Matters needing attention
1: About ndarray object slicing
import numpy as np
num = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(num)
print(num[[0, 1], [0, 1]])
result:
[[1 2 3]
[4 5 6]
[7 8 9]]
[1 5]
This form of indexing in NumPy (integer-array, or "fancy", indexing) can look strange: num[[0, 1], [0, 1]] pairs up the two index lists, producing an array of the element in row 0, column 0 and the element in row 1, column 1.
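The pairing behavior described above can be checked against the explicit element-by-element form:

```python
import numpy as np

num = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Integer-array indexing pairs up the row and column lists:
picked = num[[0, 1], [0, 1]]              # elements (0,0) and (1,1)
same = np.array([num[0, 0], num[1, 1]])   # the equivalent explicit form

print(picked)   # [1 5]
```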
2. About the use of sum() method in numpy and pytorch
Sum: x.sum(axis=0) sums along axis 0 and produces an output vector. For a non-reducing sum, specify the parameter keepdims=True, which keeps the summed axis with size 1.
If you aggregate along axis k, the dimensionality of axis k is lost (unless keepdims=True).
The A.cumsum() function computes a running (cumulative) sum; it does not reduce the input tensor along any axis.
num.sum(axis=0)
result:
array([12, 15, 18])
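The reducing vs. non-reducing behavior described above can be sketched with the same num array:

```python
import numpy as np

num = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

s = num.sum(axis=0)                  # shape (3,): axis 0 is reduced away
k = num.sum(axis=0, keepdims=True)   # shape (1, 3): axis 0 kept with size 1
c = num.cumsum(axis=0)               # running sum; shape (3, 3), no axis lost

print(s)         # [12 15 18]
print(k.shape)   # (1, 3)
print(c.shape)   # (3, 3)
```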
3. Matrix multiplication
The element-wise multiplication of two matrices is called the Hadamard product (mathematical symbol ⊙); in PyTorch it is written A * B (or torch.mul(A, B)), while torch.dot() is the dot product of two vectors.
Key point: Python and MATLAB differ here. In Python (NumPy/PyTorch), * means element-wise multiplication, while in MATLAB * means matrix multiplication.
torch.mv() matrix-vector multiplication; torch.mm() matrix-matrix multiplication
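A small sketch contrasting the four products named above (the matrices are arbitrary examples):

```python
import torch

A = torch.tensor([[1., 2.], [3., 4.]])
B = torch.tensor([[5., 6.], [7., 8.]])
v = torch.tensor([1., 1.])

print(A * B)            # Hadamard (element-wise) product
print(torch.mv(A, v))   # matrix-vector product: tensor([3., 7.])
print(torch.mm(A, B))   # matrix-matrix product
print(torch.dot(v, v))  # vector dot product: tensor(2.)
```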
4. Common norms in deep learning
Common norms in deep learning include the L2 norm, the L1 norm, and the Frobenius norm.
The subscript 2 is often omitted for the L2 norm, i.e. ∥x∥ is equivalent to ∥x∥2. Method: torch.norm(x)
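A sketch of computing these norms with torch.norm (the vector and matrix here are illustrative):

```python
import torch

x = torch.tensor([3., -4.])
print(torch.norm(x))         # L2 norm: sqrt(9 + 16) = tensor(5.)
print(torch.abs(x).sum())    # L1 norm: |3| + |-4| = tensor(7.)

A = torch.ones(2, 3)
print(torch.norm(A))         # Frobenius norm: sqrt(6)
```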
5. Slicing problem
MATLAB: start:step:end (end inclusive)
Python: start:end:step (end exclusive)
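The Python side of the contrast above, on a throwaway list:

```python
# Python slicing is start:end:step, with end exclusive.
x = list(range(10))
print(x[1:8:2])   # [1, 3, 5, 7]
print(x[::-1])    # the whole list reversed
```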
3. Others
1. pd.get_dummies (one-hot encoding processing)
Idea: convert the data in pandas to a tensor via torch.tensor(DataFrame.values) (note that values is an attribute, not a method)
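A toy sketch of this idea; the DataFrame and its column are made up purely for illustration:

```python
import pandas as pd
import torch

# Hypothetical categorical column to one-hot encode.
df = pd.DataFrame({'color': ['red', 'blue', 'red']})
onehot = pd.get_dummies(df, dtype=float)  # one 0/1 column per category

# .values is an attribute, not a method.
x = torch.tensor(onehot.values)

print(sorted(onehot.columns))  # ['color_blue', 'color_red']
print(x.shape)                 # torch.Size([3, 2])
```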
2. Derivative, gradient, automatic derivation
The relationship between gradient and derivative?
Gradient: the gradient of a function f(x) with respect to an n-dimensional vector x is a vector containing its n partial derivatives.
Derivatives are divided into total derivatives and partial derivatives; a derivative is a scalar with only a sign (positive or negative) and no direction, whereas a gradient is a vector and has a direction.
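A minimal sketch of automatic differentiation in PyTorch, using f(x) = x·x, whose gradient is 2x:

```python
import torch

x = torch.arange(4, dtype=torch.float32, requires_grad=True)
y = torch.dot(x, x)   # scalar: 0 + 1 + 4 + 9 = 14
y.backward()          # autograd fills in x.grad

print(x.grad)         # tensor([0., 2., 4., 6.]) == 2 * x
```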
3. Basic knowledge of probability theory
from torch.distributions import multinomial
fair_probs = torch.ones([6]) / 6
counts = multinomial.Multinomial(1000, fair_probs).sample()  # simulate rolling a fair die 1000 times
Bayes' theorem. By construction, we have the multiplication rule P(A,B)=P(B∣A)P(A). By symmetry, this also holds as P(A,B)=P(A∣B)P(B). Assuming P(B)>0 and solving for the conditional, we get P(A∣B)=P(B∣A)P(A)/P(B).
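A numeric sketch of the rule, with made-up probabilities chosen only to make the arithmetic concrete:

```python
# Hypothetical values: P(A) prior, P(B|A) likelihood, P(B) evidence.
P_A = 0.3
P_B_given_A = 0.8
P_B = 0.5

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
P_A_given_B = P_B_given_A * P_A / P_B
print(P_A_given_B)   # 0.48
```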