1. Introduction to NumPy

NumPy is a scientific computing library implemented in Python. It includes: 1. a powerful N-dimensional array object (ndarray); 2. a relatively mature library of (broadcasting) functions; 3. tools for integrating C/C++ and Fortran code; 4. practical linear algebra, Fourier transform, and random number generation functions. NumPy is often used together with SciPy, a package that adds sparse matrix operations among other things.
NumPy (Numeric Python) provides many advanced numerical programming tools, such as matrix data types, vector processing, and sophisticated arithmetic libraries. It is built for serious number crunching. It is used by many large financial companies as well as core scientific computing organizations such as Lawrence Livermore, and NASA uses it for some tasks that would otherwise have been done in C++, Fortran or Matlab.

---"Baidu Encyclopedia"

2. NumPy knowledge points

1. Install the NumPy package and import it

pip install numpy       # run this in a terminal
import numpy as np      # run this in Python

2. Create an array

a = np.array([1, 2, 3, 4, 5])
b = np.arange(0, 6, 1)
c = np.random.random((3, 3))
d = np.random.randint(0, 9, size=(3, 3))
print("a:", a)
print("b:", b)
print("c:", c)
print("d:", d)

① np.array creates an array directly from the data you pass in.

② np.arange creates an array from a start value, an end value (not included in the result), and a step size.

③ np.random.random creates an array of the given shape whose elements are random floats in the interval [0, 1).

④ np.random.randint creates an integer array. You specify the interval the elements are drawn from (the upper bound is excluded) and the shape of the array.

a: [1 2 3 4 5]
b: [0 1 2 3 4 5]
c: [[0.10739085 0.99541616 0.76174493]
 [0.30140398 0.87467374 0.30959958]
 [0.23803194 0.47848497 0.38842102]]
d: [[2 0 4]
 [7 0 7]
 [3 4 5]]
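
As a small sketch of the points above, the snippet below checks that np.arange leaves out the end value and that both random functions draw from half-open intervals. The seed value 0 and the variable name evens are illustrative choices, not part of the original example.

```python
import numpy as np

# np.arange(start, stop, step): the stop value itself is never included
evens = np.arange(0, 10, 2)
print(evens)  # [0 2 4 6 8]

# Seeding the global generator makes the random results repeatable
np.random.seed(0)  # the seed value 0 is an arbitrary choice
c = np.random.random((3, 3))              # floats in [0, 1)
d = np.random.randint(0, 9, size=(3, 3))  # integers in [0, 9), so 9 never appears

print(c.shape)                       # (3, 3)
print(c.min() >= 0.0, c.max() < 1.0)
print(d.min() >= 0, d.max() <= 8)
```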

3. Special functions

zeros = np.zeros((1, 4))
ones = np.ones((2, 2))
full = np.full((2, 3), 9)
eye = np.eye(3)

① np.zeros generates an array of the given shape in which every element is 0

② np.ones generates an array of the given shape in which every element is 1

③ np.full takes the shape of the array and the value used to fill it

④ np.eye generates an N×N array with 1 on the diagonal and 0 everywhere else

zeros: [[0. 0. 0. 0.]]
ones: [[1. 1.]
 [1. 1.]]
full: [[9 9 9]
 [9 9 9]]
eye: [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Note: All elements of an array share one data type, for example integer or floating point.
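
A quick way to see this rule in action, as a minimal sketch with made-up values: when integers and floats are mixed in one np.array call, NumPy picks a single type that can hold all of them.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])   # one float is enough to make every element a float

print(ints.dtype.kind)   # i  (signed integer; the exact width depends on the platform)
print(mixed.dtype)       # float64
print(mixed.tolist())    # [1.0, 2.5, 3.0]
```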

4. Data type

Data type	Description	Type code
bool	Boolean (True or False), stored in one byte	? (or b1)
int8	Integer, 8-bit (-128 ~ 127)	i1
int16	Integer, 16-bit (-32768 ~ 32767)	i2
int32	Integer, 32-bit (-2147483648 ~ 2147483647)	i4
int64	Integer, 64-bit (-9223372036854775808 ~ 9223372036854775807)	i8
uint8	Unsigned integer, 0 to 255	u1
uint16	Unsigned integer, 0 to 65535	u2
uint32	Unsigned integer, 0 to 2**32 - 1	u4
uint64	Unsigned integer, 0 to 2**64 - 1	u8
float16	Half-precision floating-point number: 16 bits, 1 bit for sign, 5 bits for exponent, 10 bits for mantissa	f2
float32	Single-precision floating-point number: 32 bits, 1 bit for sign, 8 bits for exponent, 23 bits for mantissa	f4
float64	Double-precision floating-point number: 64 bits, 1 bit for sign, 11 bits for exponent, 52 bits for mantissa	f8
complex64	Complex number whose real and imaginary parts are each a 32-bit floating-point number	c8
complex128	Complex number whose real and imaginary parts are each a 64-bit floating-point number	c16
object_	Python object	O
string_	Byte string (named bytes_ in NumPy 2.0 and later)	S
unicode_	Unicode string (named str_ in NumPy 2.0 and later)	U
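
The short type codes can be passed anywhere a dtype is expected. The sketch below compares a few codes with the corresponding NumPy types and uses np.iinfo to look up integer ranges. The particular codes chosen are just examples.

```python
import numpy as np

# A type code string and the matching NumPy type describe the same dtype
print(np.dtype('i2') == np.int16)     # True
print(np.dtype('u1') == np.uint8)     # True
print(np.dtype('f8') == np.float64)   # True
print(np.dtype('?') == np.bool_)      # True

# np.iinfo reports the value range of an integer type
info = np.iinfo(np.int8)
print(info.min, info.max)             # -128 127
print(np.iinfo(np.uint16).max)        # 65535
```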

① Specify the data type when creating an array

zeros = np.zeros((1, 4), dtype='int16')
ones = np.ones((2, 2), dtype='float32')

zero: [[0 0 0 0]]
ones: [[1. 1.]
 [1. 1.]]

② Query the data type

full = np.full((2, 3), 9, dtype='int8')
eye = np.eye(3, dtype='float16')

print(full.dtype)
print(eye.dtype)

int8
float16

③ Modify the data type

full = np.full((2, 3), 9, dtype='int8')
print(full.dtype)
full = full.astype('int16')
print(full.dtype)

int8
int16

Note:

1. NumPy is written in C, and its data types are modeled on those of C.

2. Choosing a suitable data type for each kind of data can save a lot of memory.
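
To make the memory saving concrete, here is a minimal sketch (array length and values are arbitrary) comparing the same number of elements stored as int8 and as int64. It also shows the price of a narrow type: values outside its range wrap around silently when converted with astype.

```python
import numpy as np

small = np.zeros(1000, dtype='int8')
large = np.zeros(1000, dtype='int64')
print(small.nbytes)   # 1000
print(large.nbytes)   # 8000

# A narrower type also has a narrower range
values = np.array([100, 200, 300], dtype='int32')
print(values.astype('int16'))  # [100 200 300]   every value fits
print(values.astype('int8'))   # [100 -56  44]   200 and 300 do not fit and wrap around
```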

5. Multidimensional array

(In plain language: count the '[' characters in front of the first number, and that is the number of dimensions of the array.)

① Define 1-dimensional, 2-dimensional, 3-dimensional arrays

arr1 = np.array([1,2,3])             # 1-dimensional array
arr2 = np.array([[1,2,3],[4,5,6]])   # 2-dimensional array
arr3 = np.array([                    # 3-dimensional array
    [
        [1,2,3],
        [4,5,6]
    ],
    [
        [7,8,9],
        [10,11,12]
    ]
])

② Query the array shape

print(arr1.shape)
print(arr2.shape)
print(arr3.shape)
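
For reference, a self-contained sketch of what these queries print for the three arrays defined above. The ndim attribute is added here for illustration; it gives the number of dimensions directly.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
arr3 = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [10, 11, 12]]])

for arr in (arr1, arr2, arr3):
    print(arr.ndim, arr.shape)
# 1 (3,)
# 2 (2, 3)
# 3 (2, 2, 3)
```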

③ Modify the array shape

a1 = np.array([            # create a new array a1
    [
        [1,2,3],
        [4,5,6]
    ],
    [
        [7,8,9],
        [10,11,12]
    ]
])
a2 = a1.reshape((2,6))     # change the shape to 2*6; the number of elements must stay the same
a3 = a2.flatten()          # flatten into a 1-dimensional vector

④ View the number of elements and the occupied memory

print(a1.size)               
print(a1.itemsize)
print(a1.itemsize * a1.size)

Where:

1. a1.size is the number of elements in the array

2. a1.itemsize is the memory occupied by a single element, in bytes

3. The product of the two is the total memory occupied by the array's data
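
A minimal sketch of these three quantities. The dtype is fixed to int32 here (an assumption made so that the numbers are the same on every platform), and the nbytes attribute is shown because it returns the same product directly.

```python
import numpy as np

a1 = np.arange(12, dtype=np.int32).reshape((2, 2, 3))
print(a1.size)                 # 12
print(a1.itemsize)             # 4
print(a1.itemsize * a1.size)   # 48
print(a1.nbytes)               # 48
```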

6. Array indexing and slicing

① One-dimensional array

a1 = np.arange(10)
print(a1)            # [0 1 2 3 4 5 6 7 8 9]
print(a1[4])         # indexing
print(a1[4:6])       # slicing
print(a1[::2])       # slicing with a step
print(a1[-1])        # negative index (first element counting from the right)

[0 1 2 3 4 5 6 7 8 9]
4
[4 5]
[0 2 4 6 8]
9

② Two-dimensional array

arr2 = np.random.randint(0,10,size=(4,6))
print(arr2[0])              # row 0
print(arr2[1:3])            # rows 1 and 2
print(arr2[[0, 2, 3]])      # rows 0, 2 and 3
print(arr2[2, 1])           # the element at row 2, column 1 (indices start at 0)
print(arr2[[1, 2], [4,5]])  # several single elements: (row 1, column 4) and (row 2, column 5)
print(arr2[1:3, 4:6])       # a block: columns 4 and 5 of rows 1 and 2
print(arr2[:, 1])           # one whole column: column 1
print(arr2[:, [1,3]])       # several whole columns: columns 1 and 3
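
Because the array above is random, its output changes on every run. The sketch below repeats some of the same index expressions on a fixed array (np.arange(24) reshaped to 4 rows and 6 columns, an illustrative substitute) so the results can be checked by hand. Note the difference between a pair of index lists, which picks individual elements, and a pair of slices, which picks a block.

```python
import numpy as np

arr2 = np.arange(24).reshape((4, 6))
# [[ 0  1  2  3  4  5]
#  [ 6  7  8  9 10 11]
#  [12 13 14 15 16 17]
#  [18 19 20 21 22 23]]

print(arr2[2, 1])                     # 13
print(arr2[[1, 2], [4, 5]].tolist())  # [10, 17]             two single elements
print(arr2[1:3, 4:6].tolist())        # [[10, 11], [16, 17]] a 2x2 block
print(arr2[:, 1].tolist())            # [1, 7, 13, 19]
print(arr2[:, [1, 3]].shape)          # (4, 2)
```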

③ Boolean index

a3 = np.arange(24).reshape((4,6))
print(a3[a3<10])              # pick the elements smaller than 10
print(a3[(a3 < 5) | (a3 > 10)])    # pick the elements smaller than 5 or larger than 10

Notes:

1. A Boolean index is an array of True/False values with the same shape as the data; the elements at the True positions are extracted

2. Use & when all conditions must hold, and | when any one of them is enough

3. When there are multiple conditions, each condition must be enclosed in parentheses
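
The notes above can be sketched as follows. The name mask and the conditions are illustrative; the point is that the comparison produces a Boolean array first, and that array is then used as the index.

```python
import numpy as np

a3 = np.arange(24).reshape((4, 6))

mask = (a3 > 5) & (a3 % 2 == 0)    # both conditions must hold
print(mask.dtype, mask.shape)      # bool (4, 6)
print(a3[mask].tolist())           # [6, 8, 10, 12, 14, 16, 18, 20, 22]

either = a3[(a3 < 3) | (a3 > 21)]  # one condition is enough
print(either.tolist())             # [0, 1, 2, 22, 23]
```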

7. Replacing array element values

① Index

a3 = np.random.randint(0,10,size=(3,5))
a3[1] = 0                      # replace every element of row 1 with 0
a3[1] = np.array([1,2,3,4,5])  # replace row 1 of a3 with [1,2,3,4,5]

② Conditional index

a3[a3 < 3] = 1   # replace every element smaller than 3 with 1

③ Function (use np.where to replace values)

result = np.where(a3<5,0,1)

This code returns a new array that holds 0 wherever the element of a3 is less than 5 and 1 everywhere else. The a3 array itself is not modified.
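
A minimal sketch of np.where on a small fixed array (the values are made up). The last call is a variation not in the original: passing the array itself as the third argument keeps the original value wherever the condition is False.

```python
import numpy as np

a3 = np.array([[1, 7, 4],
               [9, 5, 0]])

result = np.where(a3 < 5, 0, 1)
print(result.tolist())   # [[0, 1, 0], [1, 1, 0]]
print(a3.tolist())       # [[1, 7, 4], [9, 5, 0]]  a3 is unchanged

# Replace only the values below 5 and keep the others as they are
kept = np.where(a3 < 5, 0, a3)
print(kept.tolist())     # [[0, 7, 0], [9, 5, 0]]
```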

8. Array broadcasting mechanism

Array broadcasting principle: two arrays are broadcast-compatible if, comparing their shapes axis by axis starting from the trailing dimension (that is, counting from the end), each pair of axis lengths either matches or one of the two is 1. Broadcasting is then carried out along the missing dimensions and/or the dimensions of length 1.

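① Operations between an array and a number

a1 = np.random.randint(0,5,size=(3,5))
print(a1*2)                # multiply every element of the array by 2
print(a1.round(2))         # round every element to 2 decimal places (no visible effect here, since a1 holds integers)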

② Operations between two arrays

a1 = np.random.randint(0,5,size=(3,5))
a2 = np.random.randint(0,5,size=(3,5)) # a1+a2 works: the shapes are identical
a3 = np.random.randint(0,5,size=(3,4)) # a1+a3 fails: the shapes differ and cannot be broadcast
a4 = np.random.randint(0,5,size=(3,1)) # a1+a4 works: same number of rows, and one array has 1 column
a5 = np.random.randint(0,5,size=(1,5)) # a1+a5 works: same number of columns, and one array has 1 row

Summary:

1. An array can be combined directly with a number

2. Two arrays with the same shape can be combined with each other

3. Whether two arrays with different shapes can be combined depends on whether they satisfy the broadcasting principle
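
The shape rules can be tried out with a small helper. The function can_add below is a hypothetical name introduced for this sketch; it simply attempts the addition and reports whether NumPy accepted the two shapes.

```python
import numpy as np

def can_add(x, y):
    """Return True if x + y is allowed by the broadcasting rules."""
    try:
        x + y
        return True
    except ValueError:   # raised when the shapes cannot be broadcast together
        return False

a1 = np.ones((3, 5), dtype=int)

print(can_add(a1, np.ones((3, 5), dtype=int)))  # True   identical shapes
print(can_add(a1, np.ones((3, 4), dtype=int)))  # False  5 and 4 differ and neither is 1
print(can_add(a1, np.ones((3, 1), dtype=int)))  # True   the length-1 axis is stretched
print(can_add(a1, np.ones((1, 5), dtype=int)))  # True
print(can_add(a1, np.ones(5, dtype=int)))       # True   (5,) is treated like (1, 5)

column = np.arange(3).reshape((3, 1))
print((a1 + column).shape)                      # (3, 5)
```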

9. Operations on array shapes

① Changing the array shape

a1 = np.random.randint(0,10,size=(3,4))
a2 = a1.reshape((2,6))           # returns the reshaped array
a1.resize((4,3))                 # returns nothing; a1 itself is changed

The difference between reshape and resize:

Both reshape and resize change the shape of an array, but they behave differently. reshape returns the array in the requested shape and leaves the shape of the original array untouched. resize changes the shape of the array itself, in place, and returns None.
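
A minimal sketch of that difference, using two separate fixed arrays (a1 and b1 are illustrative names) so that each method can be observed on its own:

```python
import numpy as np

a1 = np.array([[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11]])
a2 = a1.reshape((2, 6))    # the result must be assigned to a name
print(a1.shape, a2.shape)  # (3, 4) (2, 6)

b1 = np.array([[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11]])
ret = b1.resize((4, 3))    # works in place on b1
print(ret)                 # None
print(b1.shape)            # (4, 3)
```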

②flatten and ravel (both convert multidimensional arrays into one-dimensional arrays, but in different ways)

a3 = np.random.randint(0,10,size=(3,4))
a4 = a3.flatten()          # returns a copy
a5 = a3.ravel()            # returns a view of the same data (whenever possible)

That is to say:

Modifying a4 does not affect a3, but modifying a5 also modifies a3.
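
This can be checked directly. The sketch below uses a small fixed array and np.shares_memory (an addition for illustration) to confirm which result shares its data with the original.

```python
import numpy as np

a3 = np.array([[1, 2], [3, 4]])
a4 = a3.flatten()   # copy
a5 = a3.ravel()     # view, because a3 is stored contiguously

a4[0] = 100
print(a3[0, 0])     # 1    a3 is untouched
a5[0] = 200
print(a3[0, 0])     # 200  a3 changed together with a5

print(np.shares_memory(a3, a4))  # False
print(np.shares_memory(a3, a5))  # True
```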

③ Stacking arrays

vstack: stacks arrays in the vertical direction; the arrays must have the same number of columns for the stacking to succeed;

hstack: stacks arrays in the horizontal direction; the arrays must have the same number of rows for the stacking to succeed;

concatenate: lets you choose the stacking direction yourself through the axis parameter.

1>axis = 0 means stacking in the vertical direction (rows are appended, like vstack)

2>axis = 1 means stacking in the horizontal direction (columns are appended, like hstack)

3>axis = None flattens every array first and then joins them into one 1-dimensional array

vstack1 = np.random.randint(0,10,size=(3,4))       # array to be stacked vertically
vstack2 = np.random.randint(0,10,size=(2,4))       # array to be stacked vertically
vstack3 = np.vstack([vstack1,vstack2])             # vertical stacking, method 1
vstack4 = np.concatenate([vstack1,vstack2],axis=0) # vertical stacking, method 2


h1 = np.random.randint(0,10,size=(3,4))            # array to be stacked horizontally
h2 = np.random.randint(0,10,size=(3,1))            # array to be stacked horizontally
h3 = np.hstack([h2,h1])                            # horizontal stacking, method 1
h4 = np.concatenate([h2,h1],axis=1)                # horizontal stacking, method 2

h5 = np.concatenate([h2,h1],axis=None)             # flatten both arrays, then join them into one 1-dimensional array
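
To see which axis does what, here is a sketch with arrays of zeros and ones (illustrative stand-ins for the random arrays) that prints the resulting shapes: axis=0 gives the same result as vstack and axis=1 the same as hstack.

```python
import numpy as np

top = np.zeros((3, 4), dtype=int)
bottom = np.ones((2, 4), dtype=int)
print(np.vstack([top, bottom]).shape)               # (5, 4)
print(np.concatenate([top, bottom], axis=0).shape)  # (5, 4)

left = np.ones((3, 1), dtype=int)
print(np.hstack([left, top]).shape)                 # (3, 5)
print(np.concatenate([left, top], axis=1).shape)    # (3, 5)

flat = np.concatenate([left, top], axis=None)       # both inputs are flattened first
print(flat.shape)                                   # (15,)
print(flat.tolist())   # [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```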

④ Splitting arrays

hsplit: splits in the horizontal direction, that is, by columns. There are two ways to split:

1. Give the number of equal parts to split the columns into

2. Give the indices at which to cut

vsplit: splits in the vertical direction, that is, by rows. It is used in the same way as hsplit.

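hs1 = np.random.randint(0, 10, size=(3, 4))
np.hsplit(hs1, 2)                         # split horizontally into 2 equal parts; the number of columns must be divisible by this integer
np.hsplit(hs1, (1, 2))                     # split horizontally into 1, 1 and 2 columns (cut at indices 1 and 2)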

[array([[9, 4],
       [4, 2],
       [4, 7]]), array([[4, 6],
       [9, 6],
       [7, 3]])]
[array([[9],
       [4],
       [4]]), array([[4],
       [2],
       [7]]), array([[4, 6],
       [9, 6],
       [7, 3]])]

vs1 = np.random.randint(0, 10, size=(4, 5))
np.vsplit(vs1, 4)                         # split vertically into 4 equal parts
np.vsplit(vs1, (1, 3))                     # split vertically into 1, 2 and 1 rows (cut at indices 1 and 3)

[array([[0, 3, 7, 4, 7]]), array([[0, 1, 2, 1, 1]]), array([[9, 8, 4, 7, 5]]), array([[9, 8, 2, 7, 1]])]

[array([[0, 3, 7, 4, 7]]), array([[0, 1, 2, 1, 1],
       [9, 8, 4, 7, 5]]), array([[9, 8, 2, 7, 1]])]
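
A sketch of the same calls on fixed arrays, printing only the shapes of the pieces. The last part is an added illustration of what happens when the requested number of equal parts does not divide the number of columns.

```python
import numpy as np

hs1 = np.arange(12).reshape((3, 4))
halves = np.hsplit(hs1, 2)
print([p.shape for p in halves])   # [(3, 2), (3, 2)]
parts = np.hsplit(hs1, (1, 2))
print([p.shape for p in parts])    # [(3, 1), (3, 1), (3, 2)]

vs1 = np.arange(20).reshape((4, 5))
rows = np.vsplit(vs1, (1, 3))
print([p.shape for p in rows])     # [(1, 5), (2, 5), (1, 5)]

# 4 columns cannot be divided into 3 equal parts
try:
    np.hsplit(hs1, 3)
    uneven_ok = True
except ValueError as exc:
    uneven_ok = False
    print("ValueError:", exc)
```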

⑤ Matrix transposition

t1 = np.random.randint(0,10,size=(3,4))
t1.T                           # the transpose of array t1
t2 = t1.transpose()            # returns a view, so modifying the returned array also changes the original array
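
A minimal sketch on a fixed array showing that the transpose is a view: writing to one element of the transposed array changes the mirrored element of the original.

```python
import numpy as np

t1 = np.arange(12).reshape((3, 4))
t2 = t1.T
print(t1.shape, t2.shape)        # (3, 4) (4, 3)

t2[0, 1] = -1                    # t2[0, 1] is the same element as t1[1, 0]
print(t1[1, 0])                  # -1
print(np.shares_memory(t1, t2))  # True
```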

10. View and copy

① If it is just a simple assignment, then no copy will be made

a = np.arange(12)
b = a
print(b is a)       # prints True: b and a are the same object

② Shallow copy

In some cases a new array object is created, but the memory it points to is the same as that of the original. This is called a shallow copy, or a view.

c = a.view()
print(c is a) # prints False: c and a are different objects, but they point to the same data in memory
c[0] = 100    # changing c also changes a

③ Deep copy

A complete copy of the data is placed in another block of memory, so the result is two completely independent arrays.

d = a.copy()
print(d is a)   # prints False: d and a are different objects
d[1]=200        # d is modified while a stays exactly as it was, so the two do not share memory

Summary:

Array operations involve three kinds of copying:

1. No copy: direct assignment. No new array object is created; the same array is simply given a second name.

2. Shallow copy: a new array object is created, but the data it points to is not copied and is shared with the original.

3. Deep copy: both the array object and the data it points to are copied
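
The three cases can be told apart in code. The sketch below uses the base attribute and np.shares_memory, which are additions for illustration. It also shows that basic slicing returns a view, which is an easy way to create a shallow copy without noticing.

```python
import numpy as np

a = np.arange(12)
b = a            # no copy
c = a.view()     # shallow copy (view)
d = a.copy()     # deep copy

print(b is a, c is a, d is a)        # True False False
print(c.base is a, d.base is None)   # True True
print(np.shares_memory(a, c))        # True
print(np.shares_memory(a, d))        # False

s = a[2:5]       # a basic slice is also a view
s[0] = 99
print(a[2])      # 99
```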

To be continued~

Python Numpy Knowledge Points