Machine learning basic study notes [3]

1. Introduction to Numpy

1.1 Numpy

  • Used for scientific computing. Its numerical power rests on two basic objects: the multi-dimensional array object ndarray and the universal function
    ufunc
  • Install Numpy with the command pip install numpy. In a Jupyter notebook cell, prefix the command with an exclamation mark: !pip install numpy

1.2 Pandas

  • Used for data analysis
  • Built on Numpy, which makes Numpy-centric applications simple
  • Provides two basic data structures: Series and DataFrame

1.3 Matplotlib

  • A Python visualization module that lets users easily create line graphs, pie charts, bar charts and other professional
    graphics

1.4 Scikit-learn

  • A Python machine learning module. Its basic functions fall into 6 parts: classification, regression, clustering, dimensionality reduction, model selection and data preprocessing
  • scikit-learn ships with some classic data sets, such as the iris and digits data sets for classification and the Boston house prices data set for regression analysis (the Boston data set was removed from scikit-learn in version 1.2)

2. Numpy data types

  • The four integer types int8, int16, int32, and int64 can also be written as the strings 'i1', 'i2', 'i4', and 'i8', where the digit is the size in bytes.
  • [Table: Numpy data types, not preserved in the source]
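
As a quick check of the string codes above, the following sketch builds a dtype from each short code and compares it with the dtype of the named type. The variable names codes and names are only illustrative.

```python
import numpy as np

# Each short code is the letter 'i' (signed integer) plus the size in bytes
codes = ['i1', 'i2', 'i4', 'i8']
names = [np.int8, np.int16, np.int32, np.int64]

for code, name in zip(codes, names):
    dt = np.dtype(code)
    # The dtype built from the code equals the dtype built from the named type
    print(code, dt, dt.itemsize, dt == np.dtype(name))
```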

2.1 Data type object (dtype)

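  • Usually passed as the dtype parameter of the array() function to specify the data type of the array elements. A string value must be enclosed in quotation marks.
  • The dtype attribute of an array returns the data type of its elements
import numpy as np
na1 = np.array(5, dtype='float32')
# Show the ndarray, its Python type, and the data type of its elements
print(na1, type(na1), na1.dtype)
# Show the data buffer, the number of elements, and the number of dimensions
# Note: a single scalar is 0-dimensional; each enclosing level of brackets adds one dimension
print(na1.data, na1.size, na1.ndim)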
5.0 <class 'numpy.ndarray'> float32 
<memory at 0x000001ECDEFEFE00> 1 0

2.2 The .astype() method changes the data type

import numpy as np 
na=np.array(5,dtype='i2') 
print(na.dtype) 
print(na.astype('i4').dtype) 
print(na.astype(np.float32).dtype)
int16 
int32 
float32

3. Numpy array properties

3.1 Commonly used terms

  • Axis: a dimension of the array; positions along it are the coordinates in that dimension
  • Rank: the number of dimensions, that is, the number of axes. The rank of a one-dimensional array is 1
  • Coordinate system (axes): used later in matplotlib
  • For example, [1,2,3] is a one-dimensional array with 1 axis, rank 1, and an axis length of 3
# Use the axis parameter to choose the axis to sum along:
# axis=0 sums down each column, axis=1 sums across each row
import numpy as np 
arr=np.array([[0,1,2],[3,4,5]]) 
print(arr) 
print(arr.sum(axis=0)) 
print(arr.sum(axis=1))
[[0 1 2]
 [3 4 5]] 
[3 5 7] 
[ 3 12]

3.2 Basic properties

  1. ndarray.ndim: the rank, that is, the number of dimensions of the array. Returns an int
  2. ndarray.shape: the size of the array along each dimension. For a matrix with n rows and m columns, the shape is (n, m). Returns a tuple
  3. ndarray.size: the total number of elements in the array, equal to the product of the entries of ndarray.shape (n×m for an n-by-m matrix). Returns an int
  4. ndarray.dtype: the type of each element, for example np.int32, np.int16 or np.float64. Returns a data-type object
  5. ndarray.itemsize: the size in bytes of each element. Equivalent to ndarray.dtype.itemsize. Returns an int
  6. ndarray.data: the buffer object that points to the start of the array's data in memory
import numpy as np 
a=np.array(((1,2,3,4,5),(6,7,8,9,0)),dtype=np.int32) 
print(a) 
print(a.ndim) 
print(a.shape) 
print(a.size) 
print(a.dtype) 
print(a.itemsize) 
print(a.data)
[[1 2 3 4 5]
 [6 7 8 9 0]] 
2
(2, 5) 
10
int32 
4
<memory at 0x000002096EBBFEE0>

3.2.1 .ndim attribute

[rank, number of axes, number of dimensions]

  • The .reshape() method rebuilds the array with a new size for each dimension. The shape of the original array is not changed, so the result must be assigned to a variable
  • The .resize() method is similar to reshape(), but it operates directly on the original array and therefore changes it.
  • In .reshape(), one dimension may be given as -1, and Numpy calculates its size from the total number of elements and the other dimensions.
import numpy as np
arr = np.array([1,2,3,4,5,6])
print('Original array:')
print(arr)
b = arr.reshape(2,3)
print('Original array:')
print(arr)
print('Array b:')
print(b)
# resize() operates on the original array in place
arr.resize(3,2)
print('Original array:')
print(arr)
c = arr.reshape(2,-1)
print('Original array:')
print(arr)
print('Array c:')
print(c)
Original array:
[1 2 3 4 5 6]
Original array:
[1 2 3 4 5 6]
Array b:
[[1 2 3]
 [4 5 6]]
Original array:
[[1 2]
 [3 4]
 [5 6]]
Original array:
[[1 2]
 [3 4]
 [5 6]]
Array c:
[[1 2 3]
 [4 5 6]]

3.2.2 .shape attribute

  • Returns the size of each dimension of the array. The result is a tuple
  • Assigning a new tuple to it changes the dimension sizes of the array. Like .resize(), this operates directly on the original array.
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print('Array a:')
print(a)
print('Shape of array a:')
print(a.shape)
a.shape = (3,2)
print('Array a:')
print(a)
Array a:
[[1 2 3]
 [4 5 6]]
Shape of array a:
(2, 3)
Array a:
[[1 2]
 [3 4]
 [5 6]]

3.2.3 .size attribute

  • The result is an integer equal to the total number of array elements
  • Its value is the product of the entries of the .shape attribute, n×m for a two-dimensional array of shape (n, m).

3.3 Creating special arrays

3.3.1 .empty() function

  • Similar to the array() function. Creates an array of the specified shape and data type without initializing its elements
  • Because the memory is not initialized, the values of the array elements are unpredictable: they are whatever happened to be in that memory.
import numpy as np 
a=np.empty(3,dtype=int) 
print(a)
[3014704 49 5177420]

3.3.2 zeros() and ones() functions

  • Similar to the array() function. Create an array of the specified shape and data type filled with 0s or with 1s
  • The shape parameter can be a number or a tuple
  • If it is a number, a 1-dimensional array of that length is created.
  • If it is a tuple, the array has as many dimensions as the tuple has elements, with the sizes given by those elements.
import numpy as np 
a=np.ones(5) 
print(a) 
b=2*np.ones((5,1)) 
print(b) 
c=np.zeros((3,2)) 
print(c)
[1. 1. 1. 1. 1.]
[[2.]
 [2.]
 [2.]
 [2.]
 [2.]]
[[0. 0.]
 [0. 0.]
 [0. 0.]]

3.3.3 Functions that generate sequence of numbers

  • Python provides the range() function, and Numpy provides the arange() and linspace() functions, to generate a sequence of numbers.
  1. The range() function in Python generates an arithmetic sequence
  • Specify the start value, end value and step size to create the sequence [Note: the result is a range object, not a list or an array]
  • If only one argument is given, the sequence starts from 0 with a step size of 1. The start, end and step must all be integers.
  • The sequence does not include the end value
  2. The arange() function creates an arithmetic sequence as an array
  • It works like range(), but returns an ndarray
  • The start value, end value and step size do not have to be integers.
# Build a two-dimensional array
import numpy as np 
d=np.array([np.arange(1,3),np.arange(4,6)]) 
print(d)
[[1 2]
 [4 5]]
  3. The linspace() function
  • Creates an array from the start value, end value, and number of elements.
  • The generated array includes the end value by default (endpoint=True).
import numpy as np 
a=np.linspace(3,10,5) 
print(a)
[ 3. 4.75 6.5 8.25 10. ]
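
To make the difference between the two Numpy functions concrete, here is a minimal sketch that builds the same sequence with arange() (step given, end value excluded) and with linspace() (number of points given). The values 0, 1 and 0.25 are arbitrary choices for illustration.

```python
import numpy as np

# arange: the step is given, the end value 1 is excluded
a = np.arange(0, 1, 0.25)
print(a)

# linspace: the number of points is given, the end value is included
b = np.linspace(0, 1, 5)
print(b)

# endpoint=False drops the end value, which here reproduces the arange result
c = np.linspace(0, 1, 4, endpoint=False)
print(np.allclose(a, c))  # True
```

With a non-integer step, linspace() is usually the safer choice, because the number of elements is fixed in advance instead of depending on floating point rounding.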

3.4 Generate random number array

Function: description

  • seed(n): Sets the seed of the random number generator. With the same seed, the same sequence of random numbers is generated.
  • random(size): Generates random floating point numbers in [0, 1). Argument options: 1) no argument; 2) 1 number; 3) a tuple
  • rand(d0,d1,…,dn): Generates uniformly distributed random floating point numbers in [0, 1). Argument options: 1) no arguments; 2) 1 number; 3) several numbers
  • randn(d0,d1,…,dn): Generates floating point numbers from the standard normal distribution, that is, a normal distribution with mean 0 and variance 1. Argument options: 1) no arguments; 2) 1 number; 3) several numbers
  • normal(loc,scale,size): Generates normally distributed floating point numbers. loc is the mean, scale is the standard deviation (not the variance), and size determines how many values are generated. size options: 1) omitted; 2) 1 number; 3) a tuple
  • randint(low[,high,size,dtype]): Generates random integers in the half-open interval [low, high); with a single argument n, the interval is [0, n). Argument options: 1) 1 number; 2) 2 numbers; 3) 3 numbers; 4) 2 numbers plus a tuple
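
The effect of seed() can be checked with a short sketch: resetting the generator to the same seed reproduces the same numbers. The seed value 42 is an arbitrary choice.

```python
import numpy as np

np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)            # reset the generator to the same seed
second = np.random.rand(3)

# Both draws are identical because they start from the same seed
print(np.array_equal(first, second))  # True
```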

3.4.1 random() function generates random floats

import numpy as np
# No argument: generate one random float
a = np.random.random()
print('No argument, one random float:\n', a)
# One integer argument: 1-D array whose length is the argument value
b = np.random.random(3)
print('One integer argument, 1-D array of random floats:\n', b)
# One tuple argument: multi-dimensional array with the shape given by the tuple
c = np.random.random((2,3,4))
print('Tuple argument, multi-dimensional array of random floats:\n', c)
No argument, one random float:
 0.04309735620499444
One integer argument, 1-D array of random floats:
 [0.87991517 0.76324059 0.87809664]
Tuple argument, multi-dimensional array of random floats:
 [[[0.41750914 0.60557756 0.51346663 0.59783665]
  [0.26221566 0.30087131 0.02539978 0.30306256]
  [0.24207588 0.55757819 0.56550702 0.47513225]]

 [[0.29279798 0.06425106 0.97881915 0.33970784]
  [0.49504863 0.97708073 0.44077382 0.31827281]
  [0.51979699 0.57813643 0.85393375 0.06809727]]]

3.4.2 rand() and randn() functions

  • The rand(d0,d1,...,dn) function generates uniformly distributed random floats in the range [0,1)
  • The randn(d0,d1,...,dn) function generates random floats from the standard normal distribution. About 68% of the values fall in the interval (-1, 1)
import numpy as np
# No arguments: generate one random float
a = np.random.rand()
print(a)
# One argument: 1-D array whose length is the argument value
b = np.random.rand(3)
print(b)
# Several arguments: multi-dimensional array with those dimension sizes
b = np.random.rand(2,3,4)
print(b)
0.037094413234407875 
[0.35065639 0.56319068 0.29972987] 
[[[0.51233415 0.67346693 0.15919373 0.05047767] 
 [0.33781589 0.10806377 0.17890281 0.8858271 ]
 [0.36536497 0.21876935 0.75249617 0.10687958]] 
 [[0.74460324 0.46978529 0.59825567 0.14762019]
  [0.18403482 0.64507213 0.04862801 0.24861251]
  [0.54240852 0.22677334 0.38141153 0.92223279]]]
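
The normal-distribution functions can be checked empirically. The sketch below draws a large sample with randn() and with normal() and compares the sample statistics with the theoretical values; the seed, the sample size and the parameters loc=5, scale=2 are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable

z = np.random.randn(100_000)                          # standard normal
x = np.random.normal(loc=5, scale=2, size=100_000)    # mean 5, standard deviation 2

# Fraction of standard normal samples inside (-1, 1); theory gives about 0.683
inside = np.mean(np.abs(z) < 1)
print(round(float(inside), 2))

# Sample mean and sample standard deviation should be close to loc and scale
print(round(float(x.mean()), 1), round(float(x.std()), 1))
```

Note that the sample standard deviation comes out near 2, not near the square root of 2, which confirms that scale is a standard deviation and not a variance.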

3.4.3 randint() function

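import numpy as np
# One argument n: one random integer in [0, n)
a = np.random.randint(5)
print(a)
# Two arguments: one random integer in [low, high)
b = np.random.randint(3,10)
print(b)
# Three arguments, the third a number: 1-D array of that length
b = np.random.randint(3,10,3)
print(b)
# Three arguments, the third a tuple: multi-dimensional array of that shape
b = np.random.randint(3,10,(2,3))
print(b)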
3
9
[5 4 7] 
[[4 4 6]
 [9 4 3]]
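
A short sketch, with an arbitrary seed and sample size, shows that the lower bound of randint() is included while the upper bound is not:

```python
import numpy as np

np.random.seed(1)
samples = np.random.randint(3, 10, 10_000)

# The lower bound 3 is included, the upper bound 10 is never drawn
print(samples.min(), samples.max())  # 3 9
```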

3.4.4 choice() function

  • Randomly draws elements from a given one-dimensional array to form a new array [one-dimensional or multi-dimensional]: numpy.random.choice(a,size=None,replace=True,p=None)
  • Parameter a can be an integer or a sequence. When a is an integer, elements are drawn from np.arange(a)
  • Parameter size specifies the shape of the result. If it is a number, that many elements are drawn to form a one-dimensional
    array. If it is a tuple, a multidimensional array of the shape given by the tuple is generated.
  • Parameter replace controls whether an element can be drawn more than once; the default True samples with replacement
  • Parameter p gives the probability of drawing each element of a. Its length must equal the length of a, and its entries must sum to 1
import numpy as np
# a is an integer, size is also an integer, and p gives the probability of drawing each number
a = np.random.choice(5,3,p=(0.8,0.2,0,0,0))
print(a)
[0 0 1]
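
The following sketch shows choice() with a sequence instead of an integer, once without replacement and once with a tuple as size. The array names and the seed are illustrative assumptions.

```python
import numpy as np

np.random.seed(2)
names = np.array(['a', 'b', 'c', 'd', 'e'])

# Draw 3 different elements: replace=False forbids repeats
picked = np.random.choice(names, size=3, replace=False)
print(picked)

# Draw a 2x2 array, with replacement (the default)
grid = np.random.choice(names, size=(2, 2))
print(grid.shape)  # (2, 2)
```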

3.5 Slicing, iteration and indexing

3.5.1 Index

  • An index is used to obtain data from the array. Subscripts start from 0, and a negative subscript counts from the end, so the last element has subscript -1
  • Indexing a single element returns a single value
  • To obtain several elements at listed positions, pass one list of subscripts per dimension, each enclosed in []. The result is a one-dimensional array.
  • Array indexing falls into two categories: one-dimensional array indexing and two-dimensional array indexing.
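# Two-dimensional array
import numpy as np
a = np.arange(12).reshape(3,-1)
print('Original array:\n', a)
# Get a single element
print('Single element:\n', a[-1,3])
# Get several non-contiguous elements: row 0 column 2 and row 1 column 3
print('Several non-contiguous elements:\n', a[[0,1],[2,3]])
# Get one row
print('One row:\n', a[2])
# Get several rows in a new order. This is also called fancy indexing
print('Several rows, reordered:\n', a[[2,0]])
# Get one column
print('One column:\n', a[:,2])
# Get several columns
print('Several columns:\n', a[:,[0,2]])
# Get several rows and columns
print('Several rows and columns:\n', a[1:,1:])
# Get row 0 column 1 and row 2 column 2
# (assumed call; the source shows only its output)
print('Non-contiguous elements:\n', a[[0,2],[1,2]])
Original array:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Single element:
 11
Several non-contiguous elements:
 [2 7]
One row:
 [ 8  9 10 11]
Several rows, reordered:
 [[ 8  9 10 11]
 [ 0  1  2  3]]
One column:
 [ 2  6 10]
Several columns:
 [[ 0  2]
 [ 4  6]
 [ 8 10]]
Several rows and columns:
 [[ 5  6  7]
 [ 9 10 11]]
Non-contiguous elements:
 [ 1 10]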

3.5.2 Slicing

  • Slicing takes a contiguous or regularly spaced part of a sequence object
  • Within one dimension, colons separate the slice parameters; commas separate the dimensions.
  • A dimension can contain one or two colons.
  • The first parameter is the start index, the second is the end index (not included in the result), and the third is the step size.
  • Omitting the start index or end index means starting from the beginning or going to the end. If the step size is omitted, it is 1
import numpy as np 
arr=np.arange(24).reshape(4,-1) 
print(arr) 
arr1=arr[1:,:3] 
print(arr1)
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]
[[ 6  7  8]
 [12 13 14]
 [18 19 20]]
  • In essence, a slice selects a run of consecutive indexes
  • To obtain non-contiguous elements, pass in the list of their indexes instead; the elements at those positions form a one-dimensional array.
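
As a small sketch of the points above, the example below uses a slice with a step (two colons) and then contrasts a slice with an index list: in Numpy a basic slice is a view of the original data, while indexing with a list makes a copy. The variable names view and copy are illustrative.

```python
import numpy as np

arr = np.arange(24).reshape(4, 6)

# Two colons: start:stop:step. Every second row, every third column
print(arr[::2, ::3])
# [[ 0  3]
#  [12 15]]

# A slice is a view: writing to it changes the original array
view = arr[:2, :2]
view[0, 0] = 100
print(arr[0, 0])  # 100

# An index list makes a copy: writing to it leaves the original unchanged
copy = arr[[0, 1]]
copy[0, 0] = -1
print(arr[0, 0])  # 100
```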

3.5.3 Iteration

  • Iterate with a for loop. When the array has more than one dimension, iterating over every element requires nested for loops
  • .flat is an iterator over the array elements in flattened (row-major) order
  • Use this attribute to iterate over every element of an array with a single loop
import numpy as np
a = np.arange(9).reshape(3,3)
print('Original array:')
print(a)
print('Rows of the original array:')
for row in a:
    print(row)
print('Iterating over the array elements:')
print('a.flat is an iterator object:\n', a.flat)
for ele in a.flat:
    print(ele, end=',')
Original array:
[[0 1 2]
 [3 4 5]
 [6 7 8]]
Rows of the original array:
[0 1 2]
[3 4 5]
[6 7 8]
Iterating over the array elements:
a.flat is an iterator object:
 <numpy.flatiter object at 0x0000011D5C1E2560>
0,1,2,3,4,5,6,7,8,
  • The .ravel() method flattens a multi-dimensional ndarray into a one-dimensional ndarray.
  • By default it flattens row by row. It returns a view of the original data whenever possible, so modifying the result can modify the original array
  • The .flatten() method also flattens an ndarray, and it always returns a copy, so the original array is never affected.
  • Both methods flatten row by row by default. Pass order='F' to flatten column by column.
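
A minimal sketch of the two flattening methods, showing the order parameter and the difference between the result of ravel() and the copy returned by flatten(). The variable names r and f are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

print(a.flatten())            # [0 1 2 3 4 5]  row by row
print(a.flatten(order='F'))   # [0 3 1 4 2 5]  column by column

r = a.ravel()     # a view here, because the data of a is contiguous
f = a.flatten()   # always a copy
r[0] = 99         # changes a as well
f[1] = -1         # does not change a
print(a[0, 0], a[0, 1])  # 99 1
```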

3.6 Combination, segmentation and search

3.6.1 Combining ndarray

  • Some records are spread over different files and need to be combined vertically
  • Some features of the data are spread over different files and need to be combined horizontally.
  • Numpy provides horizontal and vertical combination functions
  • These are hstack (column_stack), vstack (row_stack), concatenate, and dstack
  1. Horizontal combination: np.hstack((data1,data2)), np.column_stack((data1,data2))
  • The parameter is a tuple whose elements are the arrays to be combined.
  • All arrays to be merged must have the same number of rows
  2. Vertical combination: np.vstack((data1,data2)), np.row_stack((data1,data2))
  • The parameter is a tuple whose elements are the arrays to be combined.
  • All arrays to be merged must have the same number of columns.
  • np.row_stack is an alias of np.vstack and is deprecated in NumPy 2.0, so np.vstack is the safer choice
  3. Horizontal or vertical combination: np.concatenate((data1,data2),axis=0/1)
  • There are 2 parameters. The first is a tuple of the arrays to be combined.
  • The second sets the axis along which to combine. axis=1 means horizontal combination [the result grows horizontally], axis=0 means vertical combination [the result grows vertically]
3.6.2 Split ndarray

  • The purpose is to split one array into several arrays
  • In Numpy, the hsplit, vsplit, and split functions implement horizontal and vertical splits
  • With these functions, the array can be divided into sub-arrays of equal size, or split at given positions.
  • The sub-arrays have the same number of dimensions as the original ndarray: if the original array is two-dimensional, each sub-array is also two-dimensional.
3.6.2.1 Horizontal split np.hsplit() function

  • The result is a list of arrays, which can be unpacked into individual arrays.
  • The second parameter has 2 forms: a single integer, giving the number of equal parts to split into, or a tuple, whose elements are the column subscripts at which to split.
  • hsplit(data,n) divides the original array horizontally into n equal parts. The number of columns must therefore be divisible by n.
  • hsplit(data,(m,n,x...)) splits at the column subscripts m, n, x, and so on. The number of sub-arrays is the number of subscripts + 1. The column at each subscript belongs to the following sub-array.
# Split into equal parts
import numpy as np
arr1 = np.arange(12).reshape(-1,3)
print('Original array 1:\n', arr1)
print('-'*20)
arr2, arr3, arr4 = np.hsplit(arr1,3)
print('Part 1:\n', arr2)
print('-'*20)
print('Part 2:\n', arr3)
print('-'*20)
print('Part 3:\n', arr4)
Original array 1:
 [[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
--------------------
Part 1:
 [[0]
 [3]
 [6]
 [9]]
--------------------
Part 2:
 [[ 1]
 [ 4]
 [ 7]
 [10]]
--------------------
Part 3:
 [[ 2]
 [ 5]
 [ 8]
 [11]]
3.6.2.2 Vertical split np.vsplit() function
  • The result is a list of arrays, which can be unpacked into individual arrays.
  • The second parameter has 2 forms: a single integer, giving the number of equal parts to split into, or a tuple, whose elements are the row subscripts at which to split.
  • vsplit(data,n) divides the original array vertically into n equal parts. The number of rows must therefore be divisible by n.
  • vsplit(data,(m,n,x...)) splits at the row subscripts m, n, x, and so on. The number of sub-arrays is the number of subscripts + 1. The row at each subscript belongs to the following sub-array
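
A brief sketch of the tuple form: splitting a 4-row array at row subscripts 1 and 3 gives three sub-arrays, and the row at each subscript starts the following sub-array. The names top, middle and bottom are illustrative.

```python
import numpy as np

arr = np.arange(12).reshape(4, 3)

# Split at row subscripts 1 and 3: row 0, rows 1 and 2, row 3
top, middle, bottom = np.vsplit(arr, (1, 3))
print(top.shape, middle.shape, bottom.shape)  # (1, 3) (2, 3) (1, 3)
print(middle)
# [[3 4 5]
#  [6 7 8]]
```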
3.6.2.3 Horizontal and vertical split np.split() function
  • The parameters can be set in two ways.
  • The array to be split, the number of equal parts [an integer], and the axis along which to split
  • The array to be split, the positions of the split points [a tuple of row or column subscripts], and the axis along which to split.
  • The axis:
    • axis=0 means vertical division [each part has fewer rows]
    • axis=1 means horizontal division [each part has fewer columns]
# Split horizontally into equal parts [each part has fewer columns]
import numpy as np
arr1 = np.arange(12).reshape(3,-1)
print('Original array 1:\n', arr1)
print('-'*20)
arr2, arr3 = np.split(arr1,2,axis=1)
print('Part 1:\n', arr2)
print('-'*20)
print('Part 2:\n', arr3)
Original array 1:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
--------------------
Part 1:
 [[0 1]
 [4 5]
 [8 9]]
--------------------
Part 2:
 [[ 2  3]
 [ 6  7]
 [10 11]]
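3.6.2.4 Uneven horizontal or vertical split np.array_split() function
  • The parameters are set as for split(), but the number of parts does not have to divide the axis length evenly; the leading sub-arrays are then one element longer than the rest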
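
The difference from split() shows up when the length is not divisible by the number of parts. In this sketch a 10-element array is split into 3 parts:

```python
import numpy as np

arr = np.arange(10)

# split() requires an even division and raises an error otherwise
try:
    np.split(arr, 3)
except ValueError as err:
    print('split failed:', err)

# array_split() accepts the uneven division
parts = np.array_split(arr, 3)
print([len(p) for p in parts])  # [4, 3, 3]
```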

3.6.3 Search

  • Numpy provides a number of functions for searching within ndarrays, including functions that find the maximum value, the minimum value, and the elements that meet a given condition.
  1. The np.argmax() and np.argmin() functions
  • They return the index of the maximum and minimum values
    np.argmax(array,axis=None/0/1)
    np.argmin(array,axis=None/0/1)
  • If axis is omitted, the array is flattened row by row and numbered from 0. The result is a single integer: the index of the matching element in the flattened array.
  • axis=0 searches each column. The result is a one-dimensional array holding, for each column, the row index of the matching element. Indexes start from 0.
  • axis=1 searches each row. The result is a one-dimensional array holding, for each row, the column index of the matching element. Indexes start from 0.
import numpy as np
arr1 = np.array([[1,2,3],[4,8,0],[9,5,6],[13,10,11]])
print('Original array:\n', arr1)
print('-'*20)
print('Index of the largest element', np.argmax(arr1))
print('-'*20)
print('Index of the maximum in each column', np.argmax(arr1,axis=0))
print('-'*20)
print('Index of the maximum in each row', np.argmax(arr1,axis=1))
Original array:
 [[ 1  2  3]
 [ 4  8  0]
 [ 9  5  6]
 [13 10 11]]
--------------------
Index of the largest element 9
--------------------
Index of the maximum in each column [3 3 3]
--------------------
Index of the maximum in each row [2 1 0 0]
3.6.3.1 np.where() function
  • np.where(condition,x,y) works on the elements of the input ndarray that meet the given condition
  • Case 1: only condition [the test], without the parameters x and y. For example: np.where(arr>5)
  • It returns the subscripts of the elements that meet the condition. The result is a tuple of arrays, one per dimension
  • The tuple can be unpacked into separate arrays. For a two-dimensional array, the first array holds the row numbers of the matching elements
  • and the second array holds their column numbers; arrays with more dimensions yield one further index array per dimension
  • Case 2: with x and y. Where the condition is met, the corresponding element of the first array is taken; where it is not met, the corresponding element of the second array is taken. For example: np.where(arr1>5, arr1, arr2)
  • The result is an array with the same shape as the original array.
  • Its elements are taken from arr1 and arr2

3.6.3.2 np.extract() function
  • np.extract(condition,arr) returns a one-dimensional array made up of the elements of the input ndarray that meet the given condition
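
Both cases of np.where() and the behaviour of np.extract() can be seen side by side in a short sketch. The arrays arr1 and arr2 are made-up examples; arr2 is simply an array of zeros.

```python
import numpy as np

arr1 = np.array([[1, 8, 3], [9, 2, 7]])
arr2 = np.zeros_like(arr1)

# Case 1: only the condition, returns one index array per dimension
rows, cols = np.where(arr1 > 5)
print(rows, cols)  # [0 1 1] [1 0 2]

# Case 2: take from arr1 where the condition holds, otherwise from arr2
print(np.where(arr1 > 5, arr1, arr2))
# [[0 8 0]
#  [9 0 7]]

# extract returns the matching elements themselves as a 1-D array
print(np.extract(arr1 > 5, arr1))  # [8 9 7]
```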

4. Numpy calculation

4.1 Arithmetic operations

Operator, function format: description

  • +, np.add(x,y): Adds each element of ndarray x to the corresponding element of y. Returns an ndarray
  • -, np.subtract(x,y): Subtracts the corresponding element of y from each element of ndarray x. Returns an ndarray
  • *, np.multiply(x,y): Multiplies each element of ndarray x by the corresponding element of y. Returns an ndarray
  • /, np.divide(x,y): Divides each element of ndarray x by the corresponding element of y. Returns an ndarray
  • **, np.power(x,y): Raises each element of ndarray x to the power of the corresponding element of y. Returns an ndarray
  • In addition to these basic arithmetic operations, Numpy also provides other mathematical functions, such as:

  1. np.negative(x): the negation of each element of ndarray x
  2. np.sqrt(x): the square root of each element of ndarray x
  3. np.absolute(x): the absolute value of each element of ndarray x
  4. np.fabs(x): the absolute value of each element of ndarray x, always returned as floats; complex input is not supported
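
As a sketch of the correspondence listed above, each operator gives the same result as its function form. The arrays x and y are made-up examples.

```python
import numpy as np

x = np.array([1, 4, 9])
y = np.array([1, 2, 3])

# Each operator gives the same result as the matching function
print(np.array_equal(x + y, np.add(x, y)))        # True
print(np.array_equal(x - y, np.subtract(x, y)))   # True
print(np.array_equal(x * y, np.multiply(x, y)))   # True
print(np.array_equal(x ** y, np.power(x, y)))     # True

print(x / y)                        # [1. 2. 3.]  true division returns floats
print(np.sqrt(x))                   # [1. 2. 3.]
print(np.negative(x))               # [-1 -4 -9]
print(np.fabs(np.array([-1, 2])))   # [1. 2.]  fabs returns floats
```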

4.2 Statistical operations

  • Commonly used statistics are available both as functions and as methods
  • As a function, the array is passed as a parameter, for example np.max(arr)
  • As a method, it is called on the array, for example arr.max()
  • [Table: statistical operations, not preserved in the source]
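
A minimal sketch of the two calling styles, using max, mean and sum as examples on a made-up array:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Function form and method form return the same value
print(np.max(arr), arr.max())      # 6 6
print(np.mean(arr), arr.mean())    # 3.5 3.5

# With axis, the statistic is computed per column (axis=0) or per row (axis=1)
print(arr.sum(axis=0))    # [5 7 9]
print(arr.mean(axis=1))   # [2. 5.]
```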

4.3 Mathematical functions

4.3.1 Trigonometric functions

  • These functions accept a list (or an array) as the argument and automatically compute the result for each element, so no loop is needed
  • Sine function, etc.: np.sin(), np.cos(), np.tan()
  • Arcsine function, etc.: np.arcsin(), np.arccos(), np.arctan()
  • Conversion between radians and degrees: np.degrees() converts radians to degrees, np.radians() converts degrees to radians
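
The sketch below passes a plain Python list through the conversion and trigonometric functions without any loop. The angle values are arbitrary choices.

```python
import numpy as np

angles_deg = [0, 30, 90]               # a plain Python list
angles_rad = np.radians(angles_deg)    # degrees -> radians

sines = np.sin(angles_rad)             # applied to every element, no loop
print(np.round(sines, 3))

# arcsin undoes sin, and degrees converts the result back to degrees
back = np.degrees(np.arcsin(sines))
print(np.round(back, 3))
```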

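4.3.2 Rounding

  • Includes rounding to a given precision, rounding to the nearest integer, rounding toward 0, rounding up and rounding down
  • Round to a given number of decimals: np.around(arr,decimals=0/1/2…). The parameter arr is an array and decimals sets the precision. Values exactly halfway between two candidates are rounded to the nearest even one
  • Round to the nearest integer: np.rint(arr)
  • Round toward 0: np.fix()
  • Round down: np.floor()
  • Round up: np.ceil()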
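
The five functions can be compared on one made-up array that contains negative values and exact halves, which is where they differ most:

```python
import numpy as np

arr = np.array([-1.7, -0.5, 0.5, 1.5, 2.345])

rounded = np.around(arr, decimals=1)  # one decimal place
nearest = np.rint(arr)                # nearest integer, halves go to the even neighbour
toward0 = np.fix(arr)                 # toward 0
down = np.floor(arr)                  # toward negative infinity
up = np.ceil(arr)                     # toward positive infinity

for name, result in [('around', rounded), ('rint', nearest), ('fix', toward0),
                     ('floor', down), ('ceil', up)]:
    print(name, result)
```

For example, 0.5 is rounded by rint() to 0 and 1.5 to 2, because both halves go to the nearest even integer.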

4.4 Broadcast

  • Two arrays of the same shape can be combined directly with arithmetic operations; elements at the same position are combined
  • Arrays of different shapes can be expanded to a common shape for the calculation. This mechanism is called broadcasting.
  • Broadcasting requires that both arrays can be brought to the same shape before the operation:
  • Rule 1: If the two arrays do not have the same number of dimensions, the shape of the array with fewer dimensions is padded with 1s on the left. For example, when shapes (3,2,4) and (2,4) are added, the second is first padded to (1,2,4) and then expanded to (3,2,4).
  • Rule 2: If the shapes differ in some dimension and one of the arrays has size 1 in that dimension, that array is stretched along that dimension to match the other.
  • Rule 3: If the shapes differ in some dimension and neither array has size 1 there, an exception is raised.

Broadcast method:

  1. Two arrays are broadcast-compatible if, compared from the trailing dimension (starting from the end), each pair of axis lengths either matches or contains a 1. [In practice: either the arrays have different numbers of dimensions but matching trailing axis lengths, or they have the same number of dimensions and one has length 1 on some axis]
  2. Broadcasting takes place along missing dimensions and/or dimensions with an axis length of 1.
  • Method 1: The arrays have different numbers of dimensions, and the axis lengths of the trailing dimensions match.
    • The shape of arr1 is (4,3) and the shape of arr2 is (3,). The former is two-dimensional and the latter one-dimensional.
    • Their trailing dimensions are equal: the second axis of arr1 has length 3, and the only axis of arr2 also has length 3.
    • The shape of arr2 is padded with a 1 on the left to become (1,3), so the numbers of dimensions agree. It is then stretched along axis 0 to the shape (4,3).
  • Method 2: The arrays have the same number of dimensions, one array has length 1 on some axis, and the lengths of the other axes match.
    • The shape of arr1 is (4,3) and the shape of arr2 is (4,1). Both are two-dimensional.
    • The second array has length 1 on axis 1, so it can be broadcast along axis 1
    • In general, when the numbers of dimensions are equal and one array has length 1 on an axis, that array is stretched along the axis of length 1.
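
The two methods and the failing case of rule 3 can be sketched as follows. The array contents are made up; only the shapes follow the text, and the name arr3 for the (4,1) array is an assumption to keep both examples in one script.

```python
import numpy as np

arr1 = np.arange(12).reshape(4, 3)

# Method 1: shapes (4,3) and (3,). arr2 is treated as (1,3), then stretched to (4,3)
arr2 = np.array([10, 20, 30])
print((arr1 + arr2).shape)  # (4, 3)

# Method 2: shapes (4,3) and (4,1). arr3 is stretched along axis 1
arr3 = np.array([[1], [2], [3], [4]])
print((arr1 + arr3).shape)  # (4, 3)

# Rule 3: shapes (4,3) and (4,) differ in the last axis and neither length is 1
try:
    arr1 + np.arange(4)
except ValueError as err:
    print('cannot broadcast:', err)
```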

4.5 Remove duplicate data

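  • Use the function np.unique(data,axis=0/1). With axis omitted, the array is flattened and its sorted unique values are returned; axis=0 removes duplicate rows and axis=1 removes duplicate columns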
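
A short sketch on a made-up array whose first and last rows are identical:

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [1, 2]])

print(np.unique(data))          # [1 2 3 4]  flattened, sorted unique values
print(np.unique(data, axis=0))  # duplicate rows removed
# [[1 2]
#  [3 4]]
```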

4.6 Numpy reads external data files and processes them

  • numpy cannot compute directly with the numeric data in an array that mixes numeric values and strings, because an array that contains strings is no longer a numeric array: every element is stored as a string, with a dtype such as '<U21'.
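
The effect can be reproduced with a small made-up array. The exact width in the dtype (such as 21 or 32) depends on the data and the platform, so this sketch checks only the kind of the dtype.

```python
import numpy as np

mixed = np.array([1, 2.5, 'abc'])
print(mixed.dtype.kind)   # U  (a Unicode string dtype)

# Every element, including the numbers, is now a string,
# so + joins the text instead of adding the values
print(mixed[0] + mixed[1])  # 12.5

# Strings that contain only numbers can be converted back with astype
numbers = np.array(['1', '2.5']).astype(float)
print(numbers.sum())  # 3.5
```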

5. Use Numpy to process images

Origin blog.csdn.net/QwwwQxx/article/details/124728695