Summary of commonly used scientific computing methods in the NumPy package

For array computations, the numpy package is typically tens to hundreds of times faster than the equivalent pure Python loops.
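A rough way to check the speed claim on your own machine is to time the same calculation both ways. This is only a sketch: the array size `n` and the sum-of-squares workload are arbitrary choices made for illustration, and the measured ratio depends on the machine and the operation.

```python
import timeit
import numpy as np

n = 100_000                             # arbitrary example size
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# The same computation written two ways: the sum of squares of 0..n-1.
py_result = sum(x * x for x in py_list)
np_result = int((np_array ** 2).sum())

# Time 10 repetitions of each version.
py_time = timeit.timeit(lambda: sum(x * x for x in py_list), number=10)
np_time = timeit.timeit(lambda: (np_array ** 2).sum(), number=10)

print("results equal:", py_result == np_result)
print(f"pure Python: {py_time:.4f}s, numpy: {np_time:.4f}s")
```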

1. Import the numpy package:

import numpy as np

2. Create an array:

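# Define a Python list
values = [1,2,3]
# Wrap the Python list in a numpy array:
array = np.array(values)
# The type of array is numpy.ndarray:
type(array)  # numpy.ndarray
# Number of dimensions of the array:
array.ndim # 1 for a 1-dimensional array; an n-dimensional array returns n
# Shape of the array:
array.shape # (3,) means a 1-dimensional array with 3 elements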

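# Reshape the array:
array.reshape(3,1) # Returns the data of array arranged as 3 rows and 1 column (array itself is not modified). Note that the reshape arguments cannot be arbitrary: the new shape must hold exactly the same number of elements as the original array, otherwise reshape raises an error.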
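As a small sketch of that rule, the example below uses a 6-element array (chosen here only for illustration) to show one valid reshape, the `-1` placeholder that lets numpy infer a dimension, and a shape that fails because the element counts do not match.

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6])

two_by_three = array.reshape(2, 3)    # valid: 2 * 3 == 6 elements
three_by_two = array.reshape(3, -1)   # -1 asks numpy to infer the size (2)
print(two_by_three.shape, three_by_two.shape)

reshape_failed = False
try:
    array.reshape(4, 2)               # invalid: 4 * 2 == 8 != 6
except ValueError as exc:
    reshape_failed = True
    print("reshape failed:", exc)
```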

3. Use np to create test data

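# Set the random seed. With a fixed seed, np generates the same random numbers on every run
np.random.seed(1)
# Generate an array of random numbers
np.random.randint(low = 0,high = 101,size=10) # 10 random integers from 0 (low, inclusive) to 100 (high itself is excluded)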

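np.random.randint(low = 0,high = 101,size = (2,5,3)) # Random integers from 0 (low, inclusive) to 100 (high itself is excluded) in a three-dimensional array: the first dimension has 2 entries, the second has 5 and the third has 3. A concrete way to picture it: there are 2 classes, each class has 5 students, and each student takes 3 courses. This expression builds the array of scores for every student in both classes.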

num = 10 # example length; any positive integer works
indices = np.arange(num) # array of the consecutive integers 0 to num-1
np.random.shuffle(indices) # shuffle the array in place
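The sketch below puts these calls together: it checks that resetting the seed replays the same numbers, confirms the shape of the three-dimensional score array, and shows that `shuffle` reorders the array in place without changing its contents. The variable names are only illustrative.

```python
import numpy as np

np.random.seed(1)
first = np.random.randint(low=0, high=101, size=10)

np.random.seed(1)                      # same seed, same sequence
second = np.random.randint(low=0, high=101, size=10)
print(np.array_equal(first, second))   # True

scores = np.random.randint(low=0, high=101, size=(2, 5, 3))
print(scores.shape)                    # (2, 5, 3)

indices = np.arange(10)
np.random.shuffle(indices)             # in place; the return value is None
print(sorted(indices.tolist()) == list(range(10)))  # True
```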

4. Commonly used calculation formulas for np

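# Adding, subtracting, multiplying or dividing an np array by a number applies the operation to every element of the array, e.g.:
array = np.random.randint(low = 0,high = 101,size = (2,3))
#array([[91, 33, 38],[10, 77, 99]])
array+1
#array([[ 92,  34,  39],[ 11,  78, 100]])
array**2 # square
#array([[8281, 1089, 1444],[ 100, 5929, 9801]])
np.abs(array) # absolute value of each element
np.sqrt(array) # square root of each element
np.sin(array) # sine of each element
array.sum(axis=0) # Sum of the array. axis=0 adds the rows together vertically (one total per column); axis=1 adds horizontally (one total per row). With no axis given, all elements are summed.
# For example:
a = np.array([[1,2,3],[4,5,6]])
a.sum()  # sum of all elements, 21
a.sum(axis=0) # sum along axis 0: the rows are added element by element, giving array([5, 7, 9])
a.sum(axis=1) # sum along axis 1: each row is added horizontally, giving array([ 6, 15])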

array.mean() # mean; takes the same axis parameter as sum()

array.std() # standard deviation; takes the same axis parameter as sum()
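To make the `axis` parameter concrete, here is a short sketch on the same 2 x 3 array: `axis=0` collapses the rows and leaves one value per column, while `axis=1` collapses the columns and leaves one value per row. The result names are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

col_sum = a.sum(axis=0)     # [5 7 9]        one value per column
row_sum = a.sum(axis=1)     # [ 6 15]        one value per row
col_mean = a.mean(axis=0)   # [2.5 3.5 4.5]
row_mean = a.mean(axis=1)   # [2. 5.]
row_std = a.std(axis=1)     # population standard deviation of each row

print(col_sum, row_sum, col_mean, row_mean, row_std)
```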

5. Calculation between vectors

1. The size of the vector (the modulus of the vector)

For a vector a = [x1, x2, ..., xn], the formula is:

|a| = sqrt(x1 ** 2 + x2 ** 2 + ... + xn ** 2)

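# Example: treat each column of array as one vector
array = np.array([[1,2,3],[4,5,6]])
# Sum of the squares of the elements of each column:
a= (array**2).sum(axis=0)
# Taking the square root gives the modulus of each column vector
np.sqrt(a)

# The above is the manual calculation. np also provides a direct method: linalg is the linear algebra toolkit
np.linalg.norm(array, axis=0) # without axis, norm(array) returns a single value computed over all elements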
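The following sketch checks the manual formula against `np.linalg.norm`, first for a single vector and then for a 2-D array. It also shows that calling `norm` on a 2-D array without `axis` returns one number over all elements rather than one modulus per column.

```python
import numpy as np

v = np.array([3.0, 4.0])
manual = np.sqrt((v ** 2).sum())
print(manual, np.linalg.norm(v))      # 5.0 5.0

m = np.array([[1, 2, 3], [4, 5, 6]])
col_norms = np.sqrt((m ** 2).sum(axis=0))           # one modulus per column
print(np.allclose(col_norms, np.linalg.norm(m, axis=0)))  # True

whole = np.linalg.norm(m)             # sqrt(1 + 4 + 9 + 16 + 25 + 36)
print(np.isclose(whole, np.sqrt(91)))                # True
```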

2. Find the inner product of two vectors

Formula: for a = [x1, x2, ..., xn] and b = [y1, y2, ..., yn], a · b = x1 * y1 + x2 * y2 + ... + xn * yn (Question: how are a and b expressed for multi-dimensional arrays?)

a = np.array([1,2,3])
b = np.array([4,5,6])
# Three ways to write the inner product:
a @ b
a.dot(b)
np.dot(a,b)

The meaning of inner product: a · b = |a| * |b| * cos(theta), where theta is the angle between a and b.

3. Find the cosine similarity of two vectors

Formula: cos(theta) = (a · b) / (|a| * |b|)

This is a rearrangement of the inner product formula, and its value range is [-1,1]. Cosine similarity is the cosine of the angle between two vectors: the smaller the angle, the closer the value is to 1 and the more similar the vectors are.

a = np.array([1,2,3])
b = np.array([4,5,6])
# Cosine similarity:
a @ b /(np.linalg.norm(a)*np.linalg.norm(b))
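To see the [-1,1] range in practice, the sketch below wraps the expression in a small helper (`cosine_similarity` is a name made up for this example, not a numpy function) and applies it to vectors pointing in the same, opposite and perpendicular directions.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors (illustrative helper)."""
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

same = cosine_similarity(a, 2 * a)        # same direction: close to 1
opposite = cosine_similarity(a, -a)       # opposite direction: close to -1
perpendicular = cosine_similarity(np.array([1, 0]), np.array([0, 1]))  # 0
similar = cosine_similarity(a, b)         # about 0.9746

print(same, opposite, perpendicular, similar)
```

The helper assumes neither vector is all zeros, since a zero modulus would cause a division by zero.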

4. Find the Euclidean distance of two vectors

Formula: for a = [x1, x2, ..., xn] and b = [y1, y2, ..., yn], d = sqrt((x1 - y1) ** 2 + (x2 - y2) ** 2 + ... + (xn - yn) ** 2)

Represents the distance between two vectors. The smaller the distance, the more similar they are.

a = np.array([4,5,6])
b = np.array([1,2,3])
np.sqrt(((a-b)**2).sum()) # sum the squared differences before taking the square root
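As a sketch of the distance formula: square the element-wise differences, add them up, and take the square root. The modulus of the difference vector, `np.linalg.norm(a - b)`, gives the same value.

```python
import numpy as np

a = np.array([4, 5, 6])
b = np.array([1, 2, 3])

manual = np.sqrt(((a - b) ** 2).sum())   # sqrt(9 + 9 + 9)
builtin = np.linalg.norm(a - b)          # modulus of the difference vector

print(manual, builtin)                   # both are sqrt(27), about 5.196
```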

5. Variance and standard deviation formulas
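The variance is the mean of the squared deviations from the mean, and the standard deviation is the square root of the variance. The sketch below computes both by hand and compares them with `np.var` and `np.std`. The sample data is made up for illustration, and it relies on numpy's default of dividing by n (pass `ddof=1` to divide by n - 1 instead).

```python
import numpy as np

x = np.array([2, 4, 4, 4, 5, 5, 7, 9])   # made-up sample data

mean = x.mean()                           # 5.0
variance = ((x - mean) ** 2).mean()       # 4.0
std = np.sqrt(variance)                   # 2.0

print(np.isclose(variance, np.var(x)))    # True
print(np.isclose(std, np.std(x)))         # True

sample_variance = np.var(x, ddof=1)       # divides by n - 1: 32 / 7
print(sample_variance)
```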

6. The broadcast mechanism

Addition, subtraction, multiplication and division of two arrays of the same shape are performed element by element. Whether two arrays of different shapes can be operated on directly depends on whether the smaller array can be expanded to the shape of the larger one through the broadcast mechanism. Broadcasting aligns the two arrays by conceptually copying rows or columns, and then performs the addition, subtraction, multiplication or division element by element.
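A minimal sketch of broadcasting, with shapes chosen only for illustration: a (3,) row and a (2, 1) column can each be stretched to match a (2, 3) matrix, while a (2,) array cannot be aligned with it.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])                # shape (3,)
col = np.array([[100], [200]])              # shape (2, 1)

plus_row = matrix + row    # row is reused for each of the 2 rows
# [[11 22 33]
#  [14 25 36]]
plus_col = matrix + col    # col is reused for each of the 3 columns
# [[101 102 103]
#  [204 205 206]]
print(plus_row)
print(plus_col)

broadcast_failed = False
try:
    matrix + np.array([1, 2])   # (2, 3) and (2,): last dimensions 3 and 2 differ
except ValueError as exc:
    broadcast_failed = True
    print("cannot broadcast:", exc)
```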

7. Find the inner product of two matrices

As mentioned above, the inner product of two vectors multiplies the elements at corresponding positions and then adds the products up. So how is the inner product of two matrices found? For matrix A (m rows and n columns) and matrix B (n rows and l columns), the condition that must be met is that the number of columns of A equals the number of rows of B. The result is a matrix with m rows and l columns, in which the element in row i and column j is the inner product of row i of A and column j of B.

Note that this is the method for finding inner products, which should be distinguished from the broadcast mechanism above. The broadcast mechanism applies to element-wise addition, subtraction, multiplication and division. In other words, two matrices whose inner product can be found cannot necessarily be added, subtracted, multiplied or divided element by element.
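The sketch below illustrates both points with made-up matrices: a (2, 3) matrix times a (3, 4) matrix gives a (2, 4) result whose entries are row-by-column inner products, yet the same two matrices cannot be added element by element.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)     # [[0 1 2], [3 4 5]]
B = np.arange(12).reshape(3, 4)    # [[0 1 2 3], [4 5 6 7], [8 9 10 11]]

C = A @ B                          # columns of A (3) == rows of B (3)
print(C.shape)                     # (2, 4)

# Each entry is the inner product of a row of A and a column of B:
# C[0, 0] = 0*0 + 1*4 + 2*8 = 20
print(C[0, 0], A[0, :] @ B[:, 0])

add_failed = False
try:
    A + B                          # shapes (2, 3) and (3, 4) cannot be broadcast
except ValueError as exc:
    add_failed = True
    print("element-wise addition failed:", exc)
```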

8. Transpose a matrix (rows become columns)

Three ways of expression:

A.T
A.transpose()
np.transpose(a=A)
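A short sketch of the three forms on a concrete matrix. One detail worth knowing: transposing a 1-dimensional array leaves it unchanged, so turning it into a column requires `reshape`. The example values are made up.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])        # shape (2, 3)

print(A.T.shape)                            # (3, 2)
print(np.array_equal(A.T, A.transpose()))   # True
print(np.array_equal(A.T, np.transpose(A))) # True

v = np.array([1, 2, 3])                     # 1-dimensional, shape (3,)
print(v.T.shape)                            # (3,)  unchanged by .T
column = v.reshape(-1, 1)                   # explicit column, shape (3, 1)
print(column.shape)
```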

9. Find the inverse of a matrix:

The inverse matrix is defined as follows: if A is an n-order square matrix and there is another n-order square matrix B such that AB = BA = E, then A is said to be invertible, and B is said to be the inverse matrix of A. Note that not all matrices have inverse matrices. AB refers to the inner product (matrix product) of A and B. E refers to the identity matrix, whose diagonal elements are 1 and all other elements are 0.

# Find the inverse of matrix A
np.linalg.inv(A)

# Verify
np.allclose(np.linalg.inv(A) @ A, np.eye(A.shape[0])) # True means the two matrices are inverses of each other
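Here is a self-contained sketch with a small matrix chosen for illustration. It verifies AB = BA = E numerically with `np.allclose`, and shows that `np.linalg.inv` raises `LinAlgError` for a matrix that has no inverse.

```python
import numpy as np

A = np.array([[4.0, 7.0], [2.0, 6.0]])   # determinant 4*6 - 7*2 = 10
A_inv = np.linalg.inv(A)                 # [[0.6, -0.7], [-0.2, 0.4]]

print(np.allclose(A @ A_inv, np.eye(2))) # True
print(np.allclose(A_inv @ A, np.eye(2))) # True

singular = np.array([[1.0, 2.0], [2.0, 4.0]])  # second row = 2 * first row
inverse_failed = False
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as exc:
    inverse_failed = True
    print("not invertible:", exc)
```

`np.allclose` is used instead of `==` because floating-point rounding means the product is only approximately the identity matrix.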

6. Practical application

Now there are 50 students, each with 3 course scores. Use np to randomly generate the score array of 50 students:

array = np.random.randint(low = 0,high = 101,size = (50,3))

Look up the second course score of student No. 3:

array[2,1]

Select all students who failed Chinese:

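# Select all rows and all columns:
array[:,:] # the first : means all rows, the second : means all columns
# Select the students whose Chinese score is below 60: the rows are those where the Chinese score (column 0) is below 60, and the columns are all columns:
array[array[:,0]<60,:]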

Find all students who passed Chinese language but failed English:

array[(array[:,0]>=60) & (array[:,2]<60),:] # each comparison needs its own parentheses, because & binds more tightly than >= and <
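To make the boolean selection easy to check by eye, the sketch below uses a small hand-written score table instead of random data. As in the expressions above, column 0 is taken to be Chinese and column 2 English, and 60 is the pass mark; the values themselves are made up.

```python
import numpy as np

# Rows are students; column 0 is Chinese, column 2 is English.
scores = np.array([
    [75, 80, 55],
    [58, 90, 70],
    [60, 40, 59],
    [90, 95, 85],
])

failed_chinese = scores[scores[:, 0] < 60, :]
print(failed_chinese)              # [[58 90 70]]

# Combine conditions with &, wrapping each comparison in parentheses.
mask = (scores[:, 0] >= 60) & (scores[:, 2] < 60)
print(mask)                        # [ True False  True False]

passed_chinese_failed_english = scores[mask, :]
print(passed_chinese_failed_english)
# [[75 80 55]
#  [60 40 59]]
```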

Average score for all students, all grades:

array.mean()

Mean and standard deviation of each course across all students:

array.mean(axis=0)
array.std(axis=0)

Origin blog.csdn.net/qq1309664161/article/details/132684835