I have wanted to write a series of articles on data analysis for a long time, but work has kept me busy. I have finally found some time to write this one, on numpy.
This article mainly introduces how numpy is used: some commonly used functions, and operations on arrays and matrices.
1. What is numpy?
NumPy is a powerful Python library mainly used to perform calculations on multi-dimensional arrays. NumPy provides a large number of library functions and operations, which help programmers perform numerical calculations easily. This type of numerical calculation is widely used for the following tasks:
- Machine learning models: Machine learning algorithms involve many numerical computations on matrices, such as matrix multiplication, transposition, and addition. NumPy makes these computations simple to write and fast to run. NumPy arrays are also used to store training data and model parameters.
- Image processing and computer graphics: An image is represented in a computer as a multi-dimensional array of numbers, so NumPy is a natural choice here. NumPy also provides library functions for processing images quickly.
- Mathematical tasks: NumPy is useful for tasks such as numerical integration, differentiation, interpolation, and extrapolation. For this kind of work it serves as a fast, Python-based alternative to MATLAB.
2. Functions in numpy
1. np.array()
array: The array module that comes with Python only supports one-dimensional arrays, not multi-dimensional ones.
np.array: np.array in numpy makes up for this shortcoming and can create arrays with any number of dimensions. The dtype parameter specifies the type of the elements in the array.
import numpy as np
# The usual ways of creating an array
a1 = np.array([1, 2, 4])
a2 = np.array([[1.0, 2.0], [3.0, 4.0]])
a3 = np.array([[1, 2], [3, 4]], dtype=complex)  # specify the data type; complex is the complex-number type
print(a1, "\n.a2.\n", a2, "\n.a3.\n", a3)
"""
Output:
[1 2 4]
.a2.
 [[1. 2.]
 [3. 4.]]
.a3.
 [[1.+0.j 2.+0.j]
 [3.+0.j 4.+0.j]]
"""
2. np.arange()
arange function: creates an array containing an arithmetic sequence
np.arange([start, ]stop, [step, ]dtype=None)
- start: the starting value; optional, defaults to 0
- stop: the end value; the generated elements do not include it
- step: the step size; optional, defaults to 1
- dtype: the data type of the elements; defaults to None, in which case it is inferred from the other arguments
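To make the parameters above concrete, here is a minimal sketch. The variable names are illustrative and not part of the original code:

```python
import numpy as np

# stop is always excluded, so the last element is the largest value below stop
evens = np.arange(0, 10, 2)     # 0, 2, 4, 6, 8
thirds = np.arange(1, 10, 3)    # 1, 4, 7
down = np.arange(5, 0, -1)      # a negative step counts down: 5, 4, 3, 2, 1

print(evens, thirds, down)
print(evens.dtype.kind)                 # 'i': integer arguments give an integer array
print(np.arange(3, dtype='f').dtype)    # float32, selected by the 'f' character code
```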
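nd1 = np.arange(5)     # array([0, 1, 2, 3, 4]): the 5 integers starting from 0
nd2 = np.arange(1, 5)  # array([1, 2, 3, 4])
# nd3: starts at 1 with a step of 2; 5 is the end value and is not included (closed on the left, open on the right)
nd3 = np.arange(1, 5, 2)
print("nd1:", nd1)
print("nd2:", nd2)
print("nd3:", nd3)
# Create arrays using type character codes
# Create a single-precision floating-point array
m3 = np.arange(7, dtype='f')
print("m3:", m3)
# Create a complex array
m4 = np.arange(7, dtype='D')
print("m4:", m4)
"""
Output:
nd1: [0 1 2 3 4]
nd2: [1 2 3 4]
nd3: [1 3]
m3: [0. 1. 2. 3. 4. 5. 6.]
m4: [0.+0.j 1.+0.j 2.+0.j 3.+0.j 4.+0.j 5.+0.j 6.+0.j]
"""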
3. np.reshape()
np.reshape(): gives the array a new shape while leaving its element values unchanged
np.reshape(a, newshape, order='C') parameter description:
- a: the array to be reshaped
- newshape: an integer or a tuple of integers. The new shape must be compatible with the original shape. An integer gives the length of a one-dimensional array. In a tuple, one element may be -1, in which case that dimension is inferred from the length of the array and the remaining dimensions
- order: optional; the index order used to read and place the elements, 'C' (row-major) by default
a = np.arange(15)
b = a.reshape(3, 5)  # convert the one-dimensional array into 3 rows and 5 columns
print(a)
print("Reshaped array:", b)
"""
Output:
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
Reshaped array: [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
"""
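The -1 placeholder described for newshape is not used in the example above, so here is a small sketch of it. The 12-element array is an arbitrary choice for illustration:

```python
import numpy as np

a = np.arange(12)

# -1 asks NumPy to infer that dimension from the total size (12 / 3 = 4)
b = a.reshape(3, -1)
print(b.shape)        # (3, 4)

# np.reshape(a, shape) is the function form of the same operation
c = np.reshape(a, (2, -1))
print(c.shape)        # (2, 6)

# An incompatible shape raises ValueError instead of silently dropping elements
try:
    a.reshape(5, -1)
except ValueError as exc:
    print("cannot reshape:", exc)
```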
4. np.ndim()
np.ndim(): returns the number of dimensions of the array as a single integer
# ndim returns a single number: the number of dimensions of the array
print("Dimensions of a:", a.ndim)
print("Dimensions of b:", b.ndim)
"""
Output:
Dimensions of a: 1
Dimensions of b: 2
"""
5. np.shape()
np.shape(): returns a tuple giving the size of the array along each dimension
a = np.arange(15)
b = a.reshape(3, 5)
c = np.array([[[1, 4, 6], [2, 5, 7]], [[5, 2, 1], [4, 5, 7]]])
print("Dimensions of a, b, c:", a.ndim, b.ndim, c.ndim)
print("Array a:", a)
print("Array b:", b)
print("Array c:", c)
# shape: a tuple giving the size of each dimension
print("Shape of a:", a.shape)
print("Shape of b:", b.shape)
print("Shape of c:", c.shape)
"""
Output:
Dimensions of a, b, c: 1 2 3
Array a: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
Array b: [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
Array c: [[[1 4 6]
  [2 5 7]]

 [[5 2 1]
  [4 5 7]]]
Shape of a: (15,)
Shape of b: (3, 5)
Shape of c: (2, 2, 3)
"""
For a one-dimensional array: why is the shape (15,) and not (1, 15)? Because a.ndim is 1, the tuple contains only one number.
For a two-dimensional array: the first number is the number of rows and the second is the number of columns. Its ndim is 2, so the tuple contains two numbers.
For a three-dimensional array: the structure is harder to see, so look again at c as printed above.
Array c:
[[[1 4 6]
  [2 5 7]]
 [[5 2 1]
  [4 5 7]]]
Look at the outermost brackets first. They contain [[1,4,6],[2,5,7]] and [[5,2,1],[4,5,7]]. Call these A and B, so that c is [A, B]. If A and B were plain numbers, the shape would be (2,), and this 2 is the first number of the shape. But A and B are themselves arrays of shape (2, 3). Putting the two together gives the shape of c, which is (2, 2, 3).
The shapes of 4-dimensional and 5-dimensional arrays can be worked out in the same way.
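As a rough illustration of that last step, the sketch below builds a 4-dimensional array by wrapping copies of c in one more pair of brackets. The array d is a hypothetical example, not from the original code:

```python
import numpy as np

# The 3-D array c from above, shape (2, 2, 3)
c = np.array([[[1, 4, 6], [2, 5, 7]], [[5, 2, 1], [4, 5, 7]]])

# Three copies of c inside one more pair of brackets add a leading axis of length 3
d = np.array([c, c, c])

print(c.ndim, c.shape)   # 3 (2, 2, 3)
print(d.ndim, d.shape)   # 4 (3, 2, 2, 3)

# The length of the shape tuple always equals ndim
print(len(d.shape) == d.ndim)   # True
```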
6. np.dtype()
dtype: an object that describes the data type of an array's elements. The dtype attribute of an array returns it.
print("Data type:", a.dtype)  # check the data type
"""
Output:
Data type: int64
"""
Since the values in the array are all integers, int64 is returned (on some platforms, such as Windows with older NumPy versions, the default integer type is int32). If the array contained values with a decimal point, float64 would be returned.
Someone may ask: shouldn't integer data be int, and floating-point data be float?
Answer: int32, int64 and float64 are data types defined by the NumPy library.
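A short sketch of how the dtype follows from the data, with illustrative array names. The default integer type is platform dependent, so the example only checks its kind:

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1, 2.5, 3])     # one value with a decimal point makes the whole array float
explicit = np.array([1, 2, 3], dtype=np.int32)

print(ints.dtype.kind)     # 'i': a signed integer type (int64 on most platforms)
print(floats.dtype)        # float64
print(explicit.dtype)      # int32

# Python's built-in types are mapped to NumPy dtypes
print(np.dtype(float))     # float64
print(np.dtype(complex))   # complex128
```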
7. np.astype()
astype(): converts the data type of an array. It is called as a method on the array and returns a new array
vec_1 = np.array(['1', '2', '3'])
vec_2 = vec_1.astype('float')
print("Type before conversion:", vec_1.dtype)
print(vec_1)
print("Type after conversion:", vec_2.dtype)
print(vec_2)
"""
Output:
Type before conversion: <U1
['1' '2' '3']
Type after conversion: float64
[1. 2. 3.]
"""
Note: float is a built-in Python type, but it can be passed to numpy, which maps Python types to the equivalent dtype.
There are other type conversions (you can test them yourself):
int32 --> float64: works without any loss
float64 --> int32: the fractional part is truncated
string --> float64: if every string in the array represents a number, astype can also convert it to a numeric type
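The three conversions listed above can be tried with a sketch like the following. The sample values are arbitrary:

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)
floats = np.array([1.7, -2.7, 3.2])
strings = np.array(['1.5', '2', '3'])

as_float = ints.astype(np.float64)      # values 1.0, 2.0, 3.0
as_int = floats.astype(np.int32)        # fraction dropped, truncating toward zero: 1, -2, 3
from_str = strings.astype(np.float64)   # values 1.5, 2.0, 3.0

print(as_float, as_int, from_str)

# astype returns a new array; the original keeps its dtype
print(floats.dtype)                     # float64
```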
8. np.min(), np.max()
np.min(): gets the smallest element in the array
np.max(): gets the largest element in the array
print("Smallest value in the array:", b.min())
print("Largest value in the array:", b.max())
"""
Output:
Smallest value in the array: 0
Largest value in the array: 14
"""
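Both functions also accept an axis argument, which the example above does not use. A minimal sketch, rebuilding the same 3x5 array b so the block runs on its own:

```python
import numpy as np

b = np.arange(15).reshape(3, 5)

print(b.min(), b.max())      # 0 14: over the whole array
col_min = b.min(axis=0)      # minimum of each column: 0, 1, 2, 3, 4
row_max = b.max(axis=1)      # maximum of each row: 4, 9, 14
print(col_min, row_max)

# np.min / np.max are the function forms of the same methods
print(np.min(b) == b.min(), np.max(b) == b.max())   # True True
```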
9. np.zeros(), np.ones()
np.zeros(): returns an array of the given shape and type filled with 0
np.ones(): returns an array of the given shape and type filled with 1
zeros(shape, dtype=float, order='C')
ones(shape, dtype=float, order='C')
- shape: the shape of the array
- dtype: the data type; optional, defaults to np.float64
- order: optional; the memory layout, 'C' (row-major) by default
np.zeros((3, 4))  # a 3-row, 4-column matrix of zeros
np.ones((2, 3, 4), dtype=np.int32)  # a 3-dimensional array of ones: 2 blocks of 3 rows and 4 columns
"""
Output:
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],
       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int32)
"""
10. np.random
np.random: the random sampling module, used to generate random data
np.random.rand(d0, d1, ..., dn): returns random samples of the given shape, drawn uniformly from [0, 1)
np.random.rand()      # returns a single random value in [0, 1)
np.random.rand(2, 3)  # returns a 2-row, 3-column matrix of random values in [0, 1)
"""
Output:
np.random.rand(): 0.6759565926081442
np.random.rand(2, 3):
array([[0.50718076, 0.55366315, 0.8955532 ],
       [0.78137634, 0.67035372, 0.56846724]])
"""
np.random.randn(d0, d1, ..., dn): returns one or more samples drawn from the standard normal distribution
np.random.randn()
np.random.randn(2, 3)
"""
Output:
np.random.randn(): -1.723572225201004
np.random.randn(2, 3):
array([[ 0.5517019 ,  0.94568245, -0.73698193],
       [ 0.18349642, -1.63614097,  0.74157234]])
"""
np.random.randint(low[, high, size]): returns random integers n with low <= n < high, that is, from the half-open interval [low, high). If high is omitted, the integers are drawn from [0, low)
- size: the shape of the returned array of random integers
np.random.randint(4, size=8)
np.random.randint(4, size=(2, 3))
"""
Output:
array([1, 3, 0, 0, 2, 0, 0, 3])
array([[2, 3, 2],
       [0, 0, 0]])
"""
np.random.random(size):
Returns an array of the specified size containing random numbers from [0.0, 1.0). random_sample, ranf and sample do the same thing.
np.random.random()
np.random.random(size=[2, 3])
np.random.random((2, 3))  # a 2-row, 3-column matrix with values in [0, 1)
"""
Output:
0.7893031677602365
array([[0.78053854, 0.18131081, 0.82077647],
       [0.43697461, 0.91715564, 0.05549399]])
array([[0.38284667, 0.94033216, 0.10049719],
       [0.08550353, 0.83507381, 0.70694944]])
"""
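Because the values differ on every run, the ranges described above are easier to check than the values themselves. The sketch below does that. The seed value 0 and the sample size 1000 are arbitrary choices made here for repeatability:

```python
import numpy as np

np.random.seed(0)  # fixed seed (an arbitrary choice) so the run is repeatable

u = np.random.rand(1000)               # uniform on [0, 1)
n = np.random.randn(1000)              # standard normal
k = np.random.randint(4, size=1000)    # integers from [0, 4)
r = np.random.random((2, 3))           # uniform on [0, 1), shape given as a tuple

print(u.min() >= 0.0, u.max() < 1.0)   # True True
print(sorted(set(k.tolist())))         # only values from 0, 1, 2, 3
print(r.shape)                         # (2, 3)
print(n.mean(), n.std())               # roughly 0 and 1
```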
11. np.linspace()
np.linspace(): returns evenly spaced numbers (as an array) over the specified range, that is, an arithmetic sequence
np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None):
- start: the starting value
- stop: the end value
- num: the number of elements; defaults to 50
- endpoint: whether to include the stop value; defaults to True, which includes it; if False, the stop value is not included
- retstep: the form of the return value; defaults to False, which returns only the array; if True, the tuple (samples, step) is returned
- dtype: the data type of the result; defaults to None, in which case it is inferred from start and stop
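The effect of endpoint and retstep on the step size can be checked with a short sketch. The values 0, 90 and 10 are just illustrative choices:

```python
import numpy as np

start, stop, num = 0, 90, 10

# endpoint=True: stop is the last sample, so there are num - 1 intervals
samples, step = np.linspace(start, stop, num, retstep=True)
print(step == (stop - start) / (num - 1))     # True (step is 10.0)

# endpoint=False: stop is excluded, so the range is divided into num intervals
samples_open, step_open = np.linspace(start, stop, num, endpoint=False, retstep=True)
print(step_open == (stop - start) / num)      # True (step is 9.0)

print(samples[-1], samples_open[-1])          # 90.0 81.0
```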
# linspace creates an arithmetic sequence; here, 10 values from 0 to 90
np.linspace(0, 90, 10)
np.linspace(0, 90, 10, endpoint=False)
np.linspace(0, 90, 10, retstep=True)  # also return the step size
"""
Output:
array([ 0., 10., 20., 30., 40., 50., 60., 70., 80., 90.])
array([ 0.,  9., 18., 27., 36., 45., 54., 63., 72., 81.])
(array([ 0., 10., 20., 30., 40., 50., 60., 70., 80., 90.]), 10.0)
"""
12. np.ravel(), np.flatten()
Both functions flatten a multi-dimensional array to one dimension. The difference is whether a copy or a view is returned.
flatten() (an array method) returns a copy; modifying the copy does not affect the original matrix
ravel() returns a view whenever possible; modifying the view then affects the original matrix
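One way to check the copy/view distinction directly is np.shares_memory. This is a small sketch, and the array name w is illustrative:

```python
import numpy as np

w = np.array([[1, 2], [3, 4]])

print(np.shares_memory(w, w.flatten()))  # False: flatten always copies
print(np.shares_memory(w, w.ravel()))    # True: contiguous input, so ravel returns a view

# For a non-contiguous input (here a transpose), ravel has to copy as well
wt = w.T
print(np.shares_memory(wt, wt.ravel()))  # False
```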
# Changing the dimensions of an array
w1 = np.array([[1, 2], [3, 4]])
w1_f = w1.flatten()  # flatten the array; flatten allocates new memory to hold the result
w1_r = w1.ravel()    # flatten the array; returns a view of the array
print("w1_f", w1_f)
print("w1_r", w1_r)
# The difference between the two
w1_f[0] = 10
print('w1:', w1)
w1_r[0] = 10
print('w1:', w1)
"""
Output:
w1_f [1 2 3 4]
w1_r [1 2 3 4]
w1: [[1 2]
 [3 4]]
w1: [[10  2]
 [ 3  4]]
"""
13. np.resize()
resize(): changes the shape of the array. The example below uses the array method b.resize(), which modifies the array in place
b = np.arange(24).reshape(2, 3, 4)  # a three-dimensional array
b.shape = (6, 4)  # now a 6x4 two-dimensional array
print(b)
b.resize((2, 12))
print(b)
"""
Output:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]
[[ 0  1  2  3  4  5  6  7  8  9 10 11]
 [12 13 14 15 16 17 18 19 20 21 22 23]]
"""
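The function np.resize behaves differently from the method used above: it returns a new array and, if the new shape is larger, repeats the data to fill it. A minimal sketch with an illustrative 4-element array:

```python
import numpy as np

a = np.arange(4)

# np.resize (the function) returns a new array and repeats the data to fill a larger shape
grown = np.resize(a, (2, 3))
print(grown)          # rows [0 1 2] and [3 0 1]

# A smaller shape just takes the leading elements
shrunk = np.resize(a, (1, 3))
print(shrunk)         # a single row [0 1 2]

# The original array is unchanged
print(a.shape)        # (4,)
```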
14. np.transpose()
np.transpose(): transposes a matrix
# transpose returns the transposed matrix
b = np.arange(24).reshape(2, 3, 4)  # a three-dimensional array
b.shape = (6, 4)  # now a 6x4 two-dimensional array
b.transpose()
"""
Output:
array([[ 0,  4,  8, 12, 16, 20],
       [ 1,  5,  9, 13, 17, 21],
       [ 2,  6, 10, 14, 18, 22],
       [ 3,  7, 11, 15, 19, 23]])
"""
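As a brief sketch of what transposing does to shapes and indices, the example below also shows the .T shorthand and an explicit axis order for a 3-dimensional array. The array c is an illustrative addition:

```python
import numpy as np

b = np.arange(24).reshape(6, 4)

# .T is shorthand for transpose() with the axes reversed
print(b.T.shape)                           # (4, 6)
print(np.array_equal(b.T, b.transpose()))  # True

# Element (i, j) of b becomes element (j, i) of b.T
print(b[5, 1] == b.T[1, 5])                # True

# For higher dimensions, an explicit axis order can be given
c = np.arange(24).reshape(2, 3, 4)
print(c.transpose(1, 0, 2).shape)          # (3, 2, 4)
print(c.transpose().shape)                 # (4, 3, 2)
```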
15. Combining arrays
Horizontal combination:
np.hstack(), np.concatenate() with axis=1: join the arrays side by side in the horizontal direction
# Combining arrays
a = np.arange(9).reshape(3, 3)
b = 2 * a
print("a:", a)
print("b:", b)
# Horizontal combination
hs = np.hstack((a, b))
con_hs = np.concatenate((a, b), axis=1)
print("hs:", hs)
print("con_hs:", con_hs)
"""
Output:
a: [[0 1 2]
 [3 4 5]
 [6 7 8]]
b: [[ 0  2  4]
 [ 6  8 10]
 [12 14 16]]
hs: [[ 0  1  2  0  2  4]
 [ 3  4  5  6  8 10]
 [ 6  7  8 12 14 16]]
con_hs: [[ 0  1  2  0  2  4]
 [ 3  4  5  6  8 10]
 [ 6  7  8 12 14 16]]
"""
Vertical combination:
np.vstack(), np.concatenate() with axis=0: stack the arrays on top of each other in the vertical direction
# Vertical combination
vs = np.vstack((a, b))
con_vs = np.concatenate((a, b), axis=0)
print("vs:", vs)
print("con_vs:", con_vs)
"""
Output:
vs: [[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 0  2  4]
 [ 6  8 10]
 [12 14 16]]
con_vs: [[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 0  2  4]
 [ 6  8 10]
 [12 14 16]]
"""
Depth combination:
np.dstack(): combines the corresponding elements of the arrays along a third axis
np.dstack((a, b))
"""
Output:
array([[[ 0,  0],
        [ 1,  2],
        [ 2,  4]],
       [[ 3,  6],
        [ 4,  8],
        [ 5, 10]],
       [[ 6, 12],
        [ 7, 14],
        [ 8, 16]]])
"""
Column combination:
np.column_stack(): stacks one-dimensional arrays as the columns of a two-dimensional array
# Column combination
oned = np.arange(4)
print(oned)
twice_oned = 2 * oned
print(twice_oned)
np.column_stack((oned, twice_oned))
"""
Output:
[0 1 2 3]
[0 2 4 6]
array([[0, 0],
       [1, 2],
       [2, 4],
       [3, 6]])
"""
Row combination:
np.row_stack(): stacks the arrays as rows (an alias of np.vstack, which newer NumPy versions recommend using instead)
# Row combination
np.row_stack((oned, twice_oned))
"""
Output:
array([[0, 1, 2, 3],
       [0, 2, 4, 6]])
"""
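The stacking helpers in this section can be summarized as concatenation along a particular axis. The following sketch checks those equivalences. Using np.stack for the dstack comparison is a choice made here, not something from the original text:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = 2 * a

# hstack and vstack correspond to concatenating along axis 1 and axis 0
print(np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1)))  # True
print(np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0)))  # True

# dstack adds a third axis, which np.stack(..., axis=2) also does for 2-D inputs
print(np.array_equal(np.dstack((a, b)), np.stack((a, b), axis=2)))        # True

print(np.hstack((a, b)).shape, np.vstack((a, b)).shape, np.dstack((a, b)).shape)
# (3, 6) (6, 3) (3, 3, 2)

# column_stack turns 1-D inputs into the columns of a 2-D array
oned = np.arange(4)
print(np.column_stack((oned, 2 * oned)).shape)   # (4, 2)
```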
16. Splitting arrays
Horizontal split: np.hsplit()
# Splitting an array
print("a:", a)
np.hsplit(a, 3)  # horizontal split
np.split(a, 3, axis=1)  # equivalent to the line above
"""
Output:
a: [[0 1 2]
 [3 4 5]
 [6 7 8]]
Arrays after splitting:
[array([[0],[3],[6]]),
 array([[1],[4],[7]]),
 array([[2],[5],[8]])]
"""
Vertical split: np.vsplit()
# Vertical split
np.vsplit(a, 3)
np.split(a, 3, axis=0)  # equivalent to the line above
"""
Output:
[array([[0, 1, 2]]), array([[3, 4, 5]]), array([[6, 7, 8]])]
"""
Depth split: np.dsplit()
# Depth split
c = np.arange(27).reshape(3, 3, 3)
print(c)
np.dsplit(c, 3)
"""
Output:
[[[ 0  1  2] [ 3  4  5] [ 6  7  8]]
 [[ 9 10 11] [12 13 14] [15 16 17]]
 [[18 19 20] [21 22 23] [24 25 26]]]
[array([[[ 0], [ 3], [ 6]],
        [[ 9], [12], [15]],
        [[18], [21], [24]]]),
 array([[[ 1], [ 4], [ 7]],
        [[10], [13], [16]],
        [[19], [22], [25]]]),
 array([[[ 2], [ 5], [ 8]],
        [[11], [14], [17]],
        [[20], [23], [26]]])]
"""
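Splitting is the inverse of the stacking functions, which a short round-trip sketch can confirm. The index-list form of hsplit at the end is an extra illustration that the original text does not cover:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

cols = np.hsplit(a, 3)     # three (3, 1) column blocks
rows = np.vsplit(a, 3)     # three (1, 3) row blocks
print(cols[0].shape, rows[0].shape)          # (3, 1) (1, 3)

# Stacking the pieces back together restores the original array
print(np.array_equal(np.hstack(cols), a))    # True
print(np.array_equal(np.vstack(rows), a))    # True

# A list of indices splits at those positions instead of into equal parts
left, right = np.hsplit(a, [1])
print(left.shape, right.shape)               # (3, 1) (3, 2)
```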
17. Operating on complex numbers
# Operations on complex numbers
b1 = np.array([1.j + 1, 2.j + 3])
print(b1)
re = b1.real  # the real parts of the complex array
ig = b1.imag  # the imaginary parts of the complex array
print("re:", re)
print("ig:", ig)
"""
Output:
[1.+1.j 3.+2.j]
re: [1. 3.]
ig: [1. 2.]
"""
18. Convert array to list
# Converting arrays
# Convert an array to a list
b1.tolist()
"""
Output:
[(1+1j), (3+2j)]
"""
Those are the numpy functions we use most often. I have only had time to organize these; there are many more that I have not covered, which you can look up in the numpy API manual. I will publish more articles on numpy, pandas, matplotlib, and seaborn later.
If you find any errors or omissions in this article, please leave a comment to point them out. Thank you!