[Data Analysis Encyclopedia] Python-based Data Analysis Encyclopedia - Numpy Basics


NumPy is short for Numerical Python. It is a third-party extension package for Python, used mainly to compute with and process one-dimensional and multi-dimensional arrays.

I. Introduction

  It's August. In July, because of a project's needs, I taught myself some deep learning. I have now built the neural network framework the project requires, normalized its inputs and outputs, added the simulation error, and finished the parameter extraction and data fitting for the images. Time to start a new series.
  This series ("Python-Based Data Analysis Encyclopedia") is written from reading DataCamp's "Learn Python for Data Science Interactively". It summarizes my data analysis experience based on my own understanding, along with ideas and solutions for the problems I ran into.
  This article covers the basics of NumPy, and everything is implemented in Python. All of the code can be run and debugged on your own machine. If you find any problems, please send me a private message and attach the error message.


II. NumPy

  NumPy is the core library for scientific computing and data analysis in Python. It provides a high-performance multi-dimensional array object and tools for working with arrays. We can import the NumPy library using the following statement:

import numpy as np

1. NumPy array



2. Create an array

>>> a = np.array([1,2,3])
>>> b = np.array([(1.5,2,3), (4,5,6)], dtype = float)
>>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]],
...              dtype = float)

Initial placeholders

  The following code creates arrays filled with placeholder values: an array of zeros, an array of ones, an array of evenly spaced values (by step), an array of evenly spaced values (by number of samples), a constant array, a 2x2 identity matrix, an array of random values, and an empty array.

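>>> np.zeros((3,4)) # array of zeros
>>> np.ones((2,3,4),dtype=np.int16) # array of ones
>>> d = np.arange(10,25,5) # evenly spaced values (step of 5, stop value excluded)
>>> np.linspace(0,2,9) # evenly spaced values (9 samples, endpoints included)
>>> e = np.full((2,2),7) # constant array
>>> f = np.eye(2) # 2x2 identity matrix
>>> np.random.random((2,2)) # array of random values in [0, 1)
>>> np.empty((3,2)) # uninitialized array (its contents are arbitrary)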
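
  The two "evenly spaced" constructors are easy to mix up, so here is a minimal sketch of the difference, reusing the same arguments as above. The variable names stepped and sampled are only for illustration.

```python
import numpy as np

# arange takes a step size; the stop value (25) is not included.
stepped = np.arange(10, 25, 5)

# linspace takes a sample count; both endpoints are included by default.
sampled = np.linspace(0, 2, 9)

print(stepped)  # [10 15 20]
print(sampled)  # 9 values from 0.0 to 2.0, spaced 0.25 apart
```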

3. Input and output

3.1 Saving and loading text files

>>> np.loadtxt("myfile.txt")
>>> np.genfromtxt("my_file.csv", delimiter=',')
>>> np.savetxt("myarray.txt", a, delimiter=" ")

3.2 Saving and loading files on disk

>>> np.save('my_array', a)
>>> np.savez('array.npz', a, b)
>>> np.load('my_array.npy')
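
  As a rough illustration of how these calls fit together, the sketch below writes the arrays a and b to a temporary directory and reads them back. The temporary directory and the variable names (from_txt, from_npy, first, second) are assumptions made for this example, not part of the original cheat sheet.

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

with tempfile.TemporaryDirectory() as tmp:
    # Text round trip: loadtxt returns float64 by default.
    txt_path = os.path.join(tmp, "myarray.txt")
    np.savetxt(txt_path, a, delimiter=" ")
    from_txt = np.loadtxt(txt_path)

    # Binary round trip: np.save appends ".npy" when the name lacks it.
    np.save(os.path.join(tmp, "my_array"), a)
    from_npy = np.load(os.path.join(tmp, "my_array.npy"))

    # np.savez stores positional arrays under the keys arr_0, arr_1, ...
    npz_path = os.path.join(tmp, "array.npz")
    np.savez(npz_path, a, b)
    with np.load(npz_path) as archive:
        first, second = archive["arr_0"], archive["arr_1"]

print(from_txt.dtype, from_npy.dtype)
```

  The text format loses the integer dtype, while the .npy format keeps it.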

4. Data type

  The following lists common data types: signed 64-bit integers, standard double-precision floating-point numbers, complex numbers stored in 128 bits (two 64-bit floats), Boolean values (True and False), Python objects, fixed-length strings, and fixed-length Unicode.

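>>> np.int64 # signed 64-bit integer
>>> np.float64 # standard double-precision floating point (np.float32 is single precision)
>>> np.complex128 # complex number made of two 64-bit floats
>>> np.bool_ # Boolean: True and False
>>> np.object_ # Python object
>>> np.bytes_ # fixed-length byte string (np.string_ before NumPy 2.0)
>>> np.str_ # fixed-length Unicode string (np.unicode_ before NumPy 2.0)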
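
  One way to check what a type name really means is to ask NumPy for the size of one element. The sketch below is only an illustration, and the sizes dictionary is a name made up for this example.

```python
import numpy as np

# Map each dtype name to the number of bytes one element occupies.
sizes = {}
for dtype in (np.int64, np.float32, np.float64, np.complex128, np.bool_):
    info = np.dtype(dtype)
    sizes[info.name] = info.itemsize

print(sizes)
```

  A float32 element takes 4 bytes (single precision), while a float64 element takes 8 bytes (double precision).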

5. Array information

  The following code shows how to inspect an array: its shape (rows and columns), its length, its number of dimensions, its number of elements, its data type, the name of its data type, and how to convert it to another data type.

>>> a.shape # array shape (size along each dimension)
>>> len(a) # array length (size of the first dimension)
>>> b.ndim # number of dimensions
>>> e.size # number of elements
>>> b.dtype # data type
>>> b.dtype.name # name of the data type
>>> b.astype(int) # convert to another data type
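
  For reference, here is a small sketch of what these attributes return for the 2x3 array b created earlier. The name as_int is introduced only for this example.

```python
import numpy as np

b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

print(b.shape)  # (2, 3)
print(len(b))   # 2, the size of the first dimension only
print(b.ndim)   # 2
print(b.size)   # 6

# astype returns a new array; converting to int drops the fractional part.
as_int = b.astype(int)
print(as_int[0, 0], b[0, 0])  # 1 1.5
```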

6. Call help

>>> np.info(np.ndarray.dtype)

7. Array calculation

7.1 Arithmetic operations

  The following covers arithmetic operations: subtraction, addition, division, and multiplication (each in operator form and in function form), followed by the exponential, square root, sine, cosine, natural logarithm, and dot product.

# subtraction
>>> g = a - b
>>> g
array([[-0.5, 0. , 0. ],[-3. , -3. , -3. ]])

# subtraction (function form)
>>> np.subtract(a,b)

# addition
>>> b + a
array([[ 2.5, 4. , 6. ],[ 5. , 7. , 9. ]])
>>> np.add(b,a) # addition (function form)

# division
>>> a / b
array([[ 0.66666667, 1. , 1. ],[ 0.25 , 0.4 , 0.5 ]])
>>> np.divide(a,b) # division (function form)

# multiplication
>>> a * b
array([[ 1.5, 4. , 9. ],[ 4. , 10. , 18. ]])
>>> np.multiply(a,b) # multiplication (function form)

# exponential (e raised to each element)
>>> np.exp(b)

# square root
>>> np.sqrt(b)

# sine
>>> np.sin(a)

# cosine
>>> np.cos(b)

# natural logarithm
>>> np.log(a)

# dot product
>>> e.dot(f)
array([[ 7., 7.],[ 7., 7.]])
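
  Two points in the examples above are worth spelling out: a has shape (3,) while b has shape (2, 3), so NumPy broadcasts a across each row of b; and * multiplies element by element, which is not the same as the dot product. The sketch below illustrates both, with variable names (same, elementwise, matrix) chosen only for this example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)
e = np.full((2, 2), 7)
f = np.eye(2)

# a (shape (3,)) is reused for every row of b (shape (2, 3)).
g = a - b
same = np.subtract(a, b)  # function form, same result as the operator

# Element-wise product keeps only the diagonal of e, because f is the identity.
elementwise = e * f

# The dot product with the identity matrix returns e's values unchanged.
matrix = e.dot(f)

print(g.shape)      # (2, 3)
print(elementwise)  # [[7. 0.] [0. 7.]]
print(matrix)       # [[7. 7.] [7. 7.]]
```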

7.2 Comparison

  The following covers comparison: element-wise comparison between arrays, element-wise comparison against a scalar, and comparison of whole arrays.

# element-wise comparison between arrays
>>> a == b
array([[False, True, True],[False, False, False]])

# element-wise comparison against a scalar
>>> a < 2
array([True, False, False])

# whole-array comparison (True only if shape and all elements match)
>>> np.array_equal(a, b)

7.3 Aggregate functions

  The following covers aggregate functions: the sum of an array, its minimum, its maximum along an axis, the cumulative sum of its elements, the mean, the median, the correlation coefficient, and the standard deviation.

>>> a.sum() # sum of all elements
>>> a.min() # minimum value
>>> b.max(axis=0) # maximum along axis 0 (one value per column)
>>> b.cumsum(axis=1) # cumulative sum along axis 1 (within each row)
>>> a.mean() # mean
>>> np.median(b) # median (ndarray has no median() method)
>>> np.corrcoef(a) # correlation coefficient (ndarray has no corrcoef() method)
>>> np.std(b) # standard deviation
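
  The axis argument is the part that most often causes confusion, so here is a small sketch on the array b. The names col_max, row_max, running, and middle are only for illustration.

```python
import numpy as np

b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

# axis=0 collapses the rows, leaving one maximum per column.
col_max = b.max(axis=0)

# axis=1 collapses the columns, leaving one maximum per row.
row_max = b.max(axis=1)

# cumsum along axis 1 accumulates from left to right within each row.
running = b.cumsum(axis=1)

# median is a module-level function, not an ndarray method.
middle = np.median(b)

print(col_max)  # [4. 5. 6.]
print(row_max)  # [3. 6.]
print(running)  # [[ 1.5  3.5  6.5] [ 4.   9.  15. ]]
print(middle)   # 3.5
```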

8. Array copying

  The following covers copying arrays: creating a view that shares the same data, creating a copy of an array, and creating a deep copy of an array.

>>> h = a.view() # create a view of the array that shares the same data
>>> np.copy(a) # create a copy of the array
>>> h = a.copy() # create a deep copy of the array
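
  The practical difference between a view and a copy shows up when you write to one of them. A minimal sketch, using the hypothetical names shared and independent:

```python
import numpy as np

a = np.array([1, 2, 3])

shared = a.view()       # new array object, same underlying data
independent = a.copy()  # new array object with its own data

# Writing through the view changes the original array as well.
shared[0] = 99

print(a)            # [99  2  3]
print(independent)  # [1 2 3]
```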

9. Array sorting

  The following covers sorting: sorting an array, and sorting an array along a given axis. Both calls sort in place.

>>> a.sort() # sort the array in place
>>> c.sort(axis=0) # sort the array in place along axis 0
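
  Because the sort method modifies the array in place, later examples that use c see the sorted values. If the original order matters, np.sort returns a sorted copy instead. A short sketch, where original and sorted_copy are names assumed for this example:

```python
import numpy as np

c = np.array([[(1.5, 2, 3), (4, 5, 6)], [(3, 2, 1), (4, 5, 6)]], dtype=float)
original = c.copy()

# np.sort returns a new array and leaves c untouched.
sorted_copy = np.sort(c, axis=0)
unchanged = np.array_equal(c, original)

# The method form sorts c itself and returns None.
result = c.sort(axis=0)

print(unchanged)  # True
print(result)     # None
print(c[0, 0])    # [1.5 2.  1. ]
```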

10. Subsetting, slicing, and indexing

10.1 Subsets

  Select the element at index 2:

>>> a[2]
3

  Select the element at row 1, column 2 (equivalent to b[1][2]):

>>> b[1,2]
6.0

10.2 Slicing

  Select the elements at indexes 0 and 1:

>>> a[0:2]
array([1, 2])

  Select the elements in rows 0 and 1 of column 1:

>>> b[0:2,1]
array([ 2., 5.])

  Select all elements in row 0 (equivalent to b[0:1, :]):

>>> b[:1]
array([[1.5, 2., 3.]])

  Equivalent to c[1,:,:] (the output shown is for c as created in section 2, before the in-place sort in section 9):

>>> c[1,...]
array([[ 3., 2., 1.],[ 4., 5., 6.]])

  Reverse the array a:

>>> a[ : :-1]
array([3, 2, 1])

10.3 Conditional indexes

  Select all elements of array a that are less than 2:

>>> a[a<2]
array([1])

10.4 Fancy indexing

  Select the elements at (1,0), (0,1), (1,2) and (0,0):

>>> b[[1, 0, 1, 0],[0, 1, 2, 0]]
array([ 4. , 2. , 6. , 1.5])

  Select a subset of rows and columns of a matrix:

>>> b[[1, 0, 1, 0]][:,[0,1,2,0]]
array([[4.,5.,6.,4.],[1.5,2.,3.,1.5],[4.,5.,6.,4.],[1.5,2.,3.,1.5]])
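
  The two fancy-indexing forms above behave quite differently: passing two index lists picks one element per (row, column) pair, while chaining the selections picks every combination of the listed rows and columns. The sketch below compares them. Note that np.ix_ does not appear in the original cheat sheet; it is included here as an alternative way to write the chained form.

```python
import numpy as np

b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)
rows = [1, 0, 1, 0]
cols = [0, 1, 2, 0]

# Paired indexing: one element for each (row, column) pair.
paired = b[rows, cols]

# Chained indexing: select the rows first, then the columns of that result.
chained = b[rows][:, cols]

# np.ix_ builds an open mesh so a single indexing step gives the same grid.
grid = b[np.ix_(rows, cols)]

print(paired.shape)   # (4,)
print(chained.shape)  # (4, 4)
```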

11. Array operations

11.1 Transposing an array

>>> i = np.transpose(b) # transpose the array
>>> i.T # transpose using the .T attribute

11.2 Changing the shape of an array

  The following code changes the shape of an array: flattening it, and reshaping it without changing its data.

>>> b.ravel() # flatten the array
>>> g.reshape(3,-1) # reshape without changing the data (-1 lets NumPy infer that dimension)
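
  A brief sketch of both operations on the array b. Passing -1 for one dimension asks NumPy to work out that size from the total number of elements. The names flat and reshaped are only for illustration.

```python
import numpy as np

b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

# ravel flattens the array in row-major order.
flat = b.ravel()

# 6 elements in 3 rows means the inferred second dimension is 2.
reshaped = b.reshape(3, -1)

print(flat)            # [1.5 2.  3.  4.  5.  6. ]
print(reshaped.shape)  # (3, 2)
```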

11.3 Adding or removing values

  The following code adds or removes values: resizing an array in place to shape (2,6), appending values, inserting values, and deleting values.

>>> h.resize((2,6)) # resize h in place to shape (2,6); the method returns None
>>> np.append(h,g) # append values (returns a new, flattened array)
>>> np.insert(a, 1, 5) # insert the value 5 before index 1 (returns a new array)
>>> np.delete(a,[1]) # delete the element at index 1 (returns a new array)
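
  Of these four calls, only the resize method changes an array in place; the others return new arrays. The sketch below shows the results on small inputs. Passing refcheck=False is an assumption made so the example runs in any environment; it is safe here because nothing else refers to h's data.

```python
import numpy as np

a = np.array([1, 2, 3])

# insert and delete return new arrays; a itself is not modified.
inserted = np.insert(a, 1, 5)
deleted = np.delete(a, [1])

# append flattens both inputs unless an axis is given.
appended = np.append(a, [[4, 5], [6, 7]])

# ndarray.resize works in place and pads the new elements with zeros.
h = a.copy()
h.resize((2, 6), refcheck=False)

print(inserted)  # [1 5 2 3]
print(deleted)   # [1 3]
print(appended)  # [1 2 3 4 5 6 7]
print(h.shape)   # (2, 6)
```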

11.4 Merging Arrays

  The following covers merging arrays: concatenating arrays, stacking arrays vertically (row-wise) in two ways, stacking arrays horizontally (column-wise), and stacking arrays as columns in two ways.

# concatenate arrays
>>> np.concatenate((a,d),axis=0)
array([ 1, 2, 3, 10, 15, 20])

# stack arrays vertically (row-wise)
>>> np.vstack((a,b))
array([[ 1. , 2. , 3. ],[ 1.5, 2. , 3. ],[ 4. , 5. , 6. ]])

# stack arrays vertically (row-wise), shorthand form
>>> np.r_[e,f]

# stack arrays horizontally (column-wise)
>>> np.hstack((e,f))
array([[ 7., 7., 1., 0.],[ 7., 7., 0., 1.]])

# stack 1-D arrays as the columns of a 2-D array
>>> np.column_stack((a,d))
array([[ 1, 10],[ 2, 15],[ 3, 20]])

# stack 1-D arrays as columns, shorthand form
>>> np.c_[a,d]
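
  The np.r_ and np.c_ forms have no output listed above. For these particular inputs they give the same results as np.vstack and np.column_stack, as the following sketch checks. The names rows and cols are only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
d = np.arange(10, 25, 5)
e = np.full((2, 2), 7)
f = np.eye(2)

# r_ joins along the first axis, like vstack for these 2-D inputs.
rows = np.r_[e, f]

# c_ places 1-D inputs side by side as columns, like column_stack.
cols = np.c_[a, d]

print(rows.shape)  # (4, 2)
print(cols.shape)  # (3, 2)
```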

11.5 Splitting Arrays

  The following covers splitting arrays: splitting an array horizontally into 3 equal parts, and splitting an array vertically into 2 equal parts.

# split the array horizontally into 3 equal parts
>>> np.hsplit(a,3)
[array([1]),array([2]),array([3])]

# split the array vertically (along axis 0) into 2 equal parts
# (the output reflects c after the in-place sort in section 9)
>>> np.vsplit(c,2)
[array([[[ 1.5, 2. , 1. ],[ 4. , 5. , 6. ]]]),
array([[[ 3., 2., 3.],[ 4., 5., 6.]]])]
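
  Splitting is the inverse of stacking, which gives a simple way to check the results. The sketch below uses c as originally created (without the in-place sort); the names parts, halves, and restored are assumptions for this example.

```python
import numpy as np

a = np.array([1, 2, 3])
c = np.array([[(1.5, 2, 3), (4, 5, 6)], [(3, 2, 1), (4, 5, 6)]], dtype=float)

# hsplit on a 1-D array cuts it into equal pieces along its only axis.
parts = np.hsplit(a, 3)

# vsplit cuts along axis 0, so each half keeps a leading dimension of 1.
halves = np.vsplit(c, 2)

# Stacking the halves vertically rebuilds the original array.
restored = np.vstack(halves)

print([p.tolist() for p in parts])  # [[1], [2], [3]]
print(halves[0].shape)              # (1, 2, 3)
```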


Origin blog.csdn.net/m0_65748531/article/details/132102685