Numpy basics and examples - artificial intelligence basics

Article directory

1. Overview of Numpy
2. Basics of numpy

1. Overview of Numpy

1. Advantages

Numpy (Numerical Python) supplements the numerical computing capabilities that the Python language lacks;
Numpy is the underlying library for many other data analysis and machine learning libraries;
The core of Numpy is implemented largely in C and is optimized for execution efficiency (Python itself was conceived in 1989 and first released in 1991);
Numpy is open source and free.

2. numpy history

1995: Numeric, a numerical computing extension to the Python language;
2001: Scipy->Numarray, multi-dimensional array operations;
2005: Numeric+Numarray->Numpy;
2006: Numpy separated from Scipy and became an independent project.

3. The core of Numpy: multidimensional arrays

Code simplicity: reduce loops in Python code
Underlying implementation: thick kernel (C) + thin interface (Python) to ensure performance.

4. ndarray object in memory

4.1 Metadata

Store the description information of the target array, such as: ndim, shape, dtype, data, etc.

4.2 Actual data

The complete array data.
Actual data and metadata are stored separately. This makes memory use more efficient, and it also improves performance because operations that only need the description of the array (for example, reading or changing its shape) do not have to access the actual data.
Characteristics of ndarray array objects

Numpy arrays are homogeneous arrays, i.e. all elements must be of the same data type
The subscript of the Numpy array starts from 0, and the subscript of the last element is the array length minus 1
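
The points above can be checked with a short sketch. The variable names (`mixed`, `ary`, `first`, `last`) are illustrative only and do not come from the article.

```python
import numpy as np

# Mixed int and float input is converted to one common element type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Metadata that describes the array.
ary = np.array([10, 20, 30, 40])
print(ary.ndim, ary.shape, ary.dtype)

# Indexing starts at 0; the last element sits at index len(ary) - 1.
first = ary[0]
last = ary[len(ary) - 1]
print(first, last)  # 10 40
```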

2. Basics of numpy

1. ndarray array

import numpy as np

# Create an ndarray with np.array
ary = np.array([1, 2, 3, 4, 5])
print(ary)
print(type(ary))

# An operation between an array and a scalar is applied to every element
print(ary + 2)
print(ary * 2)
print(ary == 3)

# An operation between two arrays is applied at corresponding positions;
# arrays of different lengths cannot be combined
print(ary + ary)
print(ary * ary)

# Output:
# [1 2 3 4 5]
# <class 'numpy.ndarray'>
# [3 4 5 6 7]
# [ 2  4  6  8 10]
# [False False  True False False]
# [ 2  4  6  8 10]
# [ 1  4  9 16 25]

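An operation between an array and a single value is applied to each element separately.
An operation between two arrays is applied at corresponding positions.
Arrays of different lengths cannot be combined in this way.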
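
A minimal sketch of the last rule, assuming two one-dimensional arrays of lengths 3 and 2 (the names `a`, `b`, `c` and `failed` are illustrative). One exception that this article does not cover is broadcasting, where for example a length-1 array can be combined with a longer one.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20])

# Lengths 3 and 2 do not match, so NumPy raises ValueError.
try:
    a + b
    failed = False
except ValueError as err:
    failed = True
    print("cannot add:", err)

# With equal lengths the addition is applied position by position.
c = a + np.array([10, 20, 30])
print(c)  # [11 22 33]
```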

2. arange、zeros、ones、zeros_like

import numpy as np

ary = np.array([1, 2, 3 , 4, 5])
print(ary)

print(ary + ary)
print(ary * ary)
aryrange = np.arange(1,3)
print(aryrange)
aryrange = np.arange(1,3,0.1)
print(aryrange)

ary = np.zeros(10)  # create an array of zeros
print(ary)
ary = np.zeros(10, dtype='int64')  # set the data type
print(ary)

ary = np.zeros((2, 2))  # create a 2*2 matrix
print(ary)
print(ary.shape)

ary = np.array([1, 2, 3, 4, 5])  # take an existing array and build a zero-filled array of the same shape
print(np.zeros_like(ary))

# Output
# [1 2 3 4 5]
# [ 2  4  6  8 10]
# [ 1  4  9 16 25]
# [1 2]
# [1.  1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.  2.1 2.2 2.3 2.4 2.5 2.6 2.7
#  2.8 2.9]

# [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
# [0 0 0 0 0 0 0 0 0 0]

# [[0. 0.]
# [0. 0.]]

# (2, 2)

# [0 0 0 0 0]

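Python's range can only generate integers, whereas arange can also generate floating-point values.
zeros_like takes an existing array and returns an array of the same shape and type filled with 0.
ones_like is similar, but fills the array with 1.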
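
The heading of this section also lists ones. A short sketch of `np.ones` and `np.ones_like`, mirroring the zeros examples above (the variable names are illustrative):

```python
import numpy as np

ones = np.ones(5)                            # float64 by default
int_ones = np.ones((2, 3), dtype='int64')    # 2*3 matrix of integer ones

template = np.array([[1.5, 2.5], [3.5, 4.5]])
like = np.ones_like(template)                # same shape and dtype as template

print(ones)
print(int_ones)
print(like)
```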

3. Basic operations on ndarray object attributes

3.1 Modify array dimensions

import numpy as np

ary = np.arange(1, 9)
print(ary)

# Modify the shape of the original array in place
ary.shape = (2, 4)
print(ary)
print(ary.shape)

# Change it to 3-dimensional data
ary.shape = (2, 2, 2)
print(ary)
print(ary.shape)

# [1 2 3 4 5 6 7 8]

# [[1 2 3 4]
# [5 6 7 8]]
# (2, 4)

# [[[1 2]
#   [3 4]]
# 
#  [[5 6]
#   [7 8]]]
# (2, 2, 2)

You can change the shape of an array directly by assigning to its shape attribute.

3.2 Modify array element type

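import numpy as np

# int32 is set explicitly so that the result matches the output below
# (the output shown corresponds to 4-byte integers on a little-endian machine)
ary = np.arange(1, 9, dtype="int32")
ary.dtype = "float64"  # only changes how the bytes are interpreted; use astype to convert the data type
print(ary)

ary = np.arange(1, 9)
c = ary.astype(float)  # does not modify the original data; assign the result to a variable
print(c)

# Output
# [4.24399158e-314 8.48798317e-314 1.27319747e-313 1.69759663e-313]
# [1. 2. 3. 4. 5. 6. 7. 8.]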

Do not assign to dtype to change the element type of an array: this only changes how the existing bytes are interpreted and produces incorrect values.
Use astype() instead. It does not modify the original data, so assign the result to a new variable.

3.3 Size of array

import numpy as np

ary = np.arange(1, 9)
print(ary)

print(ary.shape)
print(ary.size)
print(len(ary))
ary.shape = (2, 4)
print(ary.shape)
print(ary.size)
print(len(ary))

# Output
# [1 2 3 4 5 6 7 8]
# (8,)
# 8
# 8
# (2, 4)
# 8
# 2

size is the total number of elements in the array, while len is the length of the first dimension. For a one-dimensional array len and size are the same, but they differ for two-dimensional and higher-dimensional arrays: for a two-dimensional array, len is the number of rows.
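
The same relationship holds in three dimensions. A small sketch, assuming an array of shape (2, 3, 4) chosen only for illustration:

```python
import numpy as np

ary = np.zeros((2, 3, 4))

print(ary.size)   # 24, the product of all dimensions
print(len(ary))   # 2, the length of the first dimension
```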

4. Array element index

import numpy as np

ary = np.arange(1, 9)
ary.shape = (2, 2, 2)
print(ary)

print(ary[0])  # first 2-D array inside the 3-D array
print(ary[0][0])   # first 1-D array inside that 2-D array
print(ary[0][0][0]) # first element of that 1-D array

print(ary[0,0,0]) # Numpy's own comma-separated index syntax

# Output

# [[[1 2]
#   [3 4]]
#
#  [[5 6]
#   [7 8]]]

# [[1 2]
#  [3 4]]

# [1 2]

# 1

# 1

5. Numpy internal basic data types

Figure: type character codes
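
Since the table itself is not reproduced here, the following sketch checks a few character codes against their full type names. The selection of codes is an illustrative sample, not the complete table from the article.

```python
import numpy as np

# Each short code on the left names the same type as the object on the right.
int8_ok = np.dtype('i1') == np.int8
int32_ok = np.dtype('i4') == np.int32
float64_ok = np.dtype('f8') == np.float64
bool_ok = np.dtype('?') == np.bool_
date_ok = np.dtype('M8[D]') == np.dtype('datetime64[D]')

# 'U2' is a Unicode string of at most 2 characters (4 bytes per character).
u2 = np.dtype('U2')
print(u2.kind, u2.itemsize)

print(int8_ok, int32_ok, float64_ok, bool_ok, date_ok)
```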

5.1 Application cases of basic data type abbreviations

import numpy as np

data = [('zs', [100, 90, 95], 18),
        ('ls', [100, 95, 93], 22),
        ('ww', [98, 98, 98], 20)]

print(data)

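# The records are ragged, so dtype=object is needed here;
# Numpy 1.24 and later raise ValueError without it
ary = np.array(data, dtype=object)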
print(ary)

ary = np.array(data, dtype='U2, 3int8, int8')
print(ary)

5.2 Convert the list to an array

Method 1: Specify dtype through string (not commonly used)

import numpy as np

data = [('zs', [100, 90, 95], 18),
        ('ls', [100, 95, 93], 22),
        ('ww', [98, 98, 98], 20)]

ary = np.array(data, dtype='U2, 3int8, int8')

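# Way 1: plain Python loop over the raw records
total = 0
for i in data:
    total = i[2] + total
print(total / 3)

# Way 2: mean of the third field ('f2') of the structured array
print(ary['f2'].mean())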

The above code uses 2 methods to find the average age

Method 2: Specify dtype through a list of tuples (not commonly used)

import numpy as np
import warnings
warnings.filterwarnings('ignore')

data = [('zs', [100, 90, 95], 18),
        ('ls', [100, 95, 93], 22),
        ('ww', [98, 98, 98], 20)]

ary = np.array(data, dtype=[('name', 'str', 2),
                            ('score', 'int32', 3),
                            ('age', 'int32')])
print(ary['age'].mean())
# Output
# 20.0

Method 3: Set dtype through fixed keys of dictionary

import numpy as np
import warnings
warnings.filterwarnings('ignore')

data = [('zs', [100, 90, 95], 18),
        ('ls', [100, 95, 93], 22),
        ('ww', [98, 98, 98], 20)]

ary = np.array(data, dtype={
    'names': ['name', 'score', 'age'],
    'formats': ['U2', '3int32', 'int32']})

print(ary['age'])
# Output
# [18 22 20]

5.3 datetime64

import numpy as np
import warnings
warnings.filterwarnings('ignore')

data = np.array(['2011', '2012-12-12', '2023-02-13 08:08:08'])

# Convert the strings to a date type (day precision)
pretty_data = data.astype("datetime64[D]")
print(pretty_data)

# Convert to integers
res = pretty_data.astype('int64')
print(res)  # number of days since January 1, 1970

# Convert the strings to a datetime type (second precision)
pretty_data = data.astype("datetime64[s]")
print(pretty_data)

# Convert to integers
res = pretty_data.astype('int64')
print(res)  # number of seconds since January 1, 1970

# Output

# ['2011-01-01' '2012-12-12' '2023-02-13']

# [14975 15686 19401]

# ['2011-01-01T00:00:00' '2012-12-12T00:00:00' '2023-02-13T08:08:08']

# [1293840000 1355270400 1676275688]

Numpy is strict about the format of date strings.
A string cannot be written in the form 2021-1-1, nor in the form 2021/01/01.
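
A small sketch of this strictness. The helper `parses` is a hypothetical name introduced only for this illustration; it reports whether `np.datetime64` accepts a string.

```python
import numpy as np

def parses(text):
    """Return True if np.datetime64 accepts the string, False otherwise."""
    try:
        np.datetime64(text)
        return True
    except ValueError:
        return False

# Zero-padded, dash-separated dates are accepted; the other two forms are not.
for text in ['2021-01-01', '2021-1-1', '2021/01/01']:
    print(text, parses(text))
```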

6. ndarray array dimension operation

Container: wine bottle. Element: wine.
Assignment: old wine in the old bottle
Shallow copy (view): old wine in a new bottle
Deep copy: new wine in a new bottle
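
The analogy can be sketched as follows, assuming that `view()` stands in for the shallow copy and `copy()` for the deep copy (the names `a` to `d` are illustrative):

```python
import numpy as np

a = np.arange(1, 5)

b = a            # assignment: same array object, same data
c = a.view()     # shallow copy: new array object, shared data
d = a.copy()     # deep copy: new array object, independent data

a[0] = 100
print(b[0], c[0], d[0])  # 100 100 1
```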

6.1 View dimension change (data sharing): reshape() and ravel()

import numpy as np
import warnings
warnings.filterwarnings('ignore')

ary = np.arange(1, 9)
print(ary)
# Dimension change that returns a view
bry = ary.reshape(2, 4)
print(bry)
print(ary)
ary[0] = 123
print("ary after modification:", ary)
print("bry:", bry)
# Output
# [1 2 3 4 5 6 7 8]
# [[1 2 3 4]
#  [5 6 7 8]]
# [1 2 3 4 5 6 7 8]
# ary after modification: [123   2   3   4   5   6   7   8]
# bry: [[123   2   3   4]
#  [  5   6   7   8]]

reshape() changes only the shape. When the original data is modified, the data seen through the reshaped array changes with it; this is what is called data sharing.
Although the reshaped array has a different shape, the shape of the original array is not affected.
ravel() flattens an array of any number of dimensions to 1 dimension.
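
The section title also names ravel(), which the example above does not show. A minimal sketch, assuming a contiguous 2*4 array so that ravel() can share data with the original:

```python
import numpy as np

ary = np.arange(1, 9).reshape(2, 4)

flat = ary.ravel()    # flattened to 1 dimension
flat[0] = 123         # write through the flattened array

print(flat.shape)     # (8,)
print(ary[0, 0])      # 123, the original sees the change
```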

6.2 Copy dimension change (data independent): flatten()

import numpy as np
import warnings
warnings.filterwarnings('ignore')

ary = np.arange(1, 9).reshape(2, 4)
print(ary)

bry = ary.flatten()
print(bry)

ary[0] = 666
print(ary)

print(bry)

# Output
# [[1 2 3 4]
#  [5 6 7 8]]

# [1 2 3 4 5 6 7 8]

# [[666 666 666 666]
#  [  5   6   7   8]]

# [1 2 3 4 5 6 7 8]

6.3 In-place dimensionality change: directly change the dimensions of the original array without returning a new array

import numpy as np
import warnings
warnings.filterwarnings('ignore')

ary = np.arange(1, 9)
ary.resize(2, 2, 2)
print(ary)


7. ndarray array slicing operation

7.1 One-dimensional array slicing

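Slice parameters for array objects work like the slice parameters of a list.
Positive step: by default the slice runs from head to tail.
Negative step: by default the slice runs from tail to head.
array[start:stop:step, ...]
Default step: 1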
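
The rules above can be sketched on a one-dimensional array. The array and the variable names are chosen only for illustration.

```python
import numpy as np

ary = np.arange(1, 10)       # [1 2 3 4 5 6 7 8 9]

head = ary[:3]               # [1 2 3]
middle = ary[3:6]            # [4 5 6]
every_other = ary[::2]       # [1 3 5 7 9], step of 2
backwards = ary[::-1]        # [9 8 7 6 5 4 3 2 1], negative step runs tail to head
tail_back = ary[-4:-7:-1]    # [6 5 4]

print(head, middle, every_other, backwards, tail_back)
```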


7.2 Multidimensional array slicing

import numpy as np
import warnings
warnings.filterwarnings('ignore')

ary = np.arange(1, 9)
ary.resize(3,3)
print(ary)
print(ary[:2])  # first two rows
print(ary[:2, :2])  # first two columns of the first two rows
print(ary[::2, ::2])  # rows 1 and 3, columns 1 and 3

# Output
# [[1 2 3]
#  [4 5 6]
#  [7 8 0]]
# [[1 2 3]
#  [4 5 6]]
# [[1 2]
#  [4 5]]
# [[1 3]
#  [7 0]]

import numpy as np
import warnings
warnings.filterwarnings('ignore')

ary =np.arange(1, 101).reshape(20, 5)
print(ary)
# All rows, without the last column
print("All rows, without the last column")
print(ary[:, :-1])
# All rows, only the last column
print(ary[:, -1])

7.3 Mask operation of ndarray array

Boolean mask

import numpy as np
import warnings
warnings.filterwarnings('ignore')

ary = np.arange(1, 10)
mask = [True, False, True, True, False, True, True, True, False]
res = ary[mask]
print(res)

# Output
# [1 3 4 6 7 8]

Boolean mask operation example: find numbers that are multiples of 3 within 100

import numpy as np
import warnings
warnings.filterwarnings('ignore')

ary = np.arange(1, 101)
print(ary[ary % 3 == 0])

Index mask: the mask array holds the index values of the elements to select

import numpy as np
import warnings
warnings.filterwarnings('ignore')

car = np.array(['bwm', 'benzi', 'audi', 'hongqi'])
mask = [0, 2, 1, 3]
res = car[mask]
print(res)
mask = [0, 0, 0, 0, 0, 2, 1, 1, 1, 1, 1, 1, 3]
res = car[mask]
print(res)


# Output
# ['bwm' 'audi' 'benzi' 'hongqi']
# ['bwm' 'bwm' 'bwm' 'bwm' 'bwm' 'audi' 'benzi' 'benzi' 'benzi' 'benzi' 'benzi' 'benzi' 'hongqi']

7.4 Combination and splitting of multidimensional arrays

stack and split

Vertical direction vstack vsplit
Horizontal direction hstack hsplit
Depth direction dstack dsplit

import numpy as np

ary = np.arange(1, 7).reshape(2, 3)
bry = np.arange(7, 13).reshape(2, 3)
res = np.dstack((ary, bry))
print(ary)
print(bry)
print(res)
print(res.shape)

print("-"*30)

ary, bry = np.dsplit(res, 2)
print(ary)
print(bry)

# Output
[[1 2 3]
 [4 5 6]]
[[ 7  8  9]
 [10 11 12]]
[[[ 1  7]
  [ 2  8]
  [ 3  9]]

 [[ 4 10]
  [ 5 11]
  [ 6 12]]]
(2, 3, 2)
------------------------------
[[[1]
  [2]
  [3]]

 [[4]
  [5]
  [6]]]
[[[ 7]
  [ 8]
  [ 9]]

 [[10]
  [11]
  [12]]]

A three-dimensional array is still three-dimensional after splitting. If you want two-dimensional arrays, you have to change the dimensions (reshape) afterwards.
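
A short sketch of that last step, assuming the same arrays as in the example above and using reshape to drop the trailing axis (the names `part_a`, `part_b`, `flat_a`, `flat_b` are illustrative):

```python
import numpy as np

ary = np.arange(1, 7).reshape(2, 3)
bry = np.arange(7, 13).reshape(2, 3)
res = np.dstack((ary, bry))            # shape (2, 3, 2)

part_a, part_b = np.dsplit(res, 2)     # each piece has shape (2, 3, 1)
flat_a = part_a.reshape(2, 3)          # back to two dimensions
flat_b = part_b.reshape(2, 3)

print(flat_a)
print(flat_b)
```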

concatenate

If the arrays to be combined are all two-dimensional arrays:
0: combine in the vertical direction
1: combine in the horizontal direction
If the arrays to be combined are all three-dimensional arrays:
0: combine in the vertical direction
1: combine in the horizontal direction
2: combine in the depth direction
np.concatenate((a, b), axis=0) combines the arrays along the given axis. np.split(c, 2, axis=0) splits the given array into the given number of parts along a given direction; its axis values have the same meaning as for concatenate.
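
A minimal sketch of both calls on two-dimensional arrays. The arrays `a` and `b` and the result names are illustrative.

```python
import numpy as np

a = np.arange(1, 7).reshape(2, 3)
b = np.arange(7, 13).reshape(2, 3)

v = np.concatenate((a, b), axis=0)    # vertical direction: shape (4, 3)
h = np.concatenate((a, b), axis=1)    # horizontal direction: shape (2, 6)

# Splitting along the same axis recovers the original pieces.
top, bottom = np.split(v, 2, axis=0)
left, right = np.split(h, 2, axis=1)

print(v.shape, h.shape)
```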

Array combinations of different lengths

Fill first, then combine
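
One possible way to do this is sketched below with `np.pad`. The article does not say which function or fill value to use, so both `np.pad` and the fill value -1 are assumptions made for this illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 2, 3, 4])

# Fill: append one element to b so that it is as long as a.
# The fill value -1 is an arbitrary placeholder.
b_padded = np.pad(b, pad_width=(0, len(a) - len(b)),
                  mode='constant', constant_values=-1)

# Combine: the arrays now have the same length and can be stacked.
c = np.vstack((a, b_padded))
print(c)
```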

Simple one-dimensional array combination scheme

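import numpy as np

a = np.arange(1, 9)
b = np.arange(9, 17)

# Stack the two arrays on top of each other as two rows
# (np.vstack is used here; np.row_stack is an alias that Numpy 2.0 deprecates)
c = np.vstack((a, b))
print(c)

# Combine the two arrays side by side as two columns
d = np.column_stack((a, b))
print(d)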

Other properties of ndarray class

shape - dimensions of the array (length along each axis)
dtype - element type
size - number of elements
ndim - number of dimensions, len(shape)
itemsize - number of bytes per element
nbytes - total number of bytes = size x itemsize
real - array of the real parts of a complex array
imag - array of the imaginary parts of a complex array
T - transposed view of the array object
flat - flat iterator over the array
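
These attributes can be inspected on a small complex array. The array below is an illustrative example, not data from the article.

```python
import numpy as np

ary = np.array([[1 + 1j, 2 + 4j, 3 + 7j],
                [4 + 2j, 5 + 5j, 6 + 8j]])

print(ary.shape)      # (2, 3)
print(ary.dtype)      # complex128
print(ary.size)       # 6
print(ary.ndim)       # 2
print(ary.itemsize)   # 16
print(ary.nbytes)     # 96
print(ary.real)       # real parts
print(ary.imag)       # imaginary parts
print(ary.T.shape)    # (3, 2)

# flat iterates over all elements in row-major order.
flat_values = [v for v in ary.flat]
print(flat_values)
```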