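NumPy Basics: Task01

Constants

numpy.nan

Represents NaN ("not a number"), the floating-point value commonly used to mark missing data.

numpy.inf

Represents positive infinity.

numpy.pi

Represents the mathematical constant pi.

numpy.e

Represents Euler's number e, the base of the natural logarithm.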
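
The constants above behave like ordinary floats, with one common pitfall around NaN. The following is a minimal sketch of that behaviour; the variable names are illustrative and not part of the original notes.

```python
import numpy as np

# NaN never compares equal to anything, including itself.
nan_equals_itself = (np.nan == np.nan)

# Use np.isnan() to detect missing values instead of ==.
values = np.array([1.0, np.nan, 3.0])
nan_mask = np.isnan(values)

# Positive infinity is larger than any finite float.
inf_is_largest = np.inf > 1e308

print(nan_equals_itself)   # False
print(nan_mask)            # [False  True False]
print(inf_is_largest)      # True
print(np.pi, np.e)
```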

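Dates, Times, and Time Deltas

datetime64

In numpy, a string can easily be converted to the date-time type datetime64 (the name datetime was already taken by the date-time module in Python's standard library).

When a datetime64 is created from a string, numpy by default chooses the unit automatically from the string.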

import numpy as np

a = np.datetime64('2020-03-01')
print(a,a.dtype)

b = np.datetime64('2020-03')
print(b,b.dtype)

2020-03-01 datetime64[D]
2020-03 datetime64[M]

When a datetime64 is created from a string, a specific unit can be forced.

import numpy as np

a = np.datetime64('2020-03','D')
print(a,a.dtype)

b = np.datetime64('2020-03','Y')
print(b,b.dtype)

2020-03-01 datetime64[D]
2020 datetime64[Y]

print(np.datetime64('2020-03') == np.datetime64('2020-03-01'))
print(np.datetime64('2020-03') == np.datetime64('2020-03-02'))

True
False

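The example above shows that 2020-03 and 2020-03-01 represent the same point in time. In fact, two datetime64 objects with different units may still represent the same moment, and converting from a larger unit (such as months) to a smaller unit (such as days) is safe.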
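
To make the direction of a "safe" conversion concrete, here is a small sketch using astype() on datetime64 scalars. The dates and variable names are chosen for illustration only.

```python
import numpy as np

month = np.datetime64('2020-03')     # unit: M
day = np.datetime64('2020-03-15')    # unit: D

# Larger unit -> smaller unit: lands on the first day of the month.
month_as_day = month.astype('datetime64[D]')

# Smaller unit -> larger unit: the day of the month is dropped.
day_as_month = day.astype('datetime64[M]')

print(month_as_day)   # 2020-03-01
print(day_as_month)   # 2020-03
```

Going from days back to months discards information, which is why only the larger-to-smaller direction is described as safe.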

When a datetime64 array is created from strings with mixed units, every element is converted to the smallest unit among them.

import numpy as np

a = np.array(['2020-03','2020-03-09','2020-03-08 20:00:00'],dtype = np.datetime64)
print(a,a.dtype)

['2020-03-01T00:00:00' '2020-03-09T00:00:00' '2020-03-08T20:00:00'] datetime64[s]

Use arange() to create a datetime64 array that represents a range of dates.

import numpy as np

a = np.arange('2020-08-01','2020-08-10',dtype = np.datetime64)
print(a)

['2020-08-01' '2020-08-02' '2020-08-03' '2020-08-04' '2020-08-05'
'2020-08-06' '2020-08-07' '2020-08-08' '2020-08-09']
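
The same idea extends to other units and step sizes. The sketch below is an illustration rather than part of the original notes: it builds a monthly range by putting the unit in the dtype, and a weekly range by passing a timedelta64 step.

```python
import numpy as np

# One entry per month: the unit is given in the dtype.
months = np.arange('2020-01', '2020-06', dtype='datetime64[M]')

# Every seventh day, using an explicit timedelta64 step.
weekly = np.arange(np.datetime64('2020-08-01'),
                   np.datetime64('2020-08-31'),
                   np.timedelta64(7, 'D'))

print(months)
print(weekly)
```

As with integer ranges, the stop value is excluded.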

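datetime64 and timedelta64 Arithmetic

timedelta64 represents the difference between two datetime64 values. A timedelta64 also carries a unit, which matches the smaller of the units of the two datetime64 operands in the subtraction.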

import numpy as np

a = np.datetime64('2020-02-08') - np.datetime64('2020-02-07')
print(a,a.dtype)

b = np.datetime64('2020-03-08') - np.datetime64('2020-03-07 08:30')
print(b,b.dtype)

c = np.datetime64('2020-03') + np.timedelta64(20, 'D')
print(c,c.dtype)

1 days timedelta64[D]
930 minutes timedelta64[m]
2020-03-21 datetime64[D]

timedelta64 Arithmetic

import numpy as np 

a = np.timedelta64(1,'Y')
b = np.timedelta64(6,'M')
print(a+b)

18 months
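
Years and months combine with each other, as above, but they do not convert to days, because the number of days in a year or month varies. The following sketch illustrates this; the variable names are illustrative.

```python
import numpy as np

one_year = np.timedelta64(1, 'Y')

# Years convert cleanly to months...
as_months = np.timedelta64(one_year, 'M')
print(as_months)   # 12 months

# ...but not to days: NumPy raises TypeError for this cast.
try:
    np.timedelta64(one_year, 'D')
    converted_to_days = True
except TypeError:
    converted_to_days = False
print(converted_to_days)   # False
```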

Converting between numpy.datetime64 and datetime.datetime

import numpy as np
import datetime

dt = datetime.datetime(year=2020, month=6, day=1, hour=20, minute=5, second=30)
dt64 = np.datetime64(dt, 's')
print(dt64, dt64.dtype)

dt2 = dt64.astype(datetime.datetime)
print(dt2, type(dt2))

2020-06-01T20:05:30 datetime64[s]
2020-06-01 20:05:30 <class 'datetime.datetime'>

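Applications of datetime64

To allow date-times to be used in contexts where only certain days of the week are valid, NumPy includes a set of "busday" (business day) functions.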

numpy.busday_offset(dates, offsets, roll='raise', weekmask='1111100', holidays=None, busdaycal=None, out=None)
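
A short sketch of how the busday functions behave with the default weekmask (Monday to Friday). The dates are picked for illustration: 2020-07-10 is a Friday and 2020-07-11 is a Saturday.

```python
import numpy as np

# One business day after a Friday is the following Monday.
next_busday = np.busday_offset('2020-07-10', 1)

# A Saturday is not a valid business day, so it must first be rolled
# onto one; the default roll='raise' would raise an error instead.
rolled_forward = np.busday_offset('2020-07-11', 0, roll='forward')
rolled_backward = np.busday_offset('2020-07-11', 0, roll='backward')

is_saturday_busday = np.is_busday('2020-07-11')
# Business days from Monday 2020-07-06 up to, not including, 2020-07-13.
busdays_in_week = np.busday_count('2020-07-06', '2020-07-13')

print(next_busday)          # 2020-07-13
print(rolled_forward)       # 2020-07-13
print(rolled_backward)      # 2020-07-10
print(is_saturday_busday)   # False
print(busdays_in_week)      # 5
```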

Creating Arrays

Creating an ndarray from existing data

Creating with the array() function

import numpy as np
# Create a one-dimensional array
a = np.array([0,1,2,3,4])
print(a,type(a))

# Create a two-dimensional array
c = np.array([[11, 12, 13, 14, 15],
              [16, 17, 18, 19, 20],
              [21, 22, 23, 24, 25],
              [26, 27, 28, 29, 30],
              [31, 32, 33, 34, 35]])
print(c,type(c))

[0 1 2 3 4] <class 'numpy.ndarray'>
[[11 12 13 14 15]
[16 17 18 19 20]
[21 22 23 24 25]
[26 27 28 29 30]
[31 32 33 34 35]] <class 'numpy.ndarray'>

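Creating with the asarray() function

array() and asarray() can both convert structured data into an ndarray. The main difference shows up when the source is already an ndarray: array() still makes a copy that occupies new memory, whereas asarray() does not copy as long as the dtype is unchanged.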

import numpy as np

x = np.array([[1,1,1],[1,1,1],[1,1,1]])
y = np.array(x)
z = np.asarray(x)
w = np.asarray(x,dtype=int)  # np.int was removed in NumPy 1.24; use the builtin int
x[1][2] = 2
print(x,type(x),x.dtype)
print(y,type(y),y.dtype)
print(z,type(z),z.dtype)

[[1 1 1]
[1 1 2]
[1 1 1]] <class 'numpy.ndarray'> int32
[[1 1 1]
[1 1 1]
[1 1 1]] <class 'numpy.ndarray'> int32
[[1 1 1]
[1 1 2]
[1 1 1]] <class 'numpy.ndarray'> int32
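
The copy behaviour can also be checked directly rather than inferred from the printed values. This is a minimal sketch, assuming the input is already an ndarray; the variable names are illustrative.

```python
import numpy as np

x = np.ones((3, 3), dtype=np.int64)

copied = np.array(x)                          # a new array by default
same = np.asarray(x)                          # dtype unchanged: no copy
converted = np.asarray(x, dtype=np.float64)   # dtype changes: copy needed

print(copied is x)                   # False
print(same is x)                     # True
print(converted is x)                # False
print(np.shares_memory(same, x))     # True
print(np.shares_memory(copied, x))   # False
```

Because asarray() can return the very same object, writes through its result are visible in the source array, which is what the output above shows for z.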

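Creating arrays filled with ones and zeros

A common step in machine learning tasks is parameter initialization, which requires creating a matrix of fixed size filled with a constant or with random values.

Zero arrays

zeros() function: returns an array of zeros with the given shape and type.

zeros_like() function: returns an array of zeros with the same shape and type as the given array.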

import numpy as np 

x = np.zeros([2,3])
print(x)
print('++++++++++')
x = np.array([[1,2,3],[4,5,6]])
y = np.zeros_like(x)
print(y)

[[0. 0. 0.]
[0. 0. 0.]]
++++++++++
[[0 0 0]
[0 0 0]]
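
The two outputs above differ in dtype: zeros() defaults to float64, while zeros_like() inherits the dtype of its template. A small sketch of that difference, with illustrative names:

```python
import numpy as np

default_zeros = np.zeros((2, 3))               # dtype defaults to float64
int_zeros = np.zeros((2, 3), dtype=np.int64)   # dtype given explicitly

template = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int16)
like_zeros = np.zeros_like(template)           # inherits shape and dtype

print(default_zeros.dtype)   # float64
print(int_zeros.dtype)       # int64
print(like_zeros.dtype)      # int16
```

The same dtype rules apply to ones() and ones_like().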

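Ones arrays

ones() function: returns an array of ones with the given shape and type.

ones_like() function: returns an array of ones with the same shape and type as the given array.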

import numpy as np 

x = np.ones([2, 3])
print(x)
print('++++++++++')
x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.ones_like(x)
print(y)

[[1. 1. 1.]
[1. 1. 1.]]
++++++++++
[[1 1 1]
[1 1 1]]

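Empty arrays

empty() function: returns a new array of the given shape without initializing its entries, so the elements hold whatever values happened to be in memory (arbitrary values, not properly random numbers).

empty_like() function: returns a new uninitialized array with the same shape and type as the given array.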

import numpy as np 

x = np.empty([3, 2])
print(x)
print('++++++++++')
x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.empty_like(x)
print(y)

[[1. 0.]
[0. 1.]
[0. 0.]]
++++++++++
[[-369442848 621 0]
[ 0 131074 105]]
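
Since the contents of an empty array are arbitrary, a typical pattern is to allocate it and then overwrite every element before reading. The following is a sketch of that pattern, not a prescription:

```python
import numpy as np

buffer = np.empty((2, 3))   # contents are arbitrary at this point

# Overwrite every element before reading from the array.
buffer[:] = np.arange(3)    # broadcasts [0, 1, 2] into each row

print(buffer)
```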

Identity arrays

eye() function: returns a 2-D array with ones on the diagonal and zeros elsewhere.

identity() function: returns a square identity array.

import numpy as np

x = np.eye(3,2)
print(x)
print('++++++++++')
y = np.identity(4)
print(y)

[[1. 0.]
[0. 1.]
[0. 0.]]
++++++++++
[[1. 0. 0. 0.]
[0. 1. 0. 0.]
[0. 0. 1. 0.]
[0. 0. 0. 1.]]
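
Besides non-square shapes, eye() also accepts a diagonal offset k, which identity() does not. A brief sketch, with illustrative variable names:

```python
import numpy as np

upper = np.eye(3, k=1)    # ones on the first diagonal above the main one
lower = np.eye(3, k=-1)   # ones on the first diagonal below the main one

print(upper)
print(lower)

# For square shapes with k=0, eye() and identity() agree.
same_as_identity = np.array_equal(np.eye(4), np.identity(4))
print(same_as_identity)   # True
```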

Diagonal arrays

diag() function: extracts a diagonal or constructs a diagonal array.

import numpy as np

x = np.arange(9).reshape((3, 3))
print(x)
print(np.diag(x))
print('++++++++++')
v = [1, 3, 5, 7]
x = np.diag(v)
print(x)

[[0 1 2]
[3 4 5]
[6 7 8]]
[0 4 8]
++++++++++
[[1 0 0 0]
[0 3 0 0]
[0 0 5 0]
[0 0 0 7]]
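
diag() also takes an offset k that selects a diagonal above or below the main one. The sketch below reuses the 3x3 array from the example above; the other names are illustrative.

```python
import numpy as np

x = np.arange(9).reshape((3, 3))

above = np.diag(x, k=1)    # diagonal above the main one
below = np.diag(x, k=-1)   # diagonal below the main one

# Extract the main diagonal, then rebuild a diagonal matrix from it.
main_only = np.diag(np.diag(x))

print(above)       # [1 5]
print(below)       # [3 7]
print(main_only)
```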

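Constant arrays

full() function: returns an array filled with a constant value.

full_like() function: returns a constant array with the same shape and type as the given array.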

import numpy as np

x = np.full((2,), 7)
print(x)
print('++++++++++')
x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.full_like(x, 7)
print(y)

[7 7]
++++++++++
[[7 7 7]
[7 7 7]]
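
Because full_like() keeps the dtype of its template, a fractional fill value is cast to that dtype. The following sketch illustrates the effect and one way around it:

```python
import numpy as np

ints = np.array([[1, 2, 3], [4, 5, 6]])

truncated = np.full_like(ints, 7.5)               # keeps the integer dtype
as_float = np.full_like(ints, 7.5, dtype=float)   # dtype overridden

print(truncated)   # every element is 7
print(as_float)    # every element is 7.5
```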

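Creating an ndarray from a numeric range

arange() function: returns evenly spaced values within a given interval.

linspace() function: returns evenly spaced numbers over a specified interval.

logspace() function: returns numbers spaced evenly on a log scale.

numpy.random.rand(): returns an array of random numbers drawn from [0, 1). numpy.random.random(), used in the example below, draws from the same interval but takes the shape as a single argument.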

import numpy as np

x = np.arange(5)
print(x)
print('++++++++++')
x = np.arange(3,7,2)
print(x)
print('++++++++++')
x = np.linspace(start=0,stop=2,num=9)
print(x)
print('++++++++++')
x = np.logspace(0,1,5)
print(np.around(x,2))
print('++++++++++')
x = [10 ** i for i in x]
print(np.around(x, 2))
print('++++++++++')
x = np.random.random([2, 3])
print(x)

[0 1 2 3 4]
++++++++++
[3 5]
++++++++++
[0. 0.25 0.5 0.75 1. 1.25 1.5 1.75 2. ]
++++++++++
[ 1. 1.78 3.16 5.62 10. ]
++++++++++
[1.0000000e+01 6.0020000e+01 1.4530400e+03 4.2015859e+05 1.0000000e+10]
++++++++++
[[0.93865159 0.94697189 0.11264221]
[0.38264766 0.50437106 0.74726343]]
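
Two relationships between these functions are worth checking directly: logspace() is linspace() applied to the exponents, and arange() excludes its stop value while linspace() includes it by default. A minimal sketch, with illustrative names:

```python
import numpy as np

exponents = np.linspace(0, 1, 5)   # 0, 0.25, 0.5, 0.75, 1
powers = np.logspace(0, 1, 5)      # base 10 by default

# logspace(start, stop, num) matches 10 ** linspace(start, stop, num).
matches = np.allclose(powers, 10 ** exponents)
print(matches)   # True

# arange() excludes the stop value; linspace() can be told to do the same.
stepped = np.arange(0, 1, 0.25)
open_ended = np.linspace(0, 1, 4, endpoint=False)
print(stepped)
print(open_ended)
```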

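Creating structured arrays

For a structured array, first define the structure, then create the array with np.array(), passing the defined structure as the dtype argument.

Defining the structure with a dictionary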

import numpy as np

personType = np.dtype({
    'names': ['name', 'age', 'weight'],
    'formats': ['U30', 'i8', 'f8']})

a = np.array([('Liming', 24, 63.9), ('Mike', 15, 67.), ('Jan', 34, 45.8)],
             dtype=personType)
print(personType, type(personType))
print('++++++++++')
print(a, type(a))

[('name', '<U30'), ('age', '<i8'), ('weight', '<f8')] <class 'numpy.dtype'>
++++++++++
[('Liming', 24, 63.9) ('Mike', 15, 67. ) ('Jan', 34, 45.8)] <class 'numpy.ndarray'>

Defining the structure with a list of tuples

import numpy as np

personType = np.dtype([('name', 'U30'), ('age', 'i8'), ('weight', 'f8')])
a = np.array([('Liming', 24, 63.9), ('Mike', 15, 67.), ('Jan', 34, 45.8)],
             dtype=personType)
print(a, type(a))

[('Liming', 24, 63.9) ('Mike', 15, 67. ) ('Jan', 34, 45.8)] <class 'numpy.ndarray'>
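
Once a structured array exists, its fields can be read by name. The sketch below reuses the same records as above; the names person_type, people, ages, first_name, and adults are illustrative.

```python
import numpy as np

person_type = np.dtype([('name', 'U30'), ('age', 'i8'), ('weight', 'f8')])
people = np.array([('Liming', 24, 63.9), ('Mike', 15, 67.), ('Jan', 34, 45.8)],
                  dtype=person_type)

ages = people['age']                           # one field across all records
first_name = people[0]['name']                 # one field of one record
adults = people[people['age'] >= 18]['name']   # boolean mask on a field

print(ages)         # [24 15 34]
print(first_name)   # Liming
print(adults)       # ['Liming' 'Jan']
```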

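Array Attributes

numpy.ndarray.ndim returns the number of dimensions (axes) of the array, also called its rank: a one-dimensional array has rank 1, a two-dimensional array has rank 2, and so on.

numpy.ndarray.shape gives the dimensions of the array as a tuple; the length of this tuple is the number of dimensions, that is, the ndim attribute (rank).

numpy.ndarray.size is the total number of elements in the array, equal to the product of the entries of shape; for a matrix, this is the number of rows times the number of columns.

numpy.ndarray.dtype is the element type of the ndarray object.

numpy.ndarray.itemsize returns the size of each element in bytes.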

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6.0]])
print(a.shape)
print(a.dtype)
print(a.size)
print(a.ndim)
print(a.itemsize)

(2, 3)
float64
6
2
8
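
The relationships between these attributes can be verified on the same array. This sketch also uses nbytes, an ndarray attribute not covered above that gives the total size of the array data in bytes.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6.0]])

ndim_matches = (a.ndim == len(a.shape))
size_matches = (a.size == int(np.prod(a.shape)))
bytes_match = (a.nbytes == a.size * a.itemsize)

print(ndim_matches)   # True
print(size_matches)   # True
print(bytes_match)    # True
print(a.nbytes)       # 48
```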
