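Using NumPy for Data Analysis

Generating numbers with NumPy

Every array created below is of type ndarray.

import random
import numpy as np

t1 = np.array([1,2,3,4,5])
print(t1,type(t1))        # the type is ndarray

t2 = np.array(range(10))
print(t2)

t3 = np.arange(10)   # equivalent to array + range
print(t3,t3.dtype)  # dtype shows the element type (int32 here; the default integer type depends on the platform)

t4 = np.arange(10,dtype="f2")   # specify the data type ("f2" is float16)
print(t4.dtype)

t5 = np.array([random.random() for i in range(10)])  # 10 random floats in [0, 1)
print(t5)

t6 = np.round(t5,2)  # round to two decimal places
print(t6)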

 
      
       
         
         
[1 2 3 4 5] <class 'numpy.ndarray'>
[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9] int32
float16
[0.71127883 0.16178949 0.57974356 0.92394061 0.29455775 0.44950361
 0.30519271 0.23295048 0.24572958 0.85217598]
[0.71 0.16 0.58 0.92 0.29 0.45 0.31 0.23 0.25 0.85]

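Common NumPy data types

Checking an array's shape (rows and columns)

One-dimensional array

a1 = np.arange(12)
print(a1)
a1.shape   # (12,)

[ 0  1  2  3  4  5  6  7  8  9 10 11]

Two-dimensional array

a2 = np.array([[1,2,3],[4,5,6]])
print(a2)
a2.shape   # (2, 3): 2 rows, 3 columns

[[1 2 3]
 [4 5 6]]

Three-dimensional array

a3 = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
print(a3)
a3.shape   # (2, 2, 3): 2 blocks, each with 2 rows and 3 columns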

[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]

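Changing an array's shape

b1 = np.arange(12)
b1.reshape(3,4)   # reshape the original array into a 2-D array with 3 rows and 4 columns

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

b2 = np.arange(24).reshape((2,3,4))   # 2 is the number of blocks; (3,4) is the shape of each block
print(b2)

b2.reshape(4,6)  # reshape b2 into a 2-D array with 4 rows and 6 columns; reshape returns the result and leaves b2's own shape unchanged

# Two ways to turn b2 into a 1-D array
b2.flatten()
b2.reshape((24,))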


[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23])
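The note that reshape returns a value instead of changing b2 can be checked directly. The following is a minimal sketch: the names `r`, `flat_view` and `flat_copy` are illustrative, and the use of -1 (which lets NumPy infer one dimension from the total element count) goes beyond what the notes above show.

```python
import numpy as np

b2 = np.arange(24).reshape((2, 3, 4))

# reshape returns a new array object; b2 keeps its own shape.
r = b2.reshape(4, 6)
print(b2.shape, r.shape)      # (2, 3, 4) (4, 6)

# -1 asks NumPy to infer that dimension from the total element count.
flat_view = b2.reshape(-1)
print(flat_view.shape)        # (24,)

# flatten() always returns a copy, so writing to it leaves b2 untouched.
flat_copy = b2.flatten()
flat_copy[0] = 999
print(b2[0, 0, 0])            # 0

# To change the shape held by the name b2, assign the result back.
b2 = b2.reshape(4, 6)
print(b2.shape)               # (4, 6)
```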

Array arithmetic

Arithmetic between an array and a number (broadcasting)

c1 = np.arange(12)
print(c1)

# Broadcasting: when an array is combined with a single number, the operation is applied to every element of the array

c1+2   # [ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13]
c1*2   # [ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22]
c1/2   # [0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5]
c1/0   # [nan, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf]  (a warning, not an exception)
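The c1/0 line deserves a closer look: NumPy does not raise an exception here. It emits a RuntimeWarning and produces nan for 0/0 and inf for a positive number divided by 0. A minimal sketch, using np.errstate to silence the warnings for one block only (the name `result` is illustrative):

```python
import numpy as np

c1 = np.arange(12)

# Suppress the divide-by-zero and invalid-value warnings inside this block only.
with np.errstate(divide="ignore", invalid="ignore"):
    result = c1 / 0

print(result)
print(np.isnan(result[0]))          # True: 0/0 is nan
print(np.isinf(result[1:]).all())   # True: a positive number divided by 0 is inf
```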

Arithmetic between arrays (same shape)

c2 = np.arange(24)
c3 = np.arange(100,124)
print(c2,c3)

# When the two arrays have the same length,
# the operation is applied to corresponding pairs of elements
c2+c3   # [100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124,
        #  126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146]

c2*c3   # [   0,  101,  204,  309,  416,  525,  636,  749,  864,  981, 1100,
        #  1221, 1344, 1469, 1596, 1725, 1856, 1989, 2124, 2261, 2400, 2541,
        #  2684, 2829]

c2/c3   # [0.        , 0.00990099, 0.01960784, 0.02912621, 0.03846154,
        #  0.04761905, 0.05660377, 0.06542056, 0.07407407, 0.08256881,
        #  0.09090909, 0.0990991 , 0.10714286, 0.11504425, 0.12280702,
        #  0.13043478, 0.13793103, 0.14529915, 0.15254237, 0.15966387,
        #  0.16666667, 0.17355372, 0.18032787, 0.18699187]

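Arithmetic between arrays of different shapes

# This works when the shapes are compatible: a dimension of size 1 is stretched to match the other array
n1 = np.arange(12).reshape((4,3))
n2 = np.arange(4).reshape((4,1))
print(n1)
print(n2)
n1+n2        # n1 and n2 have the same number of rows; n2's single column is added to every column of n1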

array([[ 0,  1,  2],
       [ 4,  5,  6],
       [ 8,  9, 10],
       [12, 13, 14]])

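Summary:

Two arrays of different shapes can be combined when, comparing their shapes from the last dimension backwards, each pair of dimensions is equal or one of them is 1.
  - If any pair of dimensions differs and neither of them is 1, the operation fails.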
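Reading data with NumPy

Transposing rows and columns

Method 1

np.loadtxt(frame,unpack=True)  # frame is the file to read; setting unpack=True in loadtxt also transposes the data read from the file

Method 2

s1.transpose()

Method 3

s1.T

Method 4

# axis 0 is the first axis (rows), axis 1 is the second axis (columns)
s1.swapaxes(1,0)  # swap the two axes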
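The four methods can be compared side by side. This is a minimal sketch under two assumptions: `s1` is a small made-up 2-D array, and the file is replaced by an in-memory io.StringIO object holding comma-separated text, so that the example runs without a data file.

```python
import io
import numpy as np

s1 = np.arange(6).reshape(2, 3)

# On a 2-D array, all three calls swap the two axes.
print(s1.transpose())
print(np.array_equal(s1.T, s1.transpose()))       # True
print(np.array_equal(s1.T, s1.swapaxes(1, 0)))    # True

# loadtxt with unpack=True returns the transpose of what it would otherwise read.
text = "1,2,3\n4,5,6\n"
plain = np.loadtxt(io.StringIO(text), delimiter=",")
unpacked = np.loadtxt(io.StringIO(text), delimiter=",", unpack=True)
print(plain.shape, unpacked.shape)                # (2, 3) (3, 2)
```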

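Indexing and slicing in NumPy

z1 = np.arange(24).reshape(4,6)

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]

Row operations

# Take one row
print(z1[2])   # a single index in square brackets selects a row

# Take several consecutive rows
print(z1[[1,2]])
print(z1[1:])

# Take several non-consecutive rows
print(z1[[1,3]])

Column operations

# Take one column
print(z1[:,1])

# Take several consecutive columns
print(z1[:,3:])

# Take several non-consecutive columns
print(z1[:,[1,3,5]])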

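Taking rows and columns together

# Take the value at row 3, column 4. Note that indices start at 0 for both rows and columns, whereas people usually count rows and columns from 1
print(z1[2,3])


z2 = np.arange(100).reshape(10,10)
print(z2)
# Take several rows and columns: rows 3 to 6 and columns 2 to 5
print(z2[2:6,1:5])   # selects the block where those rows and columns intersect


# Take several non-adjacent values
# print(z2[[1,2],[2,4]])  # takes row 2, column 3 and row 3, column 5; the selected points are (1,2) and (2,4)

print(z2[[6,7,8],[6,7,8]])  # the selected points are (6,6), (7,7), (8,8)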
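Slices select a whole block of intersections, while two index lists are paired up point by point. To get the block behaviour with index lists, one option is np.ix_, which is not used in the notes above and is shown here only as a sketch; the names `points` and `block` are illustrative.

```python
import numpy as np

z2 = np.arange(100).reshape(10, 10)

# Two index lists are paired element by element: points (6,6), (7,7), (8,8).
points = z2[[6, 7, 8], [6, 7, 8]]
print(points)                 # [66 77 88]

# np.ix_ builds an open mesh, so every listed row meets every listed column.
block = z2[np.ix_([6, 7, 8], [6, 7, 8])]
print(block.shape)            # (3, 3)
print(block)
```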

Modifying values in NumPy

Reassignment

# Select the value, then assign to it
res = np.arange(12).reshape(3,4)
# Change 6 to 100
res[1,2]=100
print(res)

# Change several values at once
res[1:2]=[3,3,3,4]
print(res)

[[  0   1   2   3]
 [  4   5 100   7]
 [  8   9  10  11]]
[[ 0  1  2  3]
 [ 3  3  3  4]
 [ 8  9 10 11]]

Selecting values by condition

ret = np.arange(100).reshape(10,10)
print(ret)

# Assign to every element that satisfies the condition
ret[ret<50] = 666
print(ret)


[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]

[[666 666 666 666 666 666 666 666 666 666]
 [666 666 666 666 666 666 666 666 666 666]
 [666 666 666 666 666 666 666 666 666 666]
 [666 666 666 666 666 666 666 666 666 666]
 [666 666 666 666 666 666 666 666 666 666]
 [ 50  51  52  53  54  55  56  57  58  59]
 [ 60  61  62  63  64  65  66  67  68  69]
 [ 70  71  72  73  74  75  76  77  78  79]
 [ 80  81  82  83  84  85  86  87  88  89]
 [ 90  91  92  93  94  95  96  97  98  99]]
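What happens in ret[ret<50] = 666 can be split into steps: the comparison builds a boolean array, and that array is then used as an index. A minimal sketch, with the intermediate name `mask` introduced for illustration:

```python
import numpy as np

ret = np.arange(100).reshape(10, 10)

# The comparison produces a boolean array of the same shape.
mask = ret < 50
print(mask.dtype, mask.shape)   # bool (10, 10)
print(mask.sum())               # 50 elements satisfy the condition

# Indexing with the mask selects those elements as a 1-D array.
print(ret[mask][:5])            # [0 1 2 3 4]

# Assigning through the mask modifies ret in place.
ret[mask] = 666
print(ret[0, 0], ret[5, 0])     # 666 50
```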

Ternary operation

tt = np.arange(100).reshape(10,10)
np.where(tt<50,0,1)   # values below 50 become 0; all others (50 and above) become 1

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
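Two details of np.where are easy to miss: the value 50 itself falls in the "else" branch, and the call returns a new array instead of modifying tt. A minimal sketch; the names `flags` and `kept` are illustrative.

```python
import numpy as np

tt = np.arange(100).reshape(10, 10)

# np.where(condition, x, y) picks x where the condition holds and y elsewhere.
flags = np.where(tt < 50, 0, 1)
print(flags[4, 9], flags[5, 0])   # 0 1  (49 -> 0, 50 -> 1)

# tt itself is not modified; np.where returns a new array.
print(tt[5, 0])                   # 50

# x and y may also be arrays: keep values >= 50 and replace the rest with -1.
kept = np.where(tt >= 50, tt, -1)
print(kept[0, 0], kept[9, 9])     # -1 99
```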

clip in NumPy

w = np.arange(100).reshape(10,10)
w.clip(50,60)     # values below 50 become 50; values above 60 become 60

array([[50, 50, 50, 50, 50, 50, 50, 50, 50, 50],
       [50, 50, 50, 50, 50, 50, 50, 50, 50, 50],
       [50, 50, 50, 50, 50, 50, 50, 50, 50, 50],
       [50, 50, 50, 50, 50, 50, 50, 50, 50, 50],
       [50, 50, 50, 50, 50, 50, 50, 50, 50, 50],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 60, 60, 60, 60, 60, 60, 60, 60, 60],
       [60, 60, 60, 60, 60, 60, 60, 60, 60, 60],
       [60, 60, 60, 60, 60, 60, 60, 60, 60, 60],
       [60, 60, 60, 60, 60, 60, 60, 60, 60, 60]])
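Clipping is equivalent to taking an element-wise maximum with the lower bound and then a minimum with the upper bound. The sketch below checks that equivalence and also shows a one-sided clip via np.clip with None as the upper bound; the names `clipped`, `same` and `lower_only` are illustrative.

```python
import numpy as np

w = np.arange(100).reshape(10, 10)

clipped = w.clip(50, 60)

# The same result built from element-wise maximum and minimum.
same = np.minimum(np.maximum(w, 50), 60)
print(np.array_equal(clipped, same))         # True

# clip returns a new array; w keeps its original values.
print(w.min(), w.max())                      # 0 99

# Passing None for one bound clips on the other side only.
lower_only = np.clip(w, 50, None)
print(lower_only.min(), lower_only.max())    # 50 99
```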

Concatenating arrays
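A minimal sketch of joining two arrays, assuming np.vstack and np.hstack are the tools intended under this heading; the arrays `t1` and `t2` are made up for illustration.

```python
import numpy as np

t1 = np.arange(12).reshape(2, 6)
t2 = np.arange(12, 24).reshape(2, 6)

# Stack vertically: the rows of t2 go below the rows of t1.
v = np.vstack((t1, t2))
print(v.shape)   # (4, 6)

# Stack horizontally: the columns of t2 go to the right of t1.
h = np.hstack((t1, t2))
print(h.shape)   # (2, 12)
```

Vertical stacking requires the same number of columns in both arrays, and horizontal stacking requires the same number of rows.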

Row and column conversion of arrays

Generating random numbers with NumPy

nan and inf in NumPy

Notes on nan
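A few properties of nan that commonly cause surprises, shown as a minimal sketch; the array `t` is made up for illustration, and np.nansum is included as one way to skip nan values in a sum.

```python
import numpy as np

t = np.array([1.0, np.nan, 3.0, np.nan])

print(np.nan == np.nan)          # False: nan is not equal to itself
print(type(np.nan))              # <class 'float'>

# Because nan != nan, counting where t != t counts the nan entries.
print(np.count_nonzero(t != t))  # 2
print(np.isnan(t).sum())         # 2

# Arithmetic that involves nan yields nan, unless nan is skipped explicitly.
print(np.sum(t))                 # nan
print(np.nansum(t))              # 4.0
```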

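axis=0 runs down the rows, so a reduction along it gives one result per column.

axis=1 runs across the columns, so a reduction along it gives one result per row.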
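The effect of the axis argument is easiest to see on a small array. A minimal sketch using sum as the reduction; the array `t` is made up for illustration.

```python
import numpy as np

t = np.arange(12).reshape(3, 4)
print(t)

# axis=0 collapses the rows: one sum per column.
print(t.sum(axis=0))   # [12 15 18 21]

# axis=1 collapses the columns: one sum per row.
print(t.sum(axis=1))   # [ 6 22 38]
```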

What is the median?

[1,2,3,4,5]    # the median is 3
[1,2,3,4,5,6]  # the median is (3+4)/2 = 3.5
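The two cases above can be reproduced with np.median, which also accepts an axis argument. A minimal sketch; the 2-D array `t` is made up for illustration.

```python
import numpy as np

print(np.median([1, 2, 3, 4, 5]))      # 3.0
print(np.median([1, 2, 3, 4, 5, 6]))   # 3.5

t = np.array([[1, 5, 3],
              [4, 2, 6]])

# One median per column, then one median per row.
print(np.median(t, axis=0))   # [2.5 3.5 4.5]
print(np.median(t, axis=1))   # [3. 4.]
```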

Common statistical functions in NumPy

Deleting elements in NumPy
