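A First Look at Data Analysis: numpy

numpy stands for Numerical Python.

numpy makes up for Python's shortcomings in numerical computing: as a general-purpose language, Python is both limited and slow at it.

numpy vs. Python: benchmark code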

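import datetime as dt
import numpy as np  # np is the conventional alias
n = 100000
# pure Python: build the squares and cubes, then add them element by element
start = dt.datetime.now()
A, B = [], []
for i in range(n):
    A.append(i ** 2)
    B.append(i ** 3)
C = []
for a, b in zip(A, B):
    C.append(a + b)
# dividing by one microsecond gives the whole interval;
# .microseconds alone is only the sub-second component
print((dt.datetime.now() - start) // dt.timedelta(microseconds=1))
# numpy: the same computation as whole-array operations
# int64 avoids overflow of i ** 3 where the default integer is 32-bit
start = dt.datetime.now()
A, B = np.arange(n, dtype=np.int64) ** 2, np.arange(n, dtype=np.int64) ** 3
C = A + B
print((dt.datetime.now() - start) // dt.timedelta(microseconds=1))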

C:\Users\lubaba\PycharmProjects\lu\venv\Scripts\python.exe C:/data/data-science/day01/code/vector.py
519341
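
A single wall-clock measurement is noisy, so the same comparison can also be sketched with the standard-library timeit module. This is only an illustration, not the original test script: the function names python_sum and numpy_sum, the smaller n, and the repeat count are assumptions made for the example.

```python
import timeit
import numpy as np

n = 10000  # smaller than the original n so repeated runs stay fast (assumption)

def python_sum():
    # pure-Python version: build the squares and cubes, then add pairwise
    A = [i ** 2 for i in range(n)]
    B = [i ** 3 for i in range(n)]
    return [a + b for a, b in zip(A, B)]

def numpy_sum():
    # vectorized version; int64 avoids overflow where the default int is 32-bit
    x = np.arange(n, dtype=np.int64)
    return x ** 2 + x ** 3

# both versions must agree before their timings are worth comparing
assert python_sum() == numpy_sum().tolist()

# total seconds for 10 calls of each version
print('python:', timeit.timeit(python_sum, number=10))
print('numpy: ', timeit.timeit(numpy_sum, number=10))
```

The timings depend on the machine, so no expected numbers are given here.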

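numpy.ndarray is a class that represents a multidimensional array. It contains:

the actual data, and the metadata that describes the actual data

Most array operations only read or modify the metadata and never touch the actual data, which is how numpy improves performance.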

For example:

Actual data: 1, 2, 3, 4, 5, 6, 7, 8, 9

Metadata: 3*3, 1*9, 9*1
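
The split between data and metadata can be observed with reshape. In the minimal sketch below, the variable names data, square and column are illustrative. It assumes the common case of a contiguous array, where reshape returns a view that shares the original buffer rather than a copy.

```python
import numpy as np

data = np.arange(1, 10)        # actual data: 1 .. 9, shape (9,)
square = data.reshape(3, 3)    # same buffer described as 3*3
column = data.reshape(9, 1)    # same buffer described as 9*1

print(data.shape, square.shape, column.shape)
print(np.shares_memory(data, square))   # whether the two arrays share a buffer

square[0, 0] = 100             # write through one view ...
print(data[0], column[0, 0])   # ... and read it back through the others
```

Only the shape information differs between the three arrays, so reshaping costs almost nothing regardless of how many elements there are.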

The elements of a numpy array generally all share one data type; numpy trades flexibility for superior performance.
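
A short sketch of what "one data type" means in practice: when the input mixes integers and floats, numpy converts everything to a single type that can hold all the values. The names mixed_list and arr are illustrative.

```python
import numpy as np

mixed_list = [1, 2.5, 3]            # a Python list may mix int and float
arr = np.array(mixed_list)          # numpy picks one type that can hold them all

print([type(x).__name__ for x in mixed_list])
print(arr.dtype)                    # a single dtype for every element
print(arr.itemsize, arr.nbytes)     # bytes per element, bytes in total
```

Because every element has the same size, the position of any element in memory can be computed directly from its index.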

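The elements of a numpy array can be accessed by index.

The ndarray.dtype attribute gives the data type of the elements.

The ndarray.shape attribute gives the dimensions of the array, as a tuple when there is more than one.

Example code: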

import numpy as np
a = np.arange(1, 3)
print(a.shape, a, sep='\n')
b = np.array([
    [1, 2, 3],
    [4, 5, 6]])
print(b.shape, b, sep='\n')
c = np.array([
    [np.arange(1, 5),
     np.arange(5, 9),
     np.arange(9, 13)],
    [np.arange(13, 17),
     np.arange(17, 21),
     np.arange(21, 25)]])
print(c.shape, c, sep='\n')

Output:

(2,)
[1 2]
(2, 3)
[[1 2 3]
 [4 5 6]]
(2, 3, 4)
[[[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]]

 [[13 14 15 16]
  [17 18 19 20]
  [21 22 23 24]]]

Array indexing

array[page index][row index][column index]

array[page index, row index, column index]

import numpy as np
a = np.array([
    [[1, 2],
     [3, 4]],
    [[5, 6],
     [7, 8]]
])
# print(a)
# print(a[0])
# print(a[0][0])
print(a[0][0][0])
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        for k in range(a.shape[2]):
            print(a[i, j, k])
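
The two index spellings listed above select the same element. The sketch below checks this for every position, using np.ndindex as a compact replacement for the three nested loops; that substitution is a choice made for this illustration, not part of the original code.

```python
import numpy as np

a = np.array([
    [[1, 2],
     [3, 4]],
    [[5, 6],
     [7, 8]]])

# chained indexing peels off one axis at a time
print(a[1].shape, a[1][0].shape)
# the two spellings select the same element
print(a[1][0][1], a[1, 0, 1])

# np.ndindex yields every (page, row, column) index tuple for the given shape
for idx in np.ndindex(a.shape):
    assert a[idx] == a[idx[0]][idx[1]][idx[2]]
```

The comma form does the lookup in a single indexing operation, while the chained form creates an intermediate array at each step.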

Data types
1) Every type is a class
type(1) -> <class 'int'>
2) Types are objects too; a type object also has a type, and the type of a type object is type
type(type(1)) -> <class 'type'>
type(type(type(1))) -> <class 'type'>
3) Python data types
bool/int/float/complex/str/tuple/list/dict/set
4) numpy data types
bool_
int8/int16/int32/int64
uint8/uint16/uint32/uint64
float16/float32/float64
complex64/complex128
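
The number in each numpy type name is its width in bits. As a small sketch, the byte size and the value range of these types can be inspected with np.dtype(...).itemsize and np.iinfo; the loop variables t and info are illustrative.

```python
import numpy as np

for t in (np.int8, np.uint8, np.int16, np.int32, np.int64):
    info = np.iinfo(t)  # integer limits for this type
    print(t.__name__, np.dtype(t).itemsize, info.min, info.max)

for t in (np.float16, np.float32, np.float64, np.complex64, np.complex128):
    # complex types store two floats, so complex64 is a pair of float32 values
    print(t.__name__, np.dtype(t).itemsize)
```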

Example:

import numpy as np
a = False
print(a, type(a))
b = 1
print(b, type(b))
c = 2.3
print(c, type(c))
d = 4 + 5j
print(d, type(d))
e = 'Hello, World!'
print(e, type(e))
f = ('florence', 20)
print(f, type(f))
g = [6, 7]
print(g, type(g))
h = {'name': 'florence', 'age': 20}
print(h, type(h))
i = {'florence', 'edward'}
print(i, type(i))


class Student():

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        return '{} {}'.format(
            self.__class__.__name__,
            self.__dict__)


j = Student('florence', 20)
print(j, type(j))
Teacher = np.dtype([('name', np.str_, 40),
                    ('age', np.uint8)])
k = np.array([('edward', 40)], dtype=Teacher)
print(k, type(k))
print(k[0], type(k[0]))
l = 10
print(l)  # 10
print(type(l))  # <class 'int'>
print(type(type(l)))  # <class 'type'>
print(type(type(type(l))))  # <class 'type'>
print(bool)
print(int)
print(float)
print(complex)
print(str)
print(tuple)
print(list)
print(dict)
print(set)
print(np.bool_)
print(np.int8)
print(np.int16)
print(np.int32)
print(np.int64)
print(np.uint8)
print(np.uint16)
print(np.uint32)
print(np.uint64)
print(np.float16)
print(np.float32)
print(np.float64)
print(np.complex64)
print(np.complex128)
print(np.str_)
print(Teacher.type)
print(Student)

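Specifying the element data type with dtype

array_object = np.array([initial values], dtype=element_type)
1) numpy type
..., dtype=np.int8
2) generic type (passing abstract types such as np.integer as a dtype is deprecated in recent NumPy releases)
..., dtype=np.integer
3) Python type
..., dtype=int
4) type name string
..., dtype='int8'
5) type code string
..., dtype='i1'
6) variable-length type + length
..., dtype=(np.str_, 14)
7) fixed-length type + dimension
..., dtype=(np.int8, 3)
8) several type code strings separated by commas
..., dtype='U14,i1'
9) [(name, type, length/dimension), ...]
..., dtype=[('name', np.str_, 14), ('age', np.int8)]
10) (source type, target type)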
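
Several of these forms can be tried side by side. The sketch below covers forms 1, 3, 4, 5, 6, 8 and 9 only; the variable names and sample values are illustrative. The field names f0 and f1 are the defaults numpy assigns when a comma-separated string gives no names.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int8)        # 1) numpy type
b = np.array([1, 2, 3], dtype=int)            # 3) Python type
c = np.array([1, 2, 3], dtype='int8')         # 4) type name string
d = np.array([1, 2, 3], dtype='i1')           # 5) type code string
print(a.dtype == c.dtype == d.dtype, b.dtype)

e = np.array(['numpy'], dtype=(np.str_, 14))  # 6) variable-length type + length
print(e.dtype)

g = np.array([('numpy', 1)], dtype='U14,i1')  # 8) comma-separated type codes
print(g['f0'][0], g['f1'][0])                 # unnamed fields default to f0, f1

h = np.array([('edward', 40)],                # 9) list of (name, type, length)
             dtype=[('name', np.str_, 14), ('age', np.int8)])
print(h['name'][0], h[0]['age'])
```

Forms 1, 4 and 5 all describe the same 8-bit integer type, so which one to use is mostly a matter of readability.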

Slicing:

array[dim-1 start:stop:step, dim-2 start:stop:step, ...]

import numpy as np
a = np.arange(1, 10)
print(a)  # 1 2 3 4 5 6 7 8 9
print(a[:3])  # 1 2 3
print(a[3:6])  # 4 5 6
print(a[6:])  # 7 8 9
print(a[::-1])  # 9 8 7 6 5 4 3 2 1

import numpy as np
b = np.arange(1, 25).reshape(2, 3, 4)
# print(b) # two pages, three rows, four columns
# print(b[:, 0, 0])
# print(b[0, :, :])
# print(b[0, ...])
# print(b[0, 1, ::2])
# print(b[..., 1])
# print(b[0, ::-1, ::-1])
print(b[-1, 1:, 2:])
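
To make the three-dimensional slices concrete, the sketch below runs a few of them on the same array and then shows that a basic slice is a view onto the original data rather than a copy. The name corner is illustrative.

```python
import numpy as np

b = np.arange(1, 25).reshape(2, 3, 4)  # two pages, three rows, four columns

print(b[:, 0, 0])      # first element of each page
print(b[0, 1, ::2])    # page 0, row 1, every second column
print(b[..., 1])       # column 1 of every row on every page
print(b[-1, 1:, 2:])   # last page, rows 1 onward, columns 2 onward

# a basic slice is a view: writing through it changes the original array
corner = b[-1, 1:, 2:]
corner[:] = 0
print(b[-1])
```

When an independent copy is needed, calling .copy() on the slice avoids modifying the original array.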
