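Data Mining: Basic NumPy Usage (Part 1)

NumPy is conventionally imported under the alias np (import numpy as np).

Understanding the N-dimensional array: ndarray attributes

Array attributes describe information intrinsic to the array itself.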

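Attribute	Description
ndarray.shape	Tuple of the array's dimensions
ndarray.ndim	Number of dimensions
ndarray.size	Number of elements in the array
ndarray.itemsize	Size of one array element, in bytes
ndarray.dtype	Type of the array's elements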
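
As a quick illustration of the attributes in the table, the sketch below inspects a small array. The array contents and the explicit int64 dtype are arbitrary choices made for this example, not something taken from the original text.

```python
import numpy as np

# A 2x3 array; the dtype is fixed explicitly so itemsize is predictable.
a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(a.shape)     # (2, 3)
print(a.ndim)      # 2
print(a.size)      # 6
print(a.itemsize)  # 8
print(a.dtype)     # int64
```

Note that size is the product of the entries in shape, and size * itemsize gives the number of bytes occupied by the data.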

dtype is an instance of numpy.dtype. The element types available for arrays are:

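Name	Description	Shorthand
np.bool_	Boolean (True or False), stored in one byte	'?'
np.int8	Integer, one byte, -128 to 127	'i1'
np.int16	Integer, -32768 to 32767	'i2'
np.int32	Integer, -2 ** 31 to 2 ** 31 - 1	'i4'
np.int64	Integer, -2 ** 63 to 2 ** 63 - 1	'i8'
np.uint8	Unsigned integer, 0 to 255	'u1'
np.uint16	Unsigned integer, 0 to 65535	'u2'
np.uint32	Unsigned integer, 0 to 2 ** 32 - 1	'u4'
np.uint64	Unsigned integer, 0 to 2 ** 64 - 1	'u8'
np.float16	Half-precision float: 16 bits (sign 1 bit, exponent 5 bits, mantissa 10 bits)	'f2'
np.float32	Single-precision float: 32 bits (sign 1 bit, exponent 8 bits, mantissa 23 bits)	'f4'
np.float64	Double-precision float: 64 bits (sign 1 bit, exponent 11 bits, mantissa 52 bits)	'f8'
np.complex64	Complex number, real and imaginary parts each stored as a 32-bit float	'c8'
np.complex128	Complex number, real and imaginary parts each stored as a 64-bit float	'c16'
np.object_	Python object	'O'
np.bytes_	Byte string (named np.string_ before NumPy 2.0)	'S'
np.str_	Unicode string (named np.unicode_ before NumPy 2.0)	'U'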
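
The type names and shorthand codes in the table can be checked directly. The following is a minimal sketch; the variable names and sample values are made up for illustration.

```python
import numpy as np

# A dtype can be given as a NumPy type or as its shorthand string.
f = np.array([1, 2, 3], dtype=np.float32)
g = np.array([1, 2, 3], dtype='f4')
print(f.dtype == g.dtype)                  # True
print(np.dtype('i1') == np.dtype(np.int8)) # True

# np.iinfo reports the representable range of an integer type.
info = np.iinfo(np.int32)
print(info.min, info.max)                  # -2147483648 2147483647

# astype returns a converted copy; float -> int truncates toward zero.
h = np.array([1.7, -2.3]).astype(np.int16)
print(h, h.dtype)                          # [ 1 -2] int16
```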

Basic operations

Creating arrays

Arrays of all zeros or all ones

All zeros:

import numpy as np

# zeros(shape[, dtype, order]); zeros_like(a[, dtype, order, subok])
zero = np.zeros([2, 3])

"""
Value of zero:
array([[0., 0., 0.],
       [0., 0., 0.]])
"""

All ones:

# ones(shape[, dtype, order])
one = np.ones([2, 3])
"""
Value of one:
array([[1., 1., 1.],
       [1., 1., 1.]])
"""

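The signatures above also mention the dtype argument and zeros_like, which the examples do not exercise. A small sketch, assuming an arbitrary int32 template array that is not part of the original text:

```python
import numpy as np

# Hypothetical template array used only to supply a shape and dtype.
template = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

# zeros_like copies the shape and dtype of its argument.
z = np.zeros_like(template)
print(z.shape, z.dtype)   # (2, 3) int32

# dtype overrides the default float64 element type.
o = np.ones((2, 3), dtype=np.int32)
print(o.dtype)            # int32
```
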
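Creating arrays from existing arrays
1. array(object[, dtype, copy, order, subok, ndmin])
2. asarray(a[, dtype, order]): when a is already an ndarray of the requested dtype, no copy is made; the result refers to the same data, so changes to the original array are visible through it
3. copy(a[, order])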
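
The difference between the three functions shows up once the source array is modified. A minimal sketch with made-up values:

```python
import numpy as np

original = np.array([1, 2, 3])

copied = np.array(original)     # np.array copies by default
aliased = np.asarray(original)  # already an ndarray: no copy is made
explicit = np.copy(original)    # always an independent copy

original[0] = 100

print(copied[0])            # 1
print(aliased[0])           # 100
print(explicit[0])          # 1
print(aliased is original)  # True
```

So asarray is the cheap choice when a function only needs to be sure it has an ndarray, while array or copy is the safe choice when the result must not change along with the input.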

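Creating arrays over a fixed range

np.linspace(start, stop, num, endpoint, retstep, dtype)
"""
start     first value of the sequence
stop      last value of the sequence;
          included in the sequence when endpoint is True
num       number of evenly spaced samples to generate, default 50
endpoint  whether stop is included in the sequence, default True
retstep   if True, return the samples together with
          the step between consecutive values
dtype     data type of the output ndarray
"""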

# Generate an evenly spaced array
np.linspace(0, 100, 10)
# Result
array([  0.        ,  11.11111111,  22.22222222,  33.33333333,
        44.44444444,  55.55555556,  66.66666667,  77.77777778,
        88.88888889, 100.        ])
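
The endpoint and retstep parameters described above are easiest to see on a small range. A brief sketch; the range 0 to 1 and the sample count are arbitrary.

```python
import numpy as np

# retstep=True returns a (samples, step) pair.
samples, step = np.linspace(0, 1, 5, retstep=True)
print(samples)  # [0.   0.25 0.5  0.75 1.  ]
print(step)     # 0.25

# endpoint=False leaves stop out, which changes the spacing.
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)  # [0.  0.2 0.4 0.6 0.8]
```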

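Other functions
1. numpy.arange(start, stop, step, dtype): evenly spaced values in [start, stop), separated by step
2. numpy.logspace(start, stop, num, endpoint, base, dtype): num values evenly spaced on a log scale, from base ** start to base ** stop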
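
A short sketch of both functions; the argument values are chosen only for illustration.

```python
import numpy as np

# arange takes a step and excludes stop, like Python's range.
evens = np.arange(0, 10, 2)
print(evens)    # [0 2 4 6 8]

# logspace takes exponents: 4 points from 10**0 to 10**3.
powers = np.logspace(0, 3, 4)
print(powers)   # [   1.   10.  100. 1000.]

# base changes the base of the exponent (default 10).
powers2 = np.logspace(0, 3, 4, base=2)
print(powers2)  # [1. 2. 4. 8.]
```

Compared with linspace, arange fixes the step and lets the number of elements follow, whereas linspace fixes the number of elements and lets the step follow.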

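Creating random arrays
1. Uniform distribution
  1. np.random.rand(d0, d1, ..., dn)
    
    Returns an array of shape (d0, d1, ..., dn) of samples drawn uniformly from [0.0, 1.0).
  2. np.random.uniform(low=0.0, high=1.0, size=None)
    
    Purpose: draws random samples from a uniform distribution over [low, high). The interval is half-open: low is included, high is excluded.
    
    Parameters:
    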
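    low: lower bound of the sampling interval, float, default 0;
    
    high: upper bound of the sampling interval, float, default 1;
    
    size: output shape, an int or a tuple of ints; for example, size=(m, n, k) draws m * n * k samples. If omitted, a single value is returned.
    
    Returns: an ndarray whose shape matches size.
  3. np.random.randint(low, high=None, size=None, dtype='l')
    
    Draws random integers from a discrete uniform distribution and returns a single integer or an N-dimensional integer array. If high is not None, values are drawn from [low, high); otherwise they are drawn from [0, low).
2. Normal distribution
  1. np.random.randn(d0, d1, …, dn)
    
    Purpose: returns one or more samples from the standard normal distribution.
  2. np.random.normal(loc=0.0, scale=1.0, size=None)
    
    loc: float
    
    Mean of the distribution (the centre of the whole distribution).
    
    scale: float
    
    Standard deviation of the distribution (its width: a larger scale gives a flatter, wider curve; a smaller scale gives a taller, narrower one).
    
    size: int or tuple of ints
    
    Output shape; default None, in which case a single value is returned.
  3. np.random.standard_normal(size=None)
    
    Returns an array of the given shape drawn from the standard normal distribution.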
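
The functions listed above can be tried together as follows. This is only a sketch: the seed, shapes, and parameter values are arbitrary, and because the values are random the comments describe ranges rather than exact output.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make reruns repeatable

# Uniform distribution
u1 = np.random.rand(2, 3)                                # values in [0, 1)
u2 = np.random.uniform(low=-1.0, high=1.0, size=(2, 3))  # values in [-1, 1)
ints = np.random.randint(0, 10, size=5)                  # integers in [0, 10)

# Normal distribution
n1 = np.random.randn(2, 3)                               # standard normal
n2 = np.random.normal(loc=5.0, scale=2.0, size=10000)    # mean 5, std 2
n3 = np.random.standard_normal(size=(2, 3))              # standard normal

print(u1.shape, u2.shape, ints.shape, n1.shape, n3.shape)
# With many samples the sample mean and standard deviation
# come out close to loc and scale, but not exactly equal.
print(n2.mean(), n2.std())
```

Note the two calling conventions: rand and randn take the dimensions as separate positional arguments, while the other functions take the shape through size.
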
Array indexing and slicing
1. Slicing (similar to list slicing): [group index, list slice]
2. Indexing: [group, index]
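
A small sketch of both forms on a two-dimensional array. Reading the first position as the row (the "group") and the second as the column is an assumption made for this example; the array itself is made up.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
# a is:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Indexing: one position per axis selects a single element.
print(a[1, 2])     # 6

# Slicing: an index on the first axis, a list-style slice on the second.
print(a[0, 1:3])   # [1 2]

# A bare colon selects everything along that axis.
print(a[:, 0])     # [0 4 8]

# Slices on both axes give a sub-array.
print(a[1:, :2])   # [[4 5]
                   #  [8 9]]
```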
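Removing duplicates from an array
1. np.unique(arr): call the unique function directly to remove duplicates (unique is a NumPy function, not an ndarray method)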
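
A minimal sketch of deduplication with np.unique; the input values are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2, 2], [3, 1, 4]])

# The input is flattened; the result is sorted and free of duplicates.
u = np.unique(a)
print(u)        # [1 2 3 4]

# return_counts=True also reports how often each value occurred.
values, counts = np.unique(a, return_counts=True)
print(counts)   # [2 2 1 1]
```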
