Pandas Data Structures

1. Series

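A Series is a one-dimensional, array-like object made up of two parts: values (the data) and index (the labels attached to the data). There are two common ways to create one.

1. Pass in a list directly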

import pandas as pd
from pandas import Series

s1 = Series([1,2,3,4])
s1

0    1
1    2
2    3
3    4
dtype: int64

Inspect the attributes of a Series object:
    s1.index  # the index labels
    s1.values  # the underlying data as a NumPy array

You can also pass the index parameter, in which case its value is used as the index:
s2 = Series(data=[1,2,3,4],index=list('abcd'))
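
The two constructor forms above can be compared side by side. This is a minimal sketch; the variable names simply mirror the ones used in these notes.

```python
import pandas as pd

# Without an index argument, pandas assigns positions 0..n-1 as labels.
s1 = pd.Series([1, 2, 3, 4])

# With an index argument, the given labels are used instead.
s2 = pd.Series(data=[1, 2, 3, 4], index=list('abcd'))

print(list(s1.index))  # [0, 1, 2, 3]
print(list(s2.index))  # ['a', 'b', 'c', 'd']
print(s2.values)       # [1 2 3 4]
```

The data is the same in both cases; only the labels differ.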

2. Create from a dict

Series({'a':1,'b':2,'c':3})
a    1
b    2
c    3
dtype: int64
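
As a small sketch of the dict form: the keys become the index labels. The second half goes slightly beyond the notes above and shows what happens when an index argument is also passed. The labels in `wanted` are made up for illustration.

```python
import pandas as pd

# Dict keys become the index, dict values become the data.
s = pd.Series({'a': 1, 'b': 2, 'c': 3})
print(list(s.index))  # ['a', 'b', 'c']

# Passing index as well selects and reorders by label;
# a label that is not a key in the dict ('z' here) gets NaN.
wanted = ['c', 'a', 'z']  # hypothetical labels
picked = pd.Series({'a': 1, 'b': 2, 'c': 3}, index=wanted)
print(picked)
```

Because NaN is a float, `picked` ends up with a float dtype even though the dict held integers.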

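Explicit (label-based) indexing with .loc (the labels 语, 数, 外, 综 below stand for Chinese, math, foreign language, and comprehensive):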

s1 = Series(data=[150,150,150,300],index=list('语数外综'))
s1
语    150
数    150
外    150
综    300
dtype: int64

s1.loc[['语','外']]  # to take several values along one dimension, wrap the labels in a list
s1.loc[['语','语']]
s1.loc[['综','语']]
s1.loc['语':'外']  # label-based slice: both the start and the end label are included

s2 = Series(data=[1,2,3,4,5,6],index=list('abcdef'))
s2

s2.loc['b':'e':2]  # a step works too; here the step is 2
# s2.loc['e':'b':-1]  # note: to slice backwards, the start and end labels must be reversed as well
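
The stepped and reversed label slices can be checked with a short sketch. It assumes the same `s2` as above; the names `forward`, `backward` and `empty` are illustrative.

```python
import pandas as pd

s2 = pd.Series([1, 2, 3, 4, 5, 6], index=list('abcdef'))

# Forward slice with step 2: labels b and d.
forward = s2.loc['b':'e':2]

# Reversed slice: the bounds are given in reverse order to match the negative step.
backward = s2.loc['e':'b':-1]

# Forward bounds with a negative step select nothing.
empty = s2.loc['b':'e':-1]

print(list(forward.index))   # ['b', 'd']
print(list(backward.index))  # ['e', 'd', 'c', 'b']
print(len(empty))            # 0
```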

Implicit (position-based) indexing with .iloc:

s2.iloc[0]
# indexing with a list of integers also works through iloc
s2.iloc[[2,2,2,2,2]]
s2.iloc[[3,2,1,0]]

s2.loc['a':'c']
s2.iloc[0:3]  # an explicit (label) slice includes the end label; an implicit (position) slice excludes the end position
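
A minimal sketch of the inclusive versus exclusive end point, assuming the same `s2` as above:

```python
import pandas as pd

s2 = pd.Series([1, 2, 3, 4, 5, 6], index=list('abcdef'))

by_label = s2.loc['a':'c']    # labels a, b, c: the end label is included
by_position = s2.iloc[0:3]    # positions 0, 1, 2: position 3 is excluded
shorter = s2.iloc[0:2]        # positions 0, 1 only

print(by_label.equals(by_position))  # True
print(list(shorter.index))           # ['a', 'b']
```

Here 'c' sits at position 2, so `.loc['a':'c']` and `.iloc[0:3]` pick the same three elements.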

A Series can be thought of as a fixed-length, ordered dict.
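
The dict analogy can be made concrete with a few dict-style operations. This is only a sketch; the missing label 'z' and the default value -1 are arbitrary choices for illustration.

```python
import pandas as pd

s2 = pd.Series([1, 2, 3, 4, 5, 6], index=list('abcdef'))

has_a = 'a' in s2              # membership tests the labels, like dict keys
value_b = s2['b']              # look up a value by its label
fallback = s2.get('z', -1)     # get() returns the default for a missing label
pairs = list(s2.items())[:2]   # (label, value) pairs in index order

print(has_a, value_b, fallback)  # True 2 -1
print(pairs)
```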

Attributes such as shape, size, index, and values describe a Series.

s2.head()  # with no argument, shows the first 5 elements
s2.tail()  # shows the last elements (also 5 by default)
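
The attributes and the head/tail helpers can be tried on the six-element `s2` from earlier, as in this sketch:

```python
import pandas as pd

s2 = pd.Series([1, 2, 3, 4, 5, 6], index=list('abcdef'))

print(s2.shape)  # (6,)
print(s2.size)   # 6

first_five = s2.head()   # default n=5: labels a..e
last_two = s2.tail(2)    # an explicit count: labels e, f

print(first_five.tolist())   # [1, 2, 3, 4, 5]
print(list(last_two.index))  # ['e', 'f']
```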

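If a value in a numeric Series is None, it is converted to NaN. Reductions such as sum() skip NaN by default, so it effectively counts as 0 there, whereas the same reduction on an ndarray returns nan. Element-wise arithmetic still propagates NaN.
Use pd.isnull() and pd.notnull(), or the Series methods isnull() and notnull(), to detect values that are None or NaN.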
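
The sketch below illustrates the missing-value behavior on a small numeric Series. The sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, None, 3])   # None is stored as NaN, so the dtype becomes float64

total = s.sum()               # NaN is skipped by default (skipna=True)
shifted = s + 1               # element-wise arithmetic keeps NaN as NaN
arr_total = np.array([1, np.nan, 3]).sum()  # a plain ndarray sum propagates nan

mask = s.isnull()             # same result as pd.isnull(s)

print(total)          # 4.0
print(arr_total)      # nan
print(mask.tolist())  # [False, True, False]
```

`np.nansum` is the NumPy function that skips NaN in the same way as `Series.sum()`.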

A Series also has a name attribute, which can be used to tell different Series apart.
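
A short sketch of the name attribute follows. The labels and the names 'full_marks' and 'max_score' are hypothetical, chosen only for illustration.

```python
import pandas as pd

# name can be set when the Series is created.
s = pd.Series([150, 300], index=['math', 'comprehensive'], name='full_marks')
print(s.name)  # full_marks

# rename() with a scalar returns a copy carrying the new name.
renamed = s.rename('max_score')
print(renamed.name)  # max_score

# The name becomes the column label when the Series is turned into a DataFrame.
frame = s.to_frame()
print(list(frame.columns))  # ['full_marks']
```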

(1) Array operations that work on NumPy arrays also work on Series

s2
a    1
b    2
c    3
d    4
e    5
f    6
dtype: int64

#s2+2
s2*2
a     2
b     4
c     6
d     8
e    10
f    12
dtype: int64
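
Beyond scalar multiplication, other NumPy-style operations behave the same way and keep the index. A minimal sketch, assuming the same `s2`:

```python
import numpy as np
import pandas as pd

s2 = pd.Series([1, 2, 3, 4, 5, 6], index=list('abcdef'))

plus_two = s2 + 2     # the scalar is broadcast to every element
roots = np.sqrt(s2)   # a NumPy ufunc returns a Series with the same index
large = s2[s2 > 3]    # a boolean mask keeps the matching labels

print(plus_two.tolist())   # [3, 4, 5, 6, 7, 8]
print(list(roots.index))   # ['a', 'b', 'c', 'd', 'e', 'f']
print(large.tolist())      # [4, 5, 6]
```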

(2) Operations between Series