pandas(2)

1. Data extraction
frame =
     id name  score
0   100   m1     66
1   102   m2     78
2   108   m3     90

-[Select one row / several rows, all columns]: frame.iloc[0] / frame.iloc[0:2]
-[Select given rows and given columns]: frame.iloc[0:2, 0:-1]
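
The positional selections above can be tried on the sample table. This is a minimal sketch: the DataFrame is rebuilt by hand from the values shown, and the variable names (first_row, first_two, subset) are only illustrative.

```python
import pandas as pd

# Sample table from the notes, rebuilt by hand.
frame = pd.DataFrame({
    "id": [100, 102, 108],
    "name": ["m1", "m2", "m3"],
    "score": [66, 78, 90],
})

first_row = frame.iloc[0]        # a single row, returned as a Series
first_two = frame.iloc[0:2]      # rows at positions 0 and 1, as a DataFrame
subset = frame.iloc[0:2, 0:-1]   # same rows, every column except the last

print(first_row)
print(first_two)
print(subset)
```

Note that iloc slices by position and excludes the end position, so 0:2 yields two rows and 0:-1 drops the last column.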

2. Deleting data
-[Delete rows]: frame.drop([idx1, idx2, ...], axis=0), where the list holds row index labels
-[Delete columns]: frame.drop([attrName1, attrName2, ...], axis=1), where the list holds column names
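
A short sketch of both drop calls, assuming the same three-row sample table as in section 1. The result names (without_rows, without_cols) are made up for illustration.

```python
import pandas as pd

frame = pd.DataFrame({
    "id": [100, 102, 108],
    "name": ["m1", "m2", "m3"],
    "score": [66, 78, 90],
})

# Drop the rows whose index labels are 0 and 2.
without_rows = frame.drop([0, 2], axis=0)

# Drop the columns called "name" and "score".
without_cols = frame.drop(["name", "score"], axis=1)

print(without_rows)
print(without_cols)
print(frame.shape)  # the original frame is left unchanged
```

drop returns a new DataFrame by default, so the result has to be assigned if the change should be kept.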

3. Selecting rows that match a condition
-frame.loc[frame[attrName] == value]
 frame.loc[frame[attrName] == value, [attrName1, attrName2, ...]] (restrict the result to the listed columns)
-frame.loc[frame[attrName].isin([value1, value2, ...])]
 frame.loc[frame[attrName].isin([value1, value2, ...]), [attrName1, attrName2, ...]] (restrict the result to the listed columns)
-frame.loc[frame[attrName] != value, [attrName1, attrName2, ...]].sort_values(by=attrName) (the by column must be one of the selected columns)
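
The filters above might look like this on the sample table. This is only a sketch under the assumption that the table from section 1 is used; the result names are hypothetical.

```python
import pandas as pd

frame = pd.DataFrame({
    "id": [100, 102, 108],
    "name": ["m1", "m2", "m3"],
    "score": [66, 78, 90],
})

# Rows where name equals "m2", all columns.
equal = frame.loc[frame["name"] == "m2"]

# Same condition, but keep only the id and score columns.
equal_cols = frame.loc[frame["name"] == "m2", ["id", "score"]]

# Rows whose id is one of the listed values.
members = frame.loc[frame["id"].isin([100, 108])]

# Rows where name differs from "m1", two columns, sorted by score descending.
not_equal_sorted = frame.loc[
    frame["name"] != "m1", ["id", "score"]
].sort_values(by="score", ascending=False)

print(equal)
print(equal_cols)
print(members)
print(not_equal_sorted)
```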

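4. Aggregating data
-[Group by one column and count non-null values in every other column]: frame.groupby(attrName).count()
-[Group by one column and count non-null values in one specific column]: frame.groupby(attrName1)[attrName2].count()
-[Group by one column and compute statistics on a specific column]: frame.groupby(attrName1)[attrName2].agg(["size", "mean"]) (group size and mean; the string names need no numpy import)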
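
The sample table has no column with repeated values, so the sketch below adds a hypothetical "team" column to group on. That column and the result names are assumptions made only for this example.

```python
import pandas as pd

frame = pd.DataFrame({
    "team": ["a", "a", "b"],      # hypothetical grouping column
    "name": ["m1", "m2", "m3"],
    "score": [66, 78, 90],
})

# Non-null count of every other column, per team.
counts_all = frame.groupby("team").count()

# Non-null count of the score column only, per team.
counts_score = frame.groupby("team")["score"].count()

# Group size and mean score, per team.
summary = frame.groupby("team")["score"].agg(["size", "mean"])

print(counts_all)
print(counts_score)
print(summary)
```

count skips missing values while "size" counts every row in the group, so the two can differ when a column contains NaN.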

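5. Data statistics
-[Randomly draw a given number of rows (sampling with replacement)]: frame.sample(n=number, replace=True)
-[Draw a given number of rows with given per-row weights (sampling with replacement)]: frame.sample(n=number, weights=[...], replace=True)
-[Randomly draw a given number of rows (sampling without replacement)]: frame.sample(n=number, replace=False) (replace=False is the default, so frame.sample(n=number) behaves the same way)
-[Descriptive statistics for the table]: frame.describe().round(3)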
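
A small sketch of the sampling variants and describe, assuming the sample table from section 1. The random_state argument is added here only to make the draws repeatable, and the weights are chosen so that the weighted draw is deterministic.

```python
import pandas as pd

frame = pd.DataFrame({
    "id": [100, 102, 108],
    "name": ["m1", "m2", "m3"],
    "score": [66, 78, 90],
})

# Without replacement (the default): each row can appear at most once.
without_repl = frame.sample(n=2, random_state=0)

# With replacement: n may exceed the number of rows, and rows can repeat.
with_repl = frame.sample(n=5, replace=True, random_state=0)

# Weighted draw: all weight on the last row, so it is always chosen.
weighted = frame.sample(n=1, weights=[0.0, 0.0, 1.0], random_state=0)

# Descriptive statistics for the numeric columns, rounded to 3 decimals.
stats = frame.describe().round(3)

print(without_repl)
print(with_repl)
print(weighted)
print(stats)
```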

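6. Other statistics
-[Standard deviation of one column]: frame[attrName].std()
-[Covariance matrix]: frame.cov() (in recent pandas versions, pass numeric_only=True if the frame has non-numeric columns)
-[Correlation matrix]: frame.corr() (same note about numeric_only=True)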
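
The three statistics can be checked by hand on the sample table. In this sketch the numeric columns are selected explicitly before calling cov and corr, which sidesteps the non-numeric "name" column; that selection is a choice made for the example, not something the notes prescribe.

```python
import pandas as pd

frame = pd.DataFrame({
    "id": [100, 102, 108],
    "name": ["m1", "m2", "m3"],
    "score": [66, 78, 90],
})

# Keep only the numeric columns for the matrix statistics.
numeric = frame[["id", "score"]]

score_std = frame["score"].std()   # sample standard deviation (ddof=1)
cov = numeric.cov()                # pairwise sample covariance
corr = numeric.corr()              # pairwise Pearson correlation

print(score_std)
print(cov)
print(corr)
```

For the scores 66, 78, 90 the deviations from the mean are -12, 0, 12, so the sample variance is 288 / 2 = 144 and the standard deviation is 12.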

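Converting between timestamps and formatted time strings

(1) Timestamp to formatted time string

import time

# Current Unix timestamp in whole seconds.
nowtimestamp = int(time.time())
# Format it in the local time zone.
formattime = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(nowtimestamp))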

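(2) Formatted time string to timestamp

import time

# Named time_str rather than str, so the built-in str is not shadowed.
time_str = '2018-07-13 09:57:39'
# The string is interpreted as local time.
timestamp = int(time.mktime(time.strptime(time_str, "%Y-%m-%d %H:%M:%S")))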
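
The two conversions can be wrapped into a pair of helpers and checked with a round trip. This is a sketch only: the names FMT, to_text and to_timestamp are hypothetical, and both directions use the machine's local time zone, as in the snippets above.

```python
import time

FMT = "%Y-%m-%d %H:%M:%S"

def to_text(timestamp):
    """Format a Unix timestamp as a local-time string."""
    return time.strftime(FMT, time.localtime(timestamp))

def to_timestamp(text):
    """Parse a local-time string into a Unix timestamp in whole seconds."""
    return int(time.mktime(time.strptime(text, FMT)))

time_str = "2018-07-13 09:57:39"
ts = to_timestamp(time_str)
round_trip = to_text(ts)

print(ts)
print(round_trip)
```

The numeric timestamp depends on the local time zone of the machine running the code, so only the round-trip string is expected to match across machines.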
