A DataFrame holds its values in matrix form, so accessing a value requires the row index, the column index, or both.
1. Simple ways to access DataFrame values
import pandas as pd

def createDataFrame():
    d = {
        'a': [1, 2, 3, 4, 5],
        'b': [6, 2, 3, 6, 0],
        'c': [4, 2, 3, 6, 7],
        'd': [5, 3, 2, 4, 5],
        'e': [6, 7, 4, 5, 8]
    }
    df = pd.DataFrame(d)
    # Print the DataFrame
    print(df)
    # Return it so the snippets below can use df
    return df

if __name__ == '__main__':
    df = createDataFrame()
(1) Output the first three rows:
print(df.head(3))
Print results:
   a  b  c  d  e
0  1  6  4  5  6
1  2  2  2  3  7
2  3  3  3  2  4
(2) Output the last two rows:
print(df.tail(2))
Print results:
   a  b  c  d  e
3  4  6  6  4  5
4  5  0  7  5  8
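Both head and tail take the number of rows as an optional argument. As a small sketch, assuming a hypothetical ten-row DataFrame named df_long that is not part of the original example, calling them without an argument returns five rows:

```python
import pandas as pd

# Hypothetical ten-row frame, used only to show the default row count
df_long = pd.DataFrame({'a': list(range(10))})

first_default = df_long.head()   # no argument: the first 5 rows
last_two = df_long.tail(2)       # the last 2 rows

print(len(first_default))        # 5
print(last_two['a'].tolist())    # [8, 9]
```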
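(3) Output the third row:
# loc selects data by label
print(df.loc[2])  # here 2 is the row index label
# iloc selects data by position
print(df.iloc[2])  # here 2 is the row position
# Print results (the same row for both calls)
a    3
b    3
c    3
d    2
e    4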
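With the default index, label 2 and position 2 refer to the same row, which is why loc and iloc print the same result here. The two differ as soon as the row labels are not 0, 1, 2, and so on. A minimal sketch, assuming a hypothetical DataFrame df_labeled with row labels 10, 20, 30 (not part of the original example):

```python
import pandas as pd

# Hypothetical frame whose row labels differ from the row positions
df_labeled = pd.DataFrame({'a': [1, 2, 3], 'b': [6, 2, 3]},
                          index=[10, 20, 30])

by_label = df_labeled.loc[20]      # row whose label is 20 (the second row)
by_position = df_labeled.iloc[2]   # row at position 2 (the third row, label 30)

print(by_label['a'])      # 2
print(by_position['a'])   # 3
```

Here df_labeled.loc[2] would raise a KeyError, because no row carries the label 2.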
(4) Output the second column:
print(df['b'])
# Print results
0    6
1    2
2    3
3    6
4    0
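Indexing with a single column name returns a Series, while indexing with a list of names returns a DataFrame. A short sketch of the difference, using a three-column copy of the example data (the variable names col and sub are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5],
                   'b': [6, 2, 3, 6, 0],
                   'c': [4, 2, 3, 6, 7]})

col = df['b']           # single label: a Series
sub = df[['a', 'b']]    # list of labels: a DataFrame with those columns

print(type(col).__name__)   # Series
print(type(sub).__name__)   # DataFrame
print(sub.shape)            # (5, 2)
```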
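# A row or column taken from a DataFrame is a Series, which can be treated like a dictionary. To read a value from a Series:
x = df.iloc[2]
print(x.iloc[2])  # third value by position; x['c'] reads the same value by label
# Print results
3
# A Series can be converted directly to a list
x = list(df.iloc[2])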
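The dictionary analogy can be made concrete: the column names act as keys and the row's entries as values. A minimal sketch on the example data (the names row, value, as_dict and as_list are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5],
                   'b': [6, 2, 3, 6, 0],
                   'c': [4, 2, 3, 6, 7],
                   'd': [5, 3, 2, 4, 5],
                   'e': [6, 7, 4, 5, 8]})

row = df.iloc[2]           # third row as a Series, indexed by column name
value = row['c']           # dictionary-style lookup by label
as_dict = row.to_dict()    # column name -> value
as_list = row.tolist()     # values only, in column order

print(value)     # 3
print(as_dict)   # {'a': 3, 'b': 3, 'c': 3, 'd': 2, 'e': 4}
print(as_list)   # [3, 3, 3, 2, 4]
```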
(5) View the row index (row labels):
print(df.index)
Printout:
RangeIndex(start=0, stop=5, step=1)
(6) View the column names:
print(df.columns)
Printout:
Index(['a', 'b', 'c', 'd', 'e'], dtype='object')
(7) View all data values:
print(df.values)
Printout:
[[1 6 4 5 6]
 [2 2 2 3 7]
 [3 3 3 2 4]
 [4 6 6 4 5]
 [5 0 7 5 8]]
(8) Check the number of rows and columns:
print(df.iloc[:, 0].size)  # number of rows
print(df.columns.size)  # number of columns
Printout:
5
5
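The row and column counts can also be read in one step from the shape attribute, which is a (rows, columns) tuple. A short sketch on the same example data:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5],
                   'b': [6, 2, 3, 6, 0],
                   'c': [4, 2, 3, 6, 7],
                   'd': [5, 3, 2, 4, 5],
                   'e': [6, 7, 4, 5, 8]})

n_rows, n_cols = df.shape   # (number of rows, number of columns)

print(n_rows, n_cols)   # 5 5
print(len(df))          # 5, the number of rows
```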
2. More complex DataFrame value operations (filtering a DataFrame)
First, create an example DataFrame:
import numpy as np
import pandas as pd

def GroupbyDemo():
    df = pd.DataFrame({'key1': [1, 2, 1, 2, 1],
                       'key2': [10, 20, 30, 40, 50],
                       'data1': np.random.randn(5),
                       'data2': np.random.randn(5)})
    print(df)
    # Return it so the filters below can use df
    return df

if __name__ == '__main__':
    df = GroupbyDemo()
Print results (data1 and data2 are random, so their values differ on every run):
   key1  key2     data1     data2
0     1    10  0.510140 -0.272037
1     2    20  1.303937 -0.296393
2     1    30  0.984371  0.005988
3     2    40 -1.257891 -1.089489
4     1    50  0.129426 -1.011806
(1) Select the rows where the first column (key1) is greater than 1:
print(df[df.key1>1])
Print results:
   key1  key2     data1     data2
1     2    20  1.006815 -1.191766
3     2    40  0.392499 -0.906492
(2) Select the rows where the first column (key1) is greater than 1 and the second column (key2) is greater than 30:
print(df[(df.key1>1) & (df.key2>30)])
Print results:
   key1  key2     data1     data2
3     2    40  0.681879  0.206709
(3) Select the rows where the first column (key1) is greater than 1 or the second column (key2) is greater than 30:
print(df[(df.key1>1) | (df.key2>30)])
Print results:
   key1  key2     data1     data2
1     2    20 -2.454197  1.091813
3     2    40  0.481552  0.763660
4     1    50  1.639578  0.740787
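Each comparison in these filters produces a boolean Series (a mask), and & and | combine the masks element by element. The parentheses are required because & and | bind more tightly than the comparison operators. A reproducible sketch, assuming only the two key columns and leaving out the random data columns (the names mask_and and mask_or are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'key1': [1, 2, 1, 2, 1],
                   'key2': [10, 20, 30, 40, 50]})

mask_and = (df.key1 > 1) & (df.key2 > 30)   # both conditions hold
mask_or = (df.key1 > 1) | (df.key2 > 30)    # at least one condition holds

print(mask_and.tolist())             # [False, False, False, True, False]
print(df[mask_or].index.tolist())    # [1, 3, 4]

# loc accepts the same mask together with a column selection
print(df.loc[mask_and, 'key2'].tolist())   # [40]
```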
3. DataFrame function operations
(1) Transpose (using the DataFrame from section 1):
print(df.T)
Output:
   0  1  2  3  4
a  1  2  3  4  5
b  6  2  3  6  0
c  4  2  3  6  7
d  5  3  2  4  5
e  6  7  4  5  8
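Transposing swaps the row index and the column names, so the old column names become the row labels. A minimal sketch, assuming a smaller hypothetical DataFrame df_small with two columns and three rows so the change in shape is visible:

```python
import pandas as pd

# Hypothetical 3 x 2 frame, used only to make the shape change visible
df_small = pd.DataFrame({'a': [1, 2, 3], 'b': [6, 2, 3]})

t = df_small.T

print(df_small.shape)        # (3, 2)
print(t.shape)               # (2, 3)
print(t.index.tolist())      # ['a', 'b']
print(t.columns.tolist())    # [0, 1, 2]
```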