pandas: accessing values and operating on a DataFrame

A DataFrame stores its values as a two-dimensional table (a matrix), so accessing a value requires a row index, a column index, or both.
1. Simple DataFrame access methods

import pandas as pd

def createDataFrame():
    d = {
        'a':[1,2,3,4,5],
        'b':[6,2,3,6,0],
        'c':[4,2,3,6,7],
        'd':[5,3,2,4,5],
        'e':[6,7,4,5,8]
    }
    df = pd.DataFrame(d)
    # Print the DataFrame
    print(df)
    # Return it so the snippets below can use df
    return df

if __name__ == '__main__':
    df = createDataFrame()

(1) Output the first three rows:

print(df.head(3))

Print results:

   a  b  c  d  e
0  1  6  4  5  6
1  2  2  2  3  7
2  3  3  3  2  4

(2) Output the last two rows:

print(df.tail(2))

Print results:

   a  b  c  d  e
3  4  6  6  4  5
4  5  0  7  5  8

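(3) Output the third row:

# loc selects data by label
print(df.loc[2])   # here 2 is the row index label
# iloc selects data by position
print(df.iloc[2])  # here 2 is the row position (counting from 0)
# Output (the same for both calls, because the default labels match the positions)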
a    3
b    3
c    3
d    2
e    4
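With the default RangeIndex, loc and iloc return the same row, which hides the difference between them. The sketch below uses a hypothetical frame (df_demo, with made-up row labels 10, 20, 30 that are not part of the original example) to show that loc looks up the label while iloc looks up the position.

```python
import pandas as pd

# Hypothetical frame whose row labels differ from the row positions.
df_demo = pd.DataFrame(
    {'a': [1, 2, 3], 'b': [6, 2, 3]},
    index=[10, 20, 30],
)

by_label = df_demo.loc[20]      # row whose index label is 20
by_position = df_demo.iloc[1]   # second row, counting from 0

# Both selections return the same row of df_demo.
print(by_label.tolist(), by_position.tolist())

# A single cell: row first, then column.
cell_by_label = df_demo.loc[30, 'b']      # label 30, column 'b'
cell_by_position = df_demo.iloc[2, 1]     # third row, second column
print(cell_by_label, cell_by_position)
```

Here df_demo.loc[1] would raise a KeyError, because no row carries the label 1.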

(4) Output the second column:

print(df['b'])
# Output
0    6
1    2
2    3
3    6
4    0
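# A row or column taken from a DataFrame is a Series, which can be treated like a dictionary.
# To get a value from a Series, look it up by label:
x = df.iloc[2]
print(x['c'])  # x.iloc[2] gives the same value by position
# Output
3
# A Series can be converted directly to a list
x = list(df.iloc[2])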
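As a small sketch of the dictionary-like behaviour of a Series, the example below reads one value by label and by position, then converts the row to a plain list and a plain dict. The frame df_demo and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical three-column frame for illustration.
df_demo = pd.DataFrame({'a': [1, 2, 3], 'b': [6, 2, 0], 'c': [4, 2, 7]})

row = df_demo.iloc[2]              # a Series indexed by the column names

value_by_label = row['c']          # dictionary-style lookup by label
value_by_position = row.iloc[2]    # explicit positional lookup
as_list = row.tolist()             # plain Python list of the values
as_dict = row.to_dict()            # plain Python dict: label -> value

print(value_by_label, value_by_position)
print(as_list)
print(as_dict)
```

Looking values up by label (or with .iloc for positions) avoids relying on an integer key being interpreted as a position on a Series with string labels.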

(5) View the row labels (the index):

print(df.index)

Printout:

RangeIndex(start=0, stop=5, step=1)

(6) View the column names:

print(df.columns)

Printout:

Index(['a', 'b', 'c', 'd', 'e'], dtype='object')

(7) View all data values:

print(df.values)

Printout:

[[1 6 4 5 6]
 [2 2 2 3 7]
 [3 3 3 2 4]
 [4 6 6 4 5]
 [5 0 7 5 8]]

(8) Check the number of rows and columns:

print(df.iloc[:,0].size)  # number of rows
print(df.columns.size)    # number of columns

Printout:

5
5
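The row and column counts can also be read from the shape attribute or with len. The sketch below uses a hypothetical 4 x 3 frame (df_demo) so that the two numbers are easy to tell apart.

```python
import pandas as pd

# Hypothetical frame with 4 rows and 3 columns.
df_demo = pd.DataFrame({'a': [1, 2, 3, 4],
                        'b': [6, 2, 3, 6],
                        'c': [4, 2, 3, 6]})

n_rows, n_cols = df_demo.shape    # (rows, columns) tuple
print(n_rows, n_cols)             # 4 3
print(len(df_demo))               # 4, the number of rows
print(len(df_demo.columns))       # 3, the number of columns
print(df_demo.size)               # 12, the total number of cells
```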

2. More complex DataFrame value operations (filtering a DataFrame)
First, create an example DataFrame:

import numpy as np
import pandas as pd


def GroupbyDemo():
    df = pd.DataFrame({'key1': [1, 2, 1, 2, 1],
                       'key2': [10, 20, 30, 40, 50],
                       'data1': np.random.randn(5),
                       'data2': np.random.randn(5)})
    print(df)
    # Return the DataFrame so the filtering snippets below can use df
    return df

if __name__ == '__main__':
    df = GroupbyDemo()

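Print results (data1 and data2 are random, so their values differ on every run, including between the outputs shown below):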

   key1  key2     data1     data2
0     1    10  0.510140 -0.272037
1     2    20  1.303937 -0.296393
2     1    30  0.984371  0.005988
3     2    40 -1.257891 -1.089489
4     1    50  0.129426 -1.011806
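If the same random numbers are wanted on every run, one option is a seeded NumPy generator. This is a sketch, not part of the original example: the helper name make_demo_frame and the seed value 0 are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

def make_demo_frame(seed=0):
    """Build the example frame with reproducible random columns."""
    rng = np.random.default_rng(seed)   # seeded generator
    return pd.DataFrame({'key1': [1, 2, 1, 2, 1],
                         'key2': [10, 20, 30, 40, 50],
                         'data1': rng.standard_normal(5),
                         'data2': rng.standard_normal(5)})

first = make_demo_frame()
second = make_demo_frame()

# The same seed gives the same values in both frames.
print(first.equals(second))
```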

(1) Select the rows where the first column (key1) is greater than 1:

print(df[df.key1>1])

Print results:

   key1  key2     data1     data2
1     2    20  1.006815 -1.191766
3     2    40  0.392499 -0.906492

(2) Select the rows where the first column (key1) is greater than 1 and the second column (key2) is greater than 30:

print(df[(df.key1>1) & (df.key2>30)])

Print results:

   key1  key2     data1     data2
3     2    40  0.681879  0.206709

(3) Select the rows where the first column (key1) is greater than 1 or the second column (key2) is greater than 30:

print(df[(df.key1>1) | (df.key2>30)])

Print results:

   key1  key2     data1     data2
1     2    20 -2.454197  1.091813
3     2    40  0.481552  0.763660
4     1    50  1.639578  0.740787
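The filters above work because a comparison such as df.key1>1 produces a boolean Series, and indexing a DataFrame with that Series keeps only the rows marked True. The sketch below makes the masks explicit on a hypothetical frame (df_demo) that keeps only the two key columns, so the result does not depend on random data. The parentheses around each comparison in the combined form are required because & and | bind more tightly than >.

```python
import pandas as pd

# Hypothetical frame with the same key columns as the example above.
df_demo = pd.DataFrame({'key1': [1, 2, 1, 2, 1],
                        'key2': [10, 20, 30, 40, 50]})

mask_key1 = df_demo.key1 > 1     # boolean Series, one entry per row
mask_key2 = df_demo.key2 > 30

both = df_demo[mask_key1 & mask_key2]          # AND: both conditions hold
either = df_demo[mask_key1 | mask_key2]        # OR: at least one holds
neither = df_demo[~(mask_key1 | mask_key2)]    # NOT: no condition holds

print(mask_key1.tolist())        # [False, True, False, True, False]
print(both.index.tolist())       # [3]
print(either.index.tolist())     # [1, 3, 4]
print(neither.index.tolist())    # [0, 2]
```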

3. DataFrame functions
(1) Transpose (using the DataFrame from section 1)

print(df.T)

Output:

   0  1  2  3  4
a  1  2  3  4  5
b  6  2  3  6  0
c  4  2  3  6  7
d  5  3  2  4  5
e  6  7  4  5  8
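Transposing swaps the row labels and the column names, so a value stored at (row, column) moves to (column, row). A minimal sketch on a hypothetical 3 x 2 frame (df_demo):

```python
import pandas as pd

# Hypothetical frame with 3 rows and 2 columns.
df_demo = pd.DataFrame({'a': [1, 2, 3], 'b': [6, 2, 3]})
transposed = df_demo.T

print(df_demo.shape, transposed.shape)   # (3, 2) (2, 3)
print(transposed.index.tolist())         # ['a', 'b']
print(transposed.columns.tolist())       # [0, 1, 2]

# The cell at row 0, column 'b' is now at row 'b', column 0.
print(df_demo.loc[0, 'b'], transposed.loc['b', 0])   # 6 6
```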

Origin blog.csdn.net/weixin_34223655/article/details/90862820