Development tool: Jupyter Notebook (Python 3)
Series index operation
import numpy as np
import pandas as pd
Create a Series object holding 6 floating-point numbers generated by np.random.randn, with index labels A, B, C, D, E, F
ser1 = pd.Series(np.random.randn(6),index=['A','B','C','D','E','F'])
ser1
A -1.917343
B -0.557361
C 0.508091
D 1.489047
E -0.596619
F -0.457463
dtype: float64
1. Use the position index to get the value at index C
ser1.iloc[2]
0.5080909750172036
2. Use the name (label) index to get the value at index C
ser1.loc['C']
0.5080909750172036
3. Use the position index to access the values at indexes C to E (the stop position 5 is excluded)
ser1.iloc[2:5]
C 0.508091
D 1.489047
E -0.596619
dtype: float64
4. Use the name index to access the values at indexes C to E (a label slice includes the end label)
ser1.loc['C':'E']
C 0.508091
D 1.489047
E -0.596619
dtype: float64
5. Use the position index to access the values at indexes A, B, E, F
ser1.iloc[[0,1,4,5]]
A -1.917343
B -0.557361
E -0.596619
F -0.457463
dtype: float64
6. Use the name index to access the values at indexes A, B, E, F
ser1.loc[['A','B','E','F']]
A -1.917343
B -0.557361
E -0.596619
F -0.457463
dtype: float64
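One detail worth checking in the Series examples above is that the position slice iloc[2:5] stops before position 5, while the label slice loc['C':'E'] includes the end label E. A minimal sketch, using fixed values chosen here for illustration instead of the random values above (the names ser, by_position and by_label are not from the original):

```python
import pandas as pd

# Fixed illustrative values (an assumption; the original uses random data)
ser = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
                index=['A', 'B', 'C', 'D', 'E', 'F'])

# Position slice: half-open, selects positions 2, 3, 4 (stop 5 is excluded)
by_position = ser.iloc[2:5]

# Label slice: closed, both 'C' and 'E' are included
by_label = ser.loc['C':'E']

print(by_position.equals(by_label))  # True
print(list(by_label.index))          # ['C', 'D', 'E']
```

Both slices return the same three rows, which is why steps 3 and 4 print identical output.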
DataFrame index operation
Create a DataFrame object of random integers (0 to 8, since the upper bound of np.random.randint is excluded) whose row index is A, B, C, D, E, F and whose column index is a, b, c, d, e, f
df1 = pd.DataFrame(np.random.randint(0,9,(6,6)),index=['A','B','C','D','E','F'],columns=['a','b','c','d','e','f'])
df1
   a  b  c  d  e  f
A  3  8  3  4  3  5
B  0  7  2  4  4  1
C  0  7  3  4  0  8
D  0  8  6  2  4  6
E  5  4  8  6  7  4
F  3  5  1  8  0  1
1. Find the data whose column index is f
df1.loc[:,'f']
A 5
B 1
C 8
D 6
E 4
F 1
Name: f, dtype: int32
2. Use the name index to find the data whose column index is a
df1.loc[:,'a']
A 3
B 0
C 0
D 0
E 5
F 3
Name: a, dtype: int32
3. Use the position index to find the data whose column index is b
df1.iloc[:,1]
A 8
B 7
C 7
D 8
E 4
F 5
Name: b, dtype: int32
4. Find the data whose row index is A
df1.loc['A']
a 3
b 8
c 3
d 4
e 3
f 5
Name: A, dtype: int32
5. Use the position index to find the data whose row index is F
df1.iloc[5]
a 3
b 5
c 1
d 8
e 0
f 1
Name: F, dtype: int32
6. Use the position index to find the data whose row index is B to E and column index is a to e
df1.iloc[1:5,0:5]
   a  b  c  d  e
B  0  7  2  4  4
C  0  7  3  4  0
D  0  8  6  2  4
E  5  4  8  6  7
7. Use the name index to find the data whose row index is B, D and column index is c, f
df1.loc[['B','D'],['c','f']]
   c  f
B  2  1
D  6  6
8. Use the name index to find the data whose row index is C to F and column index is b to f
df1.loc['C':'F','b':'f']
   b  c  d  e  f
C  7  3  4  0  8
D  8  6  2  4  6
E  4  8  6  7  4
F  5  1  8  0  1
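The label-based and position-based selections above can be cross-checked against each other. A minimal sketch, rebuilding the frame from the df1 values printed above so the results are reproducible (the name df and the equality checks are illustrative additions, not part of the original notebook):

```python
import pandas as pd

# Values copied from the df1 output shown above
df = pd.DataFrame(
    [[3, 8, 3, 4, 3, 5],
     [0, 7, 2, 4, 4, 1],
     [0, 7, 3, 4, 0, 8],
     [0, 8, 6, 2, 4, 6],
     [5, 4, 8, 6, 7, 4],
     [3, 5, 1, 8, 0, 1]],
    index=list('ABCDEF'), columns=list('abcdef'))

# Label lists pick non-adjacent rows and columns
picked = df.loc[['B', 'D'], ['c', 'f']]
same_by_position = df.iloc[[1, 3], [2, 5]]
print(picked.equals(same_by_position))  # True

# Label slices include both endpoints, so 'C':'F' covers rows C, D, E, F
block = df.loc['C':'F', 'b':'f']
print(block.shape)                      # (4, 5)
print(block.equals(df.iloc[2:6, 1:6]))  # True
```

Note that the equivalent position slice needs stop values one past the last row and column (6 in both cases), because iloc excludes the stop position.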
Boolean index operation
1. Create a Series object of random integers in the range 0-10 and find the values greater than 1
ser2 = pd.Series(np.random.randint(0,11,10))
ser2
0 3
1 4
2 10
3 3
4 7
5 3
6 7
7 6
8 8
9 0
dtype: int32
ser3 = ser2 > 1
ser2[ser3]
0 3
1 4
2 10
3 3
4 7
5 3
6 7
7 6
8 8
dtype: int32
2. Create a DataFrame object of random integers in the range 0-10 and find the values greater than 3
df2 = pd.DataFrame(np.random.randint(0,11,(3,4)))
df2
   0  1   2   3
0  0  6  10  10
1  0  0   9   4
2  5  0   6   1
df3 = df2>3
df2[df3]
     0    1   2     3
0  NaN  6.0  10  10.0
1  NaN  NaN   9   4.0
2  5.0  NaN   6   NaN
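The two Boolean examples behave differently: masking a Series drops the rows where the mask is False, while masking a DataFrame keeps its shape and fills the non-matching cells with NaN. A minimal sketch, rebuilding the data from the ser2 and df2 values printed above (the names ser, df, kept and masked are illustrative; the comparison with DataFrame.where is an added check, not part of the original notebook):

```python
import pandas as pd

# Values copied from the ser2 and df2 outputs shown above
ser = pd.Series([3, 4, 10, 3, 7, 3, 7, 6, 8, 0])
df = pd.DataFrame([[0, 6, 10, 10],
                   [0, 0, 9, 4],
                   [5, 0, 6, 1]])

# Series: rows where the mask is False are dropped
kept = ser[ser > 1]
print(len(ser), len(kept))  # 10 9

# DataFrame: shape is preserved; cells where the mask is False become NaN
masked = df[df > 3]
print(masked.shape)                    # (3, 4)
print(int(masked.isna().sum().sum()))  # 5

# Indexing with a Boolean DataFrame gives the same result as DataFrame.where
print(masked.equals(df.where(df > 3)))  # True
```

Because NaN is a floating-point value, the columns that received a NaN are shown as floats (6.0, 10.0) in the output above, while column 2, which has no NaN, is still shown as integers.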