Pandas indexing operations
The Index object
1. The index of a Series and of a DataFrame is an Index object
Sample code:
print(type(ser_obj.index))
print(type(df_obj2.index))
print(df_obj2.index)
operation result:
<class 'pandas.indexes.range.RangeIndex'>
<class 'pandas.indexes.numeric.Int64Index'>
Int64Index([0, 1, 2, 3], dtype='int64')
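The same check can be sketched on freshly built objects. The names `ser` and `df` below are illustrative and not from the original code, and the class names printed depend on the installed pandas version, so they may differ from the output above.

```python
import pandas as pd

# Illustrative objects; these names are assumptions for this sketch.
ser = pd.Series(range(4))
df = pd.DataFrame({"a": [1, 2, 3, 4]})

# Both .index attributes are instances of pd.Index or one of its subclasses.
print(type(ser.index))
print(type(df.index))
print(isinstance(ser.index, pd.Index), isinstance(df.index, pd.Index))
```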
2. Index objects are immutable, which keeps the data safe
Sample code:
# Index objects are immutable
df_obj2.index[0] = 2
operation result:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-23-7f40a356d7d1> in <module>()
1 # Index objects are immutable
----> 2 df_obj2.index[0] = 2
/Users/Power/anaconda/lib/python3.6/site-packages/pandas/indexes/base.py in __setitem__(self, key, value)
1402
1403 def __setitem__(self, key, value):
-> 1404 raise TypeError("Index does not support mutable operations")
1405
1406 def __getitem__(self, key):
TypeError: Index does not support mutable operations
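A minimal sketch of the same behaviour, with the error caught instead of printed as a traceback. The names `idx` and `ser` are illustrative. It also shows one way to change labels, which is to replace the whole index rather than a single element.

```python
import pandas as pd

idx = pd.Index(["a", "b", "c"])  # illustrative index

try:
    idx[0] = "z"  # item assignment on an Index is not supported
except TypeError as exc:
    print("TypeError:", exc)

# To change labels, assign a complete new index to the Series.
ser = pd.Series([10, 20, 30], index=idx)
ser.index = ["z", "b", "c"]
print(ser.index.tolist())
print(idx.tolist())  # the original Index object is unchanged
```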
Common Index types
- Index, the generic index
- Int64Index, integer index
- MultiIndex, hierarchical (multi-level) index
- DatetimeIndex, timestamp index
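As a rough illustration, three of these types can be built directly. The variable names are hypothetical, and the integer index is left out of the sketch because its class name differs between pandas versions.

```python
import pandas as pd

# Generic Index holding string labels.
labels = pd.Index(["a", "b", "c"])

# MultiIndex: two levels built from (outer, inner) tuples.
multi = pd.MultiIndex.from_tuples([("x", 1), ("x", 2), ("y", 1)])

# DatetimeIndex: three consecutive days.
dates = pd.date_range("2020-01-01", periods=3, freq="D")

print(type(labels).__name__, type(multi).__name__, type(dates).__name__)
print(multi.nlevels)
```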
Series indexing
1. Use index to specify the row labels
Sample code:
import pandas as pd

ser_obj = pd.Series(range(5), index = ['a', 'b', 'c', 'd', 'e'])
print(ser_obj.head())
operation result:
a 0
b 1
c 2
d 3
e 4
dtype: int64
2. Row indexing
ser_obj['label'], ser_obj[pos]
Sample code:
# Row indexing
print(ser_obj['b'])
print(ser_obj[2])
operation result:
1
2
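With plain `[]`, pandas has to decide whether an integer is a label or a position. A hedged sketch of the explicit alternative uses the `loc` and `iloc` accessors covered later in this article; the name `ser` is illustrative.

```python
import pandas as pd

ser = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

by_label = ser.loc["b"]     # select by label
by_position = ser.iloc[2]   # select by integer position

print(by_label)
print(by_position)
```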
3. Slice indexing
ser_obj[2:4], ser_obj['label1':'label3']
Note that when slicing by index label, the end label is included in the result.
Sample code:
# Slice indexing
print(ser_obj[1:3])
print(ser_obj['b':'d'])
operation result:
b 1
c 2
dtype: int64
b 1
c 2
d 3
dtype: int64
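The difference between the two slice styles can be made explicit by comparing which labels come back. This is a small sketch with illustrative variable names.

```python
import pandas as pd

ser = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

by_position = ser.iloc[1:3]   # positions 1 and 2; the stop position 3 is excluded
by_label = ser.loc["b":"d"]   # labels b, c and d; the stop label d is included

print(by_position.index.tolist())
print(by_label.index.tolist())
```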
4. Non-contiguous indexing
ser_obj[['label1', 'label2', 'label3']]
Sample code:
# Non-contiguous indexing
print(ser_obj[[0, 2, 4]])
print(ser_obj[['a', 'e']])
operation result:
a 0
c 2
e 4
dtype: int64
a 0
e 4
dtype: int64
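A list of labels is returned in the order requested, and a label may appear more than once. A small sketch, assuming the same Series rebuilt under the illustrative name `ser`:

```python
import pandas as pd

ser = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

# The result follows the order of the list, including the repeated label.
picked = ser.loc[["e", "a", "e"]]

print(picked.index.tolist())
print(picked.tolist())
```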
5. Boolean indexing
Sample code:
# Boolean indexing
ser_bool = ser_obj > 2
print(ser_bool)
print(ser_obj[ser_bool])
print(ser_obj[ser_obj > 2])
operation result:
a False
b False
c False
d True
e True
dtype: bool
d 3
e 4
dtype: int64
d 3
e 4
dtype: int64
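Boolean masks can also be combined. The sketch below, with illustrative names, uses `&`, `|` and `~`; each comparison is wrapped in parentheses because these operators bind more tightly than `>` and `<`.

```python
import pandas as pd

ser = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

middle = ser[(ser > 0) & (ser < 4)]    # both conditions must hold
edges = ser[(ser == 0) | (ser == 4)]   # either condition may hold
inverted = ser[~(ser > 2)]             # negate a mask

print(middle.tolist())
print(edges.tolist())
print(inverted.tolist())
```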
DataFrame indexing
1. Use columns to specify the column labels
Sample code:
import numpy as np
df_obj = pd.DataFrame(np.random.randn(5,4), columns = ['a', 'b', 'c', 'd'])
print(df_obj.head())
operation result:
a b c d
0 -0.241678 0.621589 0.843546 -0.383105
1 -0.526918 -0.485325 1.124420 -0.653144
2 -1.074163 0.939324 -0.309822 -0.209149
3 -0.716816 1.844654 -2.123637 -1.323484
4 0.368212 -0.910324 0.064703 0.486016
2. Column indexing
df_obj['label'], df_obj[['label']]
Sample code:
# Column indexing
print(df_obj['a']) # returns a Series
print(df_obj[['a']]) # returns a DataFrame
print(type(df_obj[['a']])) # returns a DataFrame
operation result:
0 -0.241678
1 -0.526918
2 -1.074163
3 -0.716816
4 0.368212
Name: a, dtype: float64
<class 'pandas.core.frame.DataFrame'>
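To make the Series versus DataFrame distinction concrete, here is a sketch on a small deterministic frame. The name `df` and its integer contents are assumptions chosen so the shapes are easy to check.

```python
import numpy as np
import pandas as pd

# Illustrative 5x4 frame with the same column labels as above.
df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=["a", "b", "c", "d"])

col_series = df["a"]    # a single label returns a Series
col_frame = df[["a"]]   # a list with one label returns a one-column DataFrame

print(type(col_series).__name__, col_series.shape)
print(type(col_frame).__name__, col_frame.shape)
```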
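3. Non-contiguous indexing
df_obj[['label1', 'label2']]
Sample code:
# Non-contiguous indexing
print(df_obj[['a','c']])
# Select columns by position; df_obj[[1, 3]] raises KeyError when the column labels are strings
print(df_obj.iloc[:, [1, 3]])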
operation result:
a c
0 -0.241678 0.843546
1 -0.526918 1.124420
2 -1.074163 -0.309822
3 -0.716816 -2.123637
4 0.368212 0.064703
b d
0 0.621589 -0.383105
1 -0.485325 -0.653144
2 0.939324 -0.209149
3 1.844654 -1.323484
4 -0.910324 0.486016
Advanced indexing: label, position, and mixed
Pandas has three kinds of advanced indexing
1. loc: label-based indexing
Plain [] on a DataFrame cannot slice rows and columns at the same time; loc can.
loc indexes by label, that is, by the index names we define ourselves.
Sample code:
# Label indexing: loc
# Series
print(ser_obj['b':'d'])
print(ser_obj.loc['b':'d'])
# DataFrame
print(df_obj['a'])
# The first argument selects rows, the second selects columns
print(df_obj.loc[0:2, 'a'])
operation result:
b 1
c 2
d 3
dtype: int64
b 1
c 2
d 3
dtype: int64
0 -0.241678
1 -0.526918
2 -1.074163
3 -0.716816
4 0.368212
Name: a, dtype: float64
0 -0.241678
1 -0.526918
2 -1.074163
Name: a, dtype: float64
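loc also accepts a slice or a Boolean mask on the row axis together with a slice or list on the column axis. A minimal sketch on an illustrative integer frame (the name `df` and its values are assumptions):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=["a", "b", "c", "d"])

# Rows labelled 1 to 3 and columns b to c; both ends are included on both axes.
block = df.loc[1:3, "b":"c"]

# Rows where column a exceeds 8, restricted to columns a and d.
filtered = df.loc[df["a"] > 8, ["a", "d"]]

print(block)
print(filtered)
```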
2. iloc: position-based indexing
It works the same way as loc, but indexes by integer position.
Sample code:
# Integer position indexing: iloc
# Series
print(ser_obj[1:3])
print(ser_obj.iloc[1:3])
# DataFrame
print(df_obj.iloc[0:2, 0]) # note the difference from df_obj.loc[0:2, 'a']
operation result:
b 1
c 2
dtype: int64
b 1
c 2
dtype: int64
0 -0.241678
1 -0.526918
Name: a, dtype: float64
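The difference between loc and iloc is easier to see when the row labels are integers that do not match the positions. The frame below is a hypothetical example built for that purpose.

```python
import pandas as pd

# Row labels 10, 20, 30, 40 sit at positions 0, 1, 2, 3.
df = pd.DataFrame({"a": ["w", "x", "y", "z"]}, index=[10, 20, 30, 40])

by_label = df.loc[10:30, "a"]    # labels 10 to 30, end label included
by_position = df.iloc[0:2, 0]    # positions 0 and 1, stop position excluded

print(by_label.tolist())
print(by_position.tolist())
```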
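3. ix: mixed label and position indexing
ix combines the two: it accepts either an integer position or a custom label, depending on the situation.
If the index contains both numbers and strings, this approach is not recommended, because it easily leads to confusion about what is being selected.
Note that ix was deprecated in pandas 0.20.0 and removed in pandas 1.0, so the code below only runs on older versions; use loc or iloc instead.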
Sample code:
# Mixed indexing: ix
# Series
print(ser_obj.ix[1:3])
print(ser_obj.ix['b':'c'])
# DataFrame
print(df_obj.loc[0:2, 'a'])
print(df_obj.ix[0:2, 0])
operation result:
b 1
c 2
dtype: int64
b 1
c 2
dtype: int64
0 -0.241678
1 -0.526918
2 -1.074163
Name: a, dtype: float64
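On pandas versions without ix, a mixed selection can be written with loc or iloc by translating one side first. This is one possible sketch, not the only approach; the frame `df` is an illustrative stand-in for df_obj.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=["a", "b", "c", "d"])

# Row labels with a column position: turn the position into a label.
by_label = df.loc[0:2, df.columns[0]]

# Row positions with a column label: turn the label into a position.
by_position = df.iloc[0:3, df.columns.get_loc("a")]

print(by_label.tolist())
print(by_position.tolist())
```

Both selections return the first three values of column a, because the row labels here happen to equal the row positions.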
Notes
Indexing a DataFrame can be thought of as indexing an ndarray.
Slicing by label includes the end label.
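The ndarray analogy can be checked with a short sketch: positional slicing with iloc selects the same block as slicing the underlying NumPy array. The names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=["a", "b", "c", "d"])
arr = df.to_numpy()

frame_block = df.iloc[1:3, 0:2]   # rows 1-2, columns 0-1 of the DataFrame
array_block = arr[1:3, 0:2]       # the same slice on the ndarray

same = np.array_equal(frame_block.to_numpy(), array_block)
print(same)
```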