[Data Analysis] Study notes, day 12: Pandas indexing operations. Topics: the Index object, Series indexing, DataFrame indexing, and advanced indexing by label, position, and mixed

Pandas indexing operations

The Index object

1. The index of a Series or a DataFrame is an Index object

Sample code:

print(type(ser_obj.index))
print(type(df_obj2.index))

print(df_obj2.index)

Output:

<class 'pandas.indexes.range.RangeIndex'>
<class 'pandas.indexes.numeric.Int64Index'>
Int64Index([0, 1, 2, 3], dtype='int64')

2. Index objects are immutable, which keeps the data safe

Sample code:

# Index objects are immutable
df_obj2.index[0] = 2

Output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-7f40a356d7d1> in <module>()
      1 # Index objects are immutable
----> 2 df_obj2.index[0] = 2

/Users/Power/anaconda/lib/python3.6/site-packages/pandas/indexes/base.py in __setitem__(self, key, value)
   1402 
   1403     def __setitem__(self, key, value):
-> 1404         raise TypeError("Index does not support mutable operations")
   1405 
   1406     def __getitem__(self, key):

TypeError: Index does not support mutable operations
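
The immutability rule can be sketched as follows. This is a minimal illustration, not the notes' original data: the frame `df` and its column `x` are hypothetical stand-ins for `df_obj2`. It shows that assigning to one element of an Index raises `TypeError`, while replacing the whole index with a new set of labels is allowed.

```python
import pandas as pd

# Hypothetical frame standing in for df_obj2 from the notes.
df = pd.DataFrame({"x": [10, 20, 30, 40]})

try:
    df.index[0] = 2           # item assignment on an Index is rejected
    mutated = True
except TypeError:
    mutated = False

# Replacing the whole index is allowed: a new Index object is attached.
df.index = ["a", "b", "c", "d"]
new_labels = list(df.index)
print(mutated, new_labels)
```

Because an Index cannot change in place, two objects can share the same Index without one silently altering the labels of the other.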

Common Index types

  • Index, the generic index
  • Int64Index, an integer index
  • MultiIndex, a hierarchical (multi-level) index
  • DatetimeIndex, a timestamp index
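
As a rough illustration of the types listed above, the sketch below builds three of them with standard pandas constructors. The labels and dates are made up for the example. Int64Index is left out on purpose: newer pandas releases represent integer indexes with the plain Index class, so the class name printed for an integer index depends on the installed version.

```python
import pandas as pd

# Generic Index built from string labels.
plain = pd.Index(["a", "b", "c"])

# Hierarchical index with two levels.
multi = pd.MultiIndex.from_tuples([("x", 1), ("x", 2), ("y", 1)])

# Timestamp index covering three consecutive days.
dates = pd.date_range("2020-01-01", periods=3, freq="D")

kinds = [type(plain).__name__, type(multi).__name__, type(dates).__name__]
print(kinds)
```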

Series indexing

1. Use index to specify the row labels

Sample code:

import pandas as pd

ser_obj = pd.Series(range(5), index=['a', 'b', 'c', 'd', 'e'])
print(ser_obj.head())

Output:

a    0
b    1
c    2
d    3
e    4
dtype: int64

2. Row indexing

ser_obj['label'], ser_obj[pos]

Sample code:

# Row indexing
print(ser_obj['b'])
print(ser_obj[2])  # positional fallback; recent pandas prefers ser_obj.iloc[2]

Output:

1
2

3. Slice indexing

ser_obj[2:4], ser_obj['label1':'label3']

Note that when slicing by label, the end label is included.

Sample code:

# Slice indexing
print(ser_obj[1:3])
print(ser_obj['b':'d'])

Output:

b    1
c    2
dtype: int64
b    1
c    2
d    3
dtype: int64
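
The difference between the two slice styles can be checked directly. The sketch below rebuilds the same five-element Series and uses the explicit `iloc` and `loc` accessors (an assumption made for clarity; the notes use plain brackets) to show that a positional slice excludes its end, while a label slice includes it.

```python
import pandas as pd

ser = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

by_position = ser.iloc[1:3]    # end position 3 is excluded: b, c
by_label = ser.loc["b":"d"]    # end label 'd' is included: b, c, d

print(list(by_position.index))
print(list(by_label.index))
```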

4. Non-contiguous indexing

ser_obj[['label1', 'label2', 'label3']]

Sample code:

# Non-contiguous indexing
print(ser_obj[[0, 2, 4]])  # by position; recent pandas prefers ser_obj.iloc[[0, 2, 4]]
print(ser_obj[['a', 'e']])

Output:

a    0
c    2
e    4
dtype: int64
a    0
e    4
dtype: int64

5. Boolean indexing

Sample code:

# Boolean indexing
ser_bool = ser_obj > 2
print(ser_bool)
print(ser_obj[ser_bool])

print(ser_obj[ser_obj > 2])

Output:

a    False
b    False
c    False
d     True
e     True
dtype: bool
d    3
e    4
dtype: int64
d    3
e    4
dtype: int64
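
Boolean masks can also be combined. The sketch below goes one small step beyond the notes, on the assumption that readers will want more than one condition: it uses the element-wise operators `&` and `|`, with each comparison wrapped in parentheses because those operators bind more tightly than comparisons.

```python
import pandas as pd

ser = pd.Series(range(5), index=list("abcde"))

# Keep values strictly between 0 and 4.
mask = (ser > 0) & (ser < 4)
selected = ser[mask]

# Keep only the first and last values.
either = ser[(ser == 0) | (ser == 4)]

print(selected)
print(either)
```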

DataFrame indexing

1. Use columns to specify the column labels

Sample code:

import numpy as np

df_obj = pd.DataFrame(np.random.randn(5,4), columns = ['a', 'b', 'c', 'd'])
print(df_obj.head())

Output:

          a         b         c         d
0 -0.241678  0.621589  0.843546 -0.383105
1 -0.526918 -0.485325  1.124420 -0.653144
2 -1.074163  0.939324 -0.309822 -0.209149
3 -0.716816  1.844654 -2.123637 -1.323484
4  0.368212 -0.910324  0.064703  0.486016

[Figure: DataFrameIndex.png]

2. Column indexing

df_obj[['label']]

Sample code:

# Column indexing
print(df_obj['a'])  # returns a Series
print(df_obj[['a']])  # returns a DataFrame
print(type(df_obj[['a']]))  # a list of labels always returns a DataFrame

Output:

0   -0.241678
1   -0.526918
2   -1.074163
3   -0.716816
4    0.368212
Name: a, dtype: float64
<class 'pandas.core.frame.DataFrame'>

3. Non-contiguous indexing

df_obj[['label1', 'label2']]

Sample code:

# Non-contiguous indexing
print(df_obj[['a', 'c']])
print(df_obj.iloc[:, [1, 3]])  # select columns by position

Output:

          a         c
0 -0.241678  0.843546
1 -0.526918  1.124420
2 -1.074163 -0.309822
3 -0.716816 -2.123637
4  0.368212  0.064703
          b         d
0  0.621589 -0.383105
1 -0.485325 -0.653144
2  0.939324 -0.209149
3  1.844654 -1.323484
4 -0.910324  0.486016
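
The two bracket forms used above return different types. The sketch below uses a small deterministic frame (an assumption, so the result does not depend on random numbers) to show that a single label returns a Series, while a list of labels returns a DataFrame, even when the list has one element.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for df_obj: 5 rows, columns a to d.
df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=["a", "b", "c", "d"])

col = df["a"]       # single label: Series
sub = df[["a"]]     # list of labels: DataFrame with one column

print(type(col).__name__, type(sub).__name__, sub.shape)
```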

Advanced indexing: label, position, and mixed

Pandas has three kinds of advanced indexing

1. loc: label-based indexing

With plain [], a DataFrame can only be sliced by row; to slice rows and columns together, use loc.

loc indexes by label, that is, by the index names we defined ourselves.

Sample code:

# Label-based indexing with loc
# Series
print(ser_obj['b':'d'])
print(ser_obj.loc['b':'d'])

# DataFrame
print(df_obj['a'])

# The first argument selects rows, the second selects columns
print(df_obj.loc[0:2, 'a'])

Output:

b    1
c    2
d    3
dtype: int64
b    1
c    2
d    3
dtype: int64

0   -0.241678
1   -0.526918
2   -1.074163
3   -0.716816
4    0.368212
Name: a, dtype: float64
0   -0.241678
1   -0.526918
2   -1.074163
Name: a, dtype: float64
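
loc also accepts a slice on both axes. The sketch below, on a deterministic frame invented for the example, selects a block of rows and columns by label and shows that both ends of each label slice are included.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list("abcd"))

# Rows labeled 1 to 3 and columns b to c; both ends are included.
block = df.loc[1:3, "b":"c"]
print(block)
```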

2. iloc: position-based indexing

iloc works like loc, but it indexes by integer position instead of by label.

Sample code:

# Integer position indexing with iloc
# Series
print(ser_obj[1:3])
print(ser_obj.iloc[1:3])

# DataFrame
print(df_obj.iloc[0:2, 0]) # note the difference from df_obj.loc[0:2, 'a']

Output:

b    1
c    2
dtype: int64
b    1
c    2
dtype: int64

0   -0.241678
1   -0.526918
Name: a, dtype: float64
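
The difference is easier to see when the row labels are not 0, 1, 2, and so on. The sketch below assumes a hypothetical frame whose row labels are 10 to 50: loc reads `10:30` as labels and includes the end, while iloc reads `0:2` as positions and excludes it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(20).reshape(5, 4),
    columns=list("abcd"),
    index=[10, 20, 30, 40, 50],   # labels chosen to differ from positions
)

by_label = df.loc[10:30, "a"]     # labels 10, 20, 30: three rows
by_position = df.iloc[0:2, 0]     # positions 0 and 1: two rows

print(by_label.tolist(), by_position.tolist())
```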

3. ix: mixed label and position indexing

ix combines the two: it accepts either integer positions or custom labels, depending on the situation.

If the index contains both numbers and letters, ix is not recommended, because it easily leads to confusion about what is being selected. ix was deprecated in pandas 0.20 and removed in pandas 1.0; use loc and iloc instead.

Sample code:

# Mixed indexing with ix
# Series
print(ser_obj.ix[1:3])
print(ser_obj.ix['b':'c'])

# DataFrame
print(df_obj.loc[0:2, 'a'])
print(df_obj.ix[0:2, 0])

Output:

b    1
c    2
dtype: int64
b    1
c    2
dtype: int64

0   -0.241678
1   -0.526918
2   -1.074163
Name: a, dtype: float64
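
On pandas versions without ix, a mixed selection can be written with loc or iloc by converting one side. The sketch below is one possible approach, not the only one: it picks the first three rows by position and column 'a' by label, once by translating positions into labels and once by translating the label into a position with `Index.get_loc`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list("abcd"))

# Positions to labels: df.index[0:3] yields the first three row labels.
via_loc = df.loc[df.index[0:3], "a"]

# Label to position: get_loc returns the integer position of column 'a'.
via_iloc = df.iloc[0:3, df.columns.get_loc("a")]

print(via_loc.tolist(), via_iloc.tolist())
```

Both forms make it explicit which axis is addressed by label and which by position, which is the ambiguity ix left open.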

Notes

DataFrame indexing can be thought of as indexing an ndarray: the first position selects rows and the second selects columns.

Label-based slices include the end label.
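
The ndarray analogy can be checked with a short sketch. It assumes a deterministic frame built for the example and compares a positional slice taken through iloc with the same slice taken on the underlying NumPy array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list("abcd"))
arr = df.to_numpy()

# The same row and column positions, once through iloc and once on the array.
from_frame = df.iloc[1:3, 0:2].to_numpy()
from_array = arr[1:3, 0:2]

same = np.array_equal(from_frame, from_array)
print(same)
```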

Origin blog.csdn.net/qq_35456045/article/details/104084420