A Detailed Guide to loc, iloc, and ix in pandas

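When you first learn numpy and pandas in Python, the many kinds of indexing (slicing, row and column indexing, plus fancy indexing and boolean indexing) can be dizzying. This post summarizes one part of that picture: the loc, iloc, and ix indexers.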

Using loc, iloc, and ix

loc: selects data by row (column) labels
iloc: selects data by row (column) integer positions
ix: selects data by either row (column) labels or row (column) integer positions
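
The difference between labels and positions is easiest to see when the labels are themselves integers. Here is a minimal sketch; the Series `s` and its values are made up for illustration and are not part of the example used in the rest of this post.

```python
import pandas as pd

# Hypothetical Series whose integer labels run opposite to the positions.
s = pd.Series(['a', 'b', 'c'], index=[2, 1, 0])

# loc looks up the LABEL 0, which sits on the last row.
by_label = s.loc[0]

# iloc looks up POSITION 0, which is the first row.
by_position = s.iloc[0]

print(by_label, by_position)
```

With labels like these, an indexer that accepts both labels and positions has to guess which one is meant, which is why keeping loc and iloc separate is less error-prone.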

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: df=pd.DataFrame(np.arange(20).reshape(4,5),index=['ind0','ind1','ind2','ind3'],columns=['col0','col1','col2','col3','col4'])

In [4]: df
Out[4]: 
      col0  col1  col2  col3  col4
ind0     0     1     2     3     4
ind1     5     6     7     8     9
ind2    10    11    12    13    14
ind3    15    16    17    18    19

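The code above builds a 4×5 DataFrame with explicit row and column labels. The .index and .columns attributes (they are attributes, not methods, so no parentheses) show those labels.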

In [5]: df.index
Out[5]: Index(['ind0', 'ind1', 'ind2', 'ind3'], dtype='object')

In [6]: df.columns
Out[6]: Index(['col0', 'col1', 'col2', 'col3', 'col4'], dtype='object')

1. Row indexing with the three indexers

The loc indexer

loc selects rows by label only; passing an integer position fails when the index holds string labels.

In [8]: df.loc['ind0']
Out[8]: 
col0    0
col1    1
col2    2
col3    3
col4    4
Name: ind0, dtype: int32

In [9]: df.loc[0]
Traceback (most recent call last):
  ## full traceback omitted to save space
TypeError: cannot do label indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [0] of <class 'int'>
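
For reference, here is a small sketch of row selection with loc that rebuilds the same `df` so it runs on its own. The exception raised by `df.loc[0]` varies between pandas versions (the transcript above shows a TypeError), so the sketch catches both types rather than assuming one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=['ind0', 'ind1', 'ind2', 'ind3'],
    columns=['col0', 'col1', 'col2', 'col3', 'col4'],
)

# A single label returns that row as a Series.
row = df.loc['ind0']

# A list of labels returns a DataFrame with those rows, in the order given.
subset = df.loc[['ind3', 'ind1']]

# The integer 0 is not a label in this string index, so loc rejects it.
# Which exception is raised is version dependent, hence the tuple.
try:
    df.loc[0]
    loc_error = None
except (KeyError, TypeError) as exc:
    loc_error = type(exc).__name__

print(row.tolist(), list(subset.index), loc_error)
```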

The iloc indexer

iloc selects rows by integer position only; passing a label fails.

In [10]: df.iloc[0]
Out[10]: 
col0    0
col1    1
col2    2
col3    3
col4    4
Name: ind0, dtype: int32

In [11]: df.iloc['ind0']
Traceback (most recent call last):
## full traceback omitted to save space
TypeError: cannot do positional indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [ind0] of <class 'str'>
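
When you only know a label but need iloc, one option is to translate the label into its position first. This is a minimal sketch that rebuilds the same `df`; it assumes the index labels are unique, in which case `Index.get_loc` returns a single integer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=['ind0', 'ind1', 'ind2', 'ind3'],
    columns=['col0', 'col1', 'col2', 'col3', 'col4'],
)

# Translate a row label into its integer position (unique index assumed).
pos = df.index.get_loc('ind2')

# Use that position with iloc.
row = df.iloc[pos]

# A list of positions also works; negative positions count from the end.
first_and_last = df.iloc[[0, -1]]

print(pos, row.tolist(), list(first_and_last.index))
```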

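The ix indexer

ix can select rows either by label or by integer position. It is convenient, but as the warning below shows it is deprecated, and it was removed entirely in pandas 1.0, so the ix examples in this post only run on older versions.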

In [12]: df.ix[0]
__main__:1: DeprecationWarning: 
.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated
Out[12]: 
col0    0
col1    1
col2    2
col3    3
col4    4
Name: ind0, dtype: int32

In [13]: df.ix['ind0']
Out[13]: 
col0    0
col1    1
col2    2
col3    3
col4    4
Name: ind0, dtype: int32

ix accepts row labels, row positions, or a mix of the two, as covered below.

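2. Column indexing with the three indexers

Column indexing with loc, iloc, and ix follows the same rules as row indexing: loc selects columns by label only, iloc selects columns by integer position only, and ix accepts column labels, column positions, or a mix of both.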

In [14]: df.loc['ind0','col0']
Out[14]: 0

In [15]: df.loc['ind0',0]
Traceback (most recent call last):
## full traceback omitted to save space
TypeError: cannot do label indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [0] of <class 'int'>

In [16]: df.iloc[0,0]
Out[16]: 0

In [17]: df.iloc[0,'col0']
Traceback (most recent call last):
## full traceback omitted to save space
ValueError: Location based indexing can only have [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array] types
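
The examples above pick out a single cell. As a small sketch on the same `df` (rebuilt here so the block runs on its own), a bare `:` in the row slot selects every row, which is how a whole column is taken with either indexer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=['ind0', 'ind1', 'ind2', 'ind3'],
    columns=['col0', 'col1', 'col2', 'col3', 'col4'],
)

# Whole column by label: all rows, column 'col1'.
col_by_label = df.loc[:, 'col1']

# The same column by position: all rows, column at position 1.
col_by_pos = df.iloc[:, 1]

# A single cell, addressed first by labels and then by positions.
cell_by_label = df.loc['ind2', 'col3']
cell_by_pos = df.iloc[2, 3]

print(col_by_label.tolist(), cell_by_label, cell_by_pos)
```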

In [18]: df.ix[0,0]
Out[18]: 0

In [19]: df.ix[0,'col0']
Out[19]: 0
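
On pandas versions where ix is no longer available, the mixed lookup `df.ix[0,'col0']` can be written with loc or iloc by converting one side first. This is a sketch of two possible spellings rather than an official replacement; it rebuilds the same `df` and assumes unique column labels.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=['ind0', 'ind1', 'ind2', 'ind3'],
    columns=['col0', 'col1', 'col2', 'col3', 'col4'],
)

# Option 1: turn the row position into a label, then use loc.
via_loc = df.loc[df.index[0], 'col0']

# Option 2: turn the column label into a position, then use iloc.
via_iloc = df.iloc[0, df.columns.get_loc('col0')]

print(via_loc, via_iloc)
```

Either form makes it explicit which side is a label and which is a position, so the result does not depend on what kind of values the index happens to hold.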

3. Slicing with loc, iloc, and ix

Slicing with loc, iloc, and ix comes in two forms: slicing by label and slicing by integer position.

In [20]: df.loc['ind0':'ind3']
Out[20]: 
      col0  col1  col2  col3  col4
ind0     0     1     2     3     4
ind1     5     6     7     8     9
ind2    10    11    12    13    14
ind3    15    16    17    18    19

In [21]: df.iloc[0:3]
Out[21]: 
      col0  col1  col2  col3  col4
ind0     0     1     2     3     4
ind1     5     6     7     8     9
ind2    10    11    12    13    14

The difference is that a label slice includes its end point, while a positional slice excludes it, as the two outputs above show. ix behaves the same way:

In [23]: df.ix['ind0':'ind3']
Out[23]: 
      col0  col1  col2  col3  col4
ind0     0     1     2     3     4
ind1     5     6     7     8     9
ind2    10    11    12    13    14
ind3    15    16    17    18    19

In [24]: df.ix[0:3]
Out[24]: 
      col0  col1  col2  col3  col4
ind0     0     1     2     3     4
ind1     5     6     7     8     9
ind2    10    11    12    13    14
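
The inclusive and exclusive end points described above can be checked without ix. The sketch below rebuilds the same `df` and also slices rows and columns together, which the transcripts above do not show.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(20).reshape(4, 5),
    index=['ind0', 'ind1', 'ind2', 'ind3'],
    columns=['col0', 'col1', 'col2', 'col3', 'col4'],
)

# Label slice: the end label 'ind3' IS included, so all 4 rows come back.
by_label = df.loc['ind0':'ind3']

# Positional slice: the end position 3 is NOT included, so 3 rows come back.
by_pos = df.iloc[0:3]

# Rows and columns can be sliced together; these two select the same block.
block_by_label = df.loc['ind1':'ind2', 'col1':'col3']
block_by_pos = df.iloc[1:3, 1:4]

print(by_label.shape, by_pos.shape, block_by_label.shape)
```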

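ix is convenient on the older pandas versions where it still exists, but because it is deprecated, new code should rely on loc and iloc. Both appear constantly in real code, so they are essential to master!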
