The difference between loc, iloc and direct slicing in pandas

I have been using pandas recently and could not work out how its various indexing and slicing methods differ, so today I took a closer look.

0. Use an index label of a Series or a column name of a DataFrame directly as an attribute.

For example:

s.index_name

df.column_name

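However, a name used this way may conflict with an existing attribute or method, such as min or max, in which case attribute access returns the method rather than the data. Also, in newer versions, assigning to a new name this way sets a plain Python attribute (with a warning for a DataFrame) instead of creating a column, so it is not reliable as an lvalue.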
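A minimal sketch of both pitfalls. The frame and its column names are made up for illustration, and the warning filter is only there to keep the output quiet:

```python
import warnings

import pandas as pd

# Hypothetical frame: the column name "min" collides with DataFrame.min.
df = pd.DataFrame({"A": [1, 2, 3], "min": [7, 8, 9]})

col_a = df.A           # attribute access returns column "A"
method = df.min        # this is the DataFrame.min method, not the column
col_min = df["min"]    # bracket access always reaches the column

# Assigning to a name that is not yet a column only sets a Python attribute;
# pandas emits a UserWarning and no column is created.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.C = [10, 20, 30]
has_c_column = "C" in df.columns   # still False at this point

df["C"] = [10, 20, 30]  # bracket assignment does create the column
```

Bracket indexing works in every one of these cases, which is why it is the safer default.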

1. df[] direct index

  • Direct indexing selects columns, and the content of the square brackets is generally a column name. It also accepts a list of column names to select several columns at once.
 
df['A']
df[['A', 'B']]

If you want to swap two columns, assigning one selection directly to the other does not work:

df.loc[:,['B', 'A']] = df[['A', 'B']]

This is because pandas aligns on column labels when assigning through loc, so each column on the right-hand side is matched to the column of the same name, and the order ['A', 'B'] versus ['B', 'A'] makes no difference. To actually swap the two columns, use only the values of columns A and B as the rvalue, so that no column labels are attached.

df.loc[:,['B', 'A']] = df[['A', 'B']].values
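The two assignments can be compared side by side. This is a small sketch on a made-up two-column frame; it uses to_numpy(), which returns the same bare array that .values does here:

```python
import pandas as pd

# Hypothetical frame used only for this comparison.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# The right-hand side keeps its column labels, so loc aligns A to A and B to B.
aligned = df.copy()
aligned.loc[:, ["B", "A"]] = aligned[["A", "B"]]

# The right-hand side is a bare NumPy array, so values are assigned by position.
swapped = df.copy()
swapped.loc[:, ["B", "A"]] = swapped[["A", "B"]].to_numpy()
```

After this runs, aligned is unchanged, while swapped holds the old A values in column B and the old B values in column A.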
  • Passing a slice object to df[] selects rows instead, because that is the more intuitive reading of a slice.
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape((4, 4)), index=list(range(4)), columns=['a', 'b', 'c', 'd'])

df
Out[4]: 
    a   b   c   d
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15


 

df[0:1]
Out[6]: 
   a  b  c  d
0  0  1  2  3

Note that a single number only works here for a Series, for example s[0]. For a DataFrame, a single value inside [] is interpreted as a column name, so to select rows you must pass a Python slice object, as in df[0:1].
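A short sketch of that difference, rebuilding the same 4x4 frame as above. The variable names are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape((4, 4)),
    index=list(range(4)),
    columns=["a", "b", "c", "d"],
)

first_row = df[0:1]   # a slice inside [] selects rows (here, the first row)

# A single number inside [] is looked up as a column name. No column is
# called 0 in this frame, so the lookup raises KeyError.
try:
    df[0]
    single_number_works = True
except KeyError:
    single_number_works = False

s = df["a"]           # a Series taken from column "a"
first_value = s[0]    # a single number works on a Series (label 0 here)
```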

2. loc, label-based index

Since pandas works with labeled tabular objects, it needs a set of indexing methods that operate on labels, and that is what loc provides.
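A minimal sketch of label-based selection with loc. To make it obvious that labels rather than positions are being used, the row index here is the made-up string labels w, x, y, z instead of the integers used earlier:

```python
import numpy as np
import pandas as pd

# Same values as the earlier frame, but with hypothetical string row labels.
df = pd.DataFrame(
    np.arange(16).reshape((4, 4)),
    index=list("wxyz"),
    columns=["a", "b", "c", "d"],
)

row = df.loc["x"]                     # one row, selected by its label
cell = df.loc["x", "b"]               # one value: row label, then column label
block = df.loc["w":"y", ["a", "c"]]   # label slices include both endpoints
```

Unlike an ordinary Python slice, the label slice "w":"y" includes the row labeled y.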

 
