Project GitHub repository: bitcarmanlee easy-algorithm-interview-and-practice
Stars and comments are welcome. Let's learn and improve together.
1. loc usage
loc selects by label: it picks rows by their index labels and can also pick columns by column name.
iloc selects by integer position: it picks rows and columns by where they sit in the frame, counting from 0.
import pandas as pd

def select_test():
    # x holds 0..9; y is a linear function of x
    a = [i for i in range(10)]
    b = [2*x + 0.1 for x in a]
    data = {"x": a, "y": b}
    # Use string labels r1..r10 as the row index
    tmp = pd.DataFrame(data, index=["r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10"])
    print(tmp.index)
    print(tmp.columns)
    print()
    return tmp
Calling the method prints:
Index(['r1', 'r2', 'r3', 'r4', 'r5', 'r6', 'r7', 'r8', 'r9', 'r10'], dtype='object')
Index(['x', 'y'], dtype='object')
Use loc to select the first row by its label:
tmp = select_test()
print(tmp.loc["r1"])
Output:
x 0.0
y 0.1
Name: r1, dtype: float64
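The row comes back as a Series indexed by column name, which is why x prints as 0.0 rather than 0. The short sketch below rebuilds the same frame without the print calls (the variable names are illustrative) and checks the dtypes: the x column is still int64, but a single row mixes int and float values, so pandas upcasts it to float64.

```python
import pandas as pd

# Rebuild the frame from select_test() without the print calls
a = list(range(10))
tmp = pd.DataFrame({"x": a, "y": [2 * v + 0.1 for v in a]},
                   index=[f"r{i}" for i in range(1, 11)])

row = tmp.loc["r1"]          # a single row label returns a Series
print(type(row).__name__)    # Series
print(tmp.dtypes["x"])       # int64: the column itself is still integer
print(row.dtype)             # float64: the row holds both int and float values
```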
Use the loc method to select the first three rows, and only select the x column:
print(tmp.loc[["r1", "r2", "r3"], "x"])
r1 0
r2 1
r3 2
Name: x, dtype: int64
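Calling tmp.loc[1] raises an error, because 1 is not one of the index labels. The exact exception varies by pandas version; the message below comes from an older release: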
TypeError: cannot do label indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [1] of <class 'int'>
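Two loc behaviors from this section are worth checking side by side. The sketch below, which rebuilds the same frame under illustrative names, shows that a label slice includes both endpoints and that an integer key is rejected on a string index. Since the exception type differs between pandas versions, the example catches both KeyError and TypeError rather than assuming one.

```python
import pandas as pd

a = list(range(10))
tmp = pd.DataFrame({"x": a, "y": [2 * v + 0.1 for v in a]},
                   index=[f"r{i}" for i in range(1, 11)])

# Label slices with loc include BOTH endpoints: r1, r2 and r3
first_three = tmp.loc["r1":"r3", "x"]
print(first_three.tolist())  # [0, 1, 2]

# The integer 1 is not a label in this index, so loc rejects it.
# The exception type depends on the pandas version, so catch both.
try:
    tmp.loc[1]
    failed = False
except (KeyError, TypeError) as exc:
    failed = True
    print(type(exc).__name__)
```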
2. iloc usage
Select the first five rows:
print(tmp.iloc[0:5])
x y
r1 0 0.1
r2 1 2.1
r3 2 4.1
r4 3 6.1
r5 4 8.1
Select the second column of the first five rows (the index of the first column is 0):
print(tmp.iloc[0:5, 1:2])
y
r1 0.1
r2 2.1
r3 4.1
r4 6.1
r5 8.1
print(tmp.iloc[0:5, "x"])
The line above raises an error:
ValueError: Location based indexing can only have [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array] types
The reason is simple: iloc only accepts integer positions for rows and columns, not row or column labels.
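If only the column name is known, one option is to translate it into a position first. The sketch below uses Index.get_loc for that; the frame is rebuilt under illustrative names so the snippet runs on its own.

```python
import pandas as pd

a = list(range(10))
tmp = pd.DataFrame({"x": a, "y": [2 * v + 0.1 for v in a]},
                   index=[f"r{i}" for i in range(1, 11)])

# Translate the column label into its integer position first
x_pos = tmp.columns.get_loc("x")
print(x_pos)                    # 0

# Now iloc receives integers only: first five rows of column x, as a Series
subset = tmp.iloc[0:5, x_pos]
print(subset.tolist())          # [0, 1, 2, 3, 4]
```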
3. ix
In older versions, ix was a hybrid of loc and iloc that supported both position-based and label-based selection.
ix was deprecated in pandas 0.20.0 and removed entirely in pandas 1.0. Personally, I think dropping it was the right call: an API that is too flexible inevitably leads to inconsistent code and poor readability.
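Code that relied on ix to mix row positions with column names can usually be rewritten with loc alone. The sketch below is one possible approach rather than an official migration recipe: it converts the row positions to labels through tmp.index, then selects with loc. The frame is rebuilt under illustrative names.

```python
import pandas as pd

a = list(range(10))
tmp = pd.DataFrame({"x": a, "y": [2 * v + 0.1 for v in a]},
                   index=[f"r{i}" for i in range(1, 11)])

# Positions 0..4 become the labels r1..r5
rows = tmp.index[0:5]

# Row labels plus a column name: everything is label-based, so loc works
mixed = tmp.loc[rows, "y"]
print(mixed.index.tolist())   # ['r1', 'r2', 'r3', 'r4', 'r5']
print(mixed.name)             # y
```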
4. Index quick selection
Rows and columns can also be selected directly with square brackets:
print(tmp[0:5])
print(tmp[['x', 'y']])
x y
r1 0 0.1
r2 1 2.1
r3 2 4.1
r4 3 6.1
r5 4 8.1
x y
r1 0 0.1
r2 1 2.1
r3 2 4.1
r4 3 6.1
r5 4 8.1
r6 5 10.1
r7 6 12.1
r8 7 14.1
r9 8 16.1
r10 9 18.1
The first line of code slices the first 5 rows by position, and the second line selects the x and y columns by name.
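What the square brackets return depends on the type of the key, which is easy to get wrong. The sketch below, using the same frame rebuilt under illustrative names, walks through the common cases: a single column name, a list of names, a slice, and a boolean mask.

```python
import pandas as pd

a = list(range(10))
tmp = pd.DataFrame({"x": a, "y": [2 * v + 0.1 for v in a]},
                   index=[f"r{i}" for i in range(1, 11)])

col = tmp["x"]        # single column name -> Series
frame = tmp[["x"]]    # list of column names -> DataFrame
print(type(col).__name__, type(frame).__name__)  # Series DataFrame

head = tmp[0:5]       # slice -> rows, selected by position
print(head.shape)     # (5, 2)

big = tmp[tmp["x"] > 6]      # boolean mask -> rows where the condition holds
print(big.index.tolist())    # ['r8', 'r9', 'r10']
```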
5. at/iat
at quickly selects a single element of the DataFrame by row label and column name:
print(tmp.at["r3", "x"])
Output is
2
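Passing an integer where a row label is expected raises an error (the message below comes from an older pandas release; the exact exception varies by version):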
print(tmp.at[3, "x"])
ValueError: At based indexing on an non-integer index can only have non-integer indexers
Like iloc, iat locates a single element by integer position:
print(tmp.iat[0, 0])
Output is
0
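As a quick check, at and iat return the same scalar as loc and iloc when given a single row and column, and both can also be used to assign a value. The sketch below rebuilds the same frame under illustrative names; the value 100 is arbitrary.

```python
import pandas as pd

a = list(range(10))
tmp = pd.DataFrame({"x": a, "y": [2 * v + 0.1 for v in a]},
                   index=[f"r{i}" for i in range(1, 11)])

# Label-based and position-based scalar access agree with loc and iloc
same_label = tmp.at["r3", "x"] == tmp.loc["r3", "x"]
same_pos = tmp.iat[2, 0] == tmp.iloc[2, 0]
print(same_label, same_pos)   # True True

# at can also assign a single cell; read it back by position with iat
tmp.at["r3", "x"] = 100
print(tmp.iat[2, 0])          # 100
```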