How to select data in Python

How to select rows and columns of data in Python

To pick specific rows and columns of data in Python, you can use indexing or slicing.

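Suppose you have a two-dimensional list (a list of rows) called data. The same double-index syntax also works on a NumPy array, but not on a Pandas DataFrame, where data[2] looks up a column label rather than a row.

Here's how to select the element in row 3, column 2: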

# Select row 3, column 2
third_row_second_column = data[2][1]

Here 2 means row 3, because indexes in Python start counting from 0, and 1 means column 2 for the same reason.

Columns are counted from left to right inside each row's list, again starting at 0.

A nested list has no column slice, so to select column 2 of all rows you build the column with a list comprehension:

# Select column 2 of all rows
second_column = [row[1] for row in data]

Here we've used a list comprehension to create a new list that contains the elements in column 2 of all rows.

Note that we use the number 1 instead of 2 because indices in Python start counting from 0.

Similarly, to select all columns in row 5, you can use the following code:

# Select all columns of row 5
fifth_row = data[4]

Here 4 means row 5, since indexes in Python start counting from 0.
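As a minimal sketch, the three selections above can be run on a small made-up dataset. The sample values are assumptions for illustration only, and the NumPy lines show the equivalent array syntax, where a real slice such as arr[:, 1] is available for columns.

```python
import numpy as np

# Hypothetical 5x3 sample data; the values are made up for illustration.
data = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
    [13, 14, 15],
]

# Nested list: index the row first, then the column.
third_row_second_column = data[2][1]
second_column = [row[1] for row in data]
fifth_row = data[4]

# NumPy array: one pair of brackets takes both the row and the column index.
arr = np.array(data)
np_element = arr[2, 1]
np_second_column = arr[:, 1]  # a true slice: every row, column index 1
np_fifth_row = arr[4]
```

With a nested list the column has to be rebuilt element by element, while the NumPy form expresses the same selection as a single slice.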

 

If the columns have names, how do you select data by column name and row?

If your dataset has column names, then you can select data by column name and row number.

In Python, Pandas is a powerful tool for conveniently manipulating data with column names.

Suppose you have a Pandas DataFrame named data with 4 columns named "A", "B", "C", and "D", and the default integer index (0, 1, 2, ...).

Here's how to select row 3 of the column named "B":

# Select row 3 of the column named "B"
third_row_b_column = data.loc[2, "B"]

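The loc indexer is used here; it selects by label (i.e. row and column names). It is an indexer used with square brackets rather than a function.

Here 2 is the row label, which corresponds to row 3 only because the default index starts counting from 0, and "B" is the column named "B".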

To select the column named "C" for all rows, you can use the following code:

# Select the column named "C" for all rows
c_column = data["C"]

Here, the column name is used like a dictionary key on the DataFrame, and the result is a Series object containing the values of column "C" for all rows.

If you want to select all columns in row 5, you can use the following code:

# Select all columns of row 5
fifth_row = data.iloc[4]

The iloc indexer is used here; it selects by integer position. 4 means row 5, since positions start counting from 0.
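The sketch below runs the three selections on a small made-up DataFrame and then shows why the difference between label and position matters. The sample values and the relabeled index (10 to 14) are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample frame with the columns "A" to "D" described in the text.
data = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [10, 20, 30, 40, 50],
        "C": [100, 200, 300, 400, 500],
        "D": [1000, 2000, 3000, 4000, 5000],
    }
)

third_row_b_column = data.loc[2, "B"]  # row label 2, column "B"
c_column = data["C"]                   # a Series with one value per row
fifth_row = data.iloc[4]               # row at position 4, as a Series

# With a non-default index, labels and positions no longer coincide.
relabeled = data.copy()
relabeled.index = [10, 11, 12, 13, 14]
by_label = relabeled.loc[12, "B"]      # the row whose label is 12
by_position = relabeled["B"].iloc[2]   # the third row, whatever its label
```

Both by_label and by_position reach the third row here, but relabeled.loc[2, "B"] would raise KeyError because no row carries the label 2.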

What does data2[["A", "B"]] = data[["A", "B"]] mean in Python?

This line of code copies all the values in the columns named "A" and "B" of the Pandas DataFrame data into the columns of the same names in the DataFrame data2.

Specifically, data2[["A", "B"]] on the left selects the columns named "A" and "B" of data2 as the assignment target, and data[["A", "B"]] on the right selects the columns named "A" and "B" of data. The values from the selected columns of data are then written into the corresponding columns of data2, matched row by row on the index labels.

Note that this operation copies the values into the existing data2 and does not create a new DataFrame, so data2 must already be defined. Because the values are copied, changing a column in one of the two DataFrames afterwards does not change the corresponding column in the other.

Also, the code raises a KeyError exception if data has no column named "A" or "B".
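A minimal sketch of the column copy, assuming two small made-up DataFrames that share the same default index. The placeholder zeros in data2 and the column name "Z" are hypothetical and only serve the illustration.

```python
import pandas as pd

data = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
# data2 must exist before the assignment; it starts with placeholder values.
data2 = pd.DataFrame({"A": [0, 0, 0], "B": [0, 0, 0]})

# Copy the values of columns "A" and "B" from data into data2.
data2[["A", "B"]] = data[["A", "B"]]

# Changing data2 afterwards leaves data untouched.
data2.loc[0, "A"] = 99
original_value = data.loc[0, "A"]

# Selecting a column that does not exist raises KeyError.
try:
    data[["A", "Z"]]
    missing_column_error = False
except KeyError:
    missing_column_error = True
```

After the last assignment, data2 holds 99 in its first row of "A" while data still holds the original value there.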

What does data2[2][["A", "B"]] = data[2][["A", "B"]] mean?

This line of code is meant to copy the values of the columns "A" and "B" in row 3 (i.e. index 2) of data into the same columns of row 3 in data2.

Specifically, data2[2][["A", "B"]] on the left is intended to select the columns named "A" and "B" in row 3 of data2, and data[2][["A", "B"]] on the right is intended to select the same columns in row 3 of data. On a Pandas DataFrame, however, data[2] looks up a column labelled 2 rather than row 3, so with columns named "A" to "D" it raises KeyError. The form that does what is intended is data2.loc[2, ["A", "B"]] = data.loc[2, ["A", "B"]].
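The following sketch, on made-up sample data, checks what data[2] does on a DataFrame whose columns are named with letters and then copies one row's values with a single loc call, which addresses the row and the columns together. The frames and their values are assumptions for illustration.

```python
import pandas as pd

data = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [9, 10, 11, 12]})
data2 = pd.DataFrame({"A": [0, 0, 0, 0], "B": [0, 0, 0, 0]})

# data[2] looks for a column labelled 2, which this frame does not have.
try:
    data[2]
    column_lookup_failed = False
except KeyError:
    column_lookup_failed = True

# One loc call selects row label 2 and the columns "A" and "B" at once.
data2.loc[2, ["A", "B"]] = data.loc[2, ["A", "B"]]
```

Only the row labelled 2 in data2 changes; the other rows keep their placeholder values.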

What does data3=data[2:5][1:4] mean?

This line of code slices the Pandas DataFrame data twice and assigns the result to a new variable data3. On a DataFrame, a plain [start:stop] slice always selects rows, so the second slice does not select columns.

Specifically, [2:5] selects the rows at positions 2, 3, and 4 (rows 3 to 5; the stop position 5 is excluded). [1:4] is then applied to that 3-row result and again selects rows, at positions 1 to 3 of the result, of which only positions 1 and 2 exist. Therefore data3 contains 2 rows (rows 4 and 5 of data) and all of the columns. To select rows 2 to 4 and columns 1 to 3, use data.iloc[2:5, 1:4].

Note that this operation does not modify the original DataFrame data. Whether later changes to data3 can affect data depends on the Pandas version and its copy-on-write setting, so call .copy() on the result if you need an independent DataFrame.

Also, if a slice operation selects rows or columns outside the bounds of the dataset, partial data is returned without raising an error.
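To check what the chained slice returns, here is a minimal sketch on a made-up frame with 6 rows and 4 columns and the default index. The sample values are assumptions for illustration.

```python
import pandas as pd

data = pd.DataFrame(
    {"A": range(6), "B": range(10, 16), "C": range(20, 26), "D": range(30, 36)}
)

first_slice = data[2:5]   # rows at positions 2, 3, 4
data3 = data[2:5][1:4]    # slices the rows of first_slice a second time

rows_kept = data3.index.tolist()  # index labels that survive both slices
shape = data3.shape               # (number of rows, number of columns)
```

On this sample, both slices act on rows: data3 keeps the rows labelled 3 and 4 together with all four columns.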

What does data3 = data[2:5; 1:4] mean? (Use a comma, not a semicolon, to separate the row and column slices.)

data[2:5; 1:4] is not valid Python and raises a SyntaxError. If you want to select the rows with indices 2, 3, 4 and the columns with indices 1, 2, 3 of a Pandas DataFrame data,

the following code can be used:

data3 = data.iloc[2:5, 1:4]

Here, the iloc indexer is used to select by integer position. 2:5 selects rows 3 to 5 (positions 2 to 4); the stop position 5 is excluded. Likewise, 1:4 selects columns 2 to 4 (positions 1 to 3); the stop position 4 is excluded. Therefore data3 contains a subset of 3 rows and 3 columns of data.

Note that this operation does not modify the original DataFrame data. Call .copy() on the result if you need data3 to be fully independent, since whether later changes to data3 can affect data depends on the Pandas version and its copy-on-write setting.

Also, if a slice operation selects rows or columns outside the bounds of the dataset, partial data is returned without raising an error.
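A minimal sketch of the two-dimensional iloc slice and of the out-of-bounds behaviour, again on a made-up frame with 6 rows and 4 columns. The explicit .copy() is an assumption added here so that data3 is independent of data on any Pandas version.

```python
import pandas as pd

data = pd.DataFrame(
    {"A": range(6), "B": range(10, 16), "C": range(20, 26), "D": range(30, 36)}
)

# Rows at positions 2 to 4, columns at positions 1 to 3, as an explicit copy.
data3 = data.iloc[2:5, 1:4].copy()
data3.iloc[0, 0] = -1            # change the copy only
original_cell = data.iloc[2, 1]  # the matching cell in data

# Slice bounds past the end are clipped rather than rejected.
clipped = data.iloc[4:100, 2:100]

# A single out-of-range position, unlike a slice, raises IndexError.
try:
    data.iloc[100]
    scalar_out_of_bounds_raises = False
except IndexError:
    scalar_out_of_bounds_raises = True
```

The clipping applies to slices only, which is why the single position 100 fails while 4:100 simply returns the rows that exist.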

 

Origin blog.csdn.net/qq_53011270/article/details/130710185