As the table shows, pandas provides a rich set of methods for reading and writing data. This article explains how to use the pandas library in Python to read specified rows or columns from excel/csv files. Readers who need this can use it as a reference.
Introduction
The key point: use the loc function for lookups.
Without further ado, let’s demonstrate directly:
There is the following table named try.xlsx:
1. Query by index
Condition: the imported data must have a meaningful index.
If it does not, set one yourself. The method is simple: pass index_col when reading the excel file.
Code example:
import pandas as pd  # Import the pandas library

excel_file = './try.xlsx'  # Path to the excel data
# index_col sets the index; any field can be chosen as the index
data = pd.read_excel(excel_file, index_col='name')
print(data.loc['Li Si'])

The printed result is:
Department B
Salary 6600
Name: Li Si, dtype: object (note: index)
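If the table was already loaded without index_col, the same lookup can be set up afterwards. The sketch below uses a small in-memory table as a stand-in for try.xlsx; only the Li Si row comes from the article, the other rows are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for try.xlsx; only the Li Si row is from the article
data = pd.DataFrame({
    "name": ["Zhang San", "Li Si", "Wang Wu"],
    "department": ["A", "B", "A"],
    "salary": [2800, 6600, 7200],
})

# set_index does after loading what index_col does at read time
indexed = data.set_index("name")

row = indexed.loc["Li Si"]                 # a single label returns a Series
rows = indexed.loc[["Li Si", "Wang Wu"]]   # a list of labels returns a DataFrame
print(row)
print(rows)
```

Passing a list of labels keeps the result as a table, which is convenient when several people need to be looked up at once.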
2. Find data when you know which row it is in
Suppose an employee in our table has no salary data. How do we find the data we need?
The code is as follows:
for i in data.columns:
    for j in range(len(data)):
        if data[i].isnull().iloc[j]:
            # Find the department of the row that has the missing value
            bumen = data.iloc[j, [0]]
            # charuzhi is not defined in this article; it is presumably the
            # author's own function that returns the value to fill in
            data.iloc[j, data.columns.get_loc(i)] = charuzhi(bumen)

The principle is very simple: loop over all the data, then use the iloc function in pandas. In iloc[j, [0]] above, j is the position of the row, and [0] is the position of the column that holds the data you want to get.
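The loop above can also be written without explicit iteration. The article does not show what charuzhi returns, so the sketch below assumes one possible fill rule: the mean salary of the employee's department. The sample rows are invented for illustration.

```python
import pandas as pd

# Hypothetical data; Wang Wu's salary is missing
data = pd.DataFrame({
    "department": ["A", "A", "B", "B"],
    "name": ["Zhang San", "Wang Wu", "Li Si", "Zhao Liu"],
    "salary": [3000.0, None, 6600.0, 7000.0],
})

# Boolean mask marking the rows whose salary is missing
missing = data["salary"].isnull()

# Assumed fill rule (not from the article): mean salary of the same department
dept_mean = data.groupby("department")["salary"].transform("mean")

# Write the fill values only into the rows that were missing
data.loc[missing, "salary"] = dept_mean[missing]
print(data)
```

The mean ignores missing values, so Wang Wu receives the average of the remaining department A salaries.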
3. Find specified rows with a conditional query
For example, to find the names and salaries of all members of department A, or of everyone whose salary is less than 3000:
The code is as follows:
"""Query rows of data based on conditions"""
import pandas as pd  # Import the pandas library

excel_file = './try.xlsx'  # Path to the file
data = pd.read_excel(excel_file)  # Read in the data
# Department is A: print the name and salary
print(data.loc[data['department'] == 'A', ['name', 'salary']])
# Salary is less than 3000: print the name and salary
print(data.loc[data['salary'] < 3000, ['name', 'salary']])
The result is as follows:
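Each query above applies one condition on its own. If the "or" in the task is meant as a single query, the two masks can be combined with the | operator. This is a minimal sketch on made-up rows, using the same column names as the code above.

```python
import pandas as pd

# Hypothetical rows using the article's column names
data = pd.DataFrame({
    "name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu"],
    "department": ["A", "B", "A", "C"],
    "salary": [2800, 6600, 7200, 2500],
})

in_a = data["department"] == "A"   # True for members of department A
low_pay = data["salary"] < 3000    # True for salaries below 3000

# | keeps a row if either mask is True
either = data.loc[in_a | low_pay, ["name", "salary"]]
print(either)
```

When the comparisons are written inline instead of stored in variables, each one needs its own parentheses, because | binds more tightly than == and <.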
To save these results as a separate excel or csv file, add the following code:
"""Export to excel or csv file"""
# Single condition
dataframe_1 = data.loc[data['department'] == 'A', ['name', 'salary']]
# Single condition
dataframe_2 = data.loc[data['salary'] < 3000, ['name', 'salary']]
# Multiple conditions
dataframe_3 = data.loc[(data['department'] == 'A') & (data['salary'] < 3000), ['name', 'salary']]
# Export to excel
dataframe_1.to_excel('dataframe_1.xlsx')
dataframe_2.to_excel('dataframe_2.xlsx')
4. Find the specified column
data['columns']  # 'columns' stands for the field name you need
# Note: this field name cannot be the name of the index
# To print the index, use:
data.index
# Likewise, to print the column names:
data.columns
Libraries used throughout the steps above:
pandas, xlrd, openpyxl
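To make the note about the index concrete, here is a small sketch on made-up rows. Once a field has been turned into the index it is no longer found among the columns, and several columns can be selected at once by passing a list.

```python
import pandas as pd

# Hypothetical rows; 'name' becomes the index, as with index_col
data = pd.DataFrame({
    "name": ["Zhang San", "Li Si", "Wang Wu"],
    "department": ["A", "B", "A"],
    "salary": [2800, 6600, 7200],
}).set_index("name")

salary = data["salary"]                   # one field name returns a Series
subset = data[["department", "salary"]]   # a list of names returns a DataFrame

print(list(data.columns))       # the remaining regular columns
print(list(data.index))         # the index values
print("name" in data.columns)   # the index field is not a column
```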
5. Find specified rows and specified columns
This mainly uses the iloc function
data.iloc[:, :2]  # All rows, first two columns
The part before the comma is the row range and the part after the comma is the column range, which is easy to understand
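A few more variations of the same row-and-column pattern, sketched on made-up rows. Positions start at 0 and the end of a slice is excluded.

```python
import pandas as pd

# Hypothetical rows using the article's column names
data = pd.DataFrame({
    "name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu"],
    "department": ["A", "B", "A", "C"],
    "salary": [2800, 6600, 7200, 2500],
})

first_two_cols = data.iloc[:, :2]     # all rows, columns at positions 0 and 1
top_left = data.iloc[:2, :2]          # rows 0 and 1, columns 0 and 1
picked = data.iloc[[0, 2], [0, 2]]    # rows 0 and 2, columns 0 and 2

print(first_two_cols)
print(top_left)
print(picked)
```

Lists of positions can be mixed with slices, so non-adjacent rows or columns can be picked in one call.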
6. Find eligible data within the specified range
data.iloc[:10,:][data.salary>6000]
In this way, you can find the information of everyone whose salary is greater than 6000 in the first 10 rows (iloc[:10] covers positions 0 through 9).
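The one-liner builds its condition from the whole table and applies it to a 10-row slice, so pandas has to realign the mask and may report a reindexing warning. One way to avoid that, sketched here on generated sample data, is to slice first and build the condition from the slice.

```python
import pandas as pd

# Hypothetical table with 12 employees and steadily rising salaries
data = pd.DataFrame({
    "name": [f"employee_{n}" for n in range(12)],
    "salary": [5000 + 500 * n for n in range(12)],
})

# Step 1: restrict to the first 10 rows (positions 0 through 9)
first_ten = data.iloc[:10]

# Step 2: filter using a mask built from the same 10 rows
well_paid = first_ten[first_ten["salary"] > 6000]
print(well_paid)
```

Because the mask and the slice share the same index, no realignment is needed, and rows beyond the first ten cannot slip into the result.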
This concludes the introduction to reading specified rows or columns from excel/csv files with the python pandas library.