Reading specified rows or columns from Excel/CSV files with the Python pandas library

pandas provides a rich set of methods for reading and writing tabular data. This article shows how to use the pandas library in Python to read specified rows or columns from an Excel or CSV file, and can serve as a reference for anyone who needs it.

Introduction

The key point: use loc to look up data.

Without further ado, let’s demonstrate directly:

There is the following table named try.xlsx:

1. Query by index

Condition: the imported data must have a meaningful index.

If it does not, add one yourself. The simplest way is to pass index_col when reading the Excel file.

Code example:

import pandas as pd  # Import the pandas library

excel_file = './try.xlsx'  # Path to the Excel file
# index_col sets the index; any field can be used (here the name column, 姓名 in the original table)
data = pd.read_excel(excel_file, index_col='name')
print(data.loc['Li Si'])  # Look up the row whose index label is 'Li Si'
The printed result is:

department       B
salary        6600
Name: Li Si, dtype: object

Note that the Name shown on the last line is the index label of the row.
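Since try.xlsx itself is not reproduced here, the following is a minimal sketch of the same lookup on a small in-memory table. Only the Li Si row is taken from the article; the other rows are made up for illustration, and set_index is used as a stand-in for passing index_col to read_excel.

```python
import pandas as pd

# Hypothetical stand-in for try.xlsx; only the "Li Si" row comes from the article.
data = pd.DataFrame(
    {
        "name": ["Zhang San", "Li Si", "Wang Wu"],
        "department": ["A", "B", "A"],
        "salary": [2800, 6600, 7200],
    }
)

# Has the same effect as index_col='name' at read time
indexed = data.set_index("name")

# A single label returns one row as a Series
row = indexed.loc["Li Si"]
print(row["department"], row["salary"])

# A list of labels returns a DataFrame instead
subset = indexed.loc[["Li Si", "Wang Wu"]]
print(subset)
```

Looking up a label that is not in the index raises a KeyError, so it can be worth checking with `"Li Si" in indexed.index` first.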

2. Look up data when you know which row it is in

Suppose the salary of one employee in the table is empty. How can we locate and fill in that value?

The code is as follows:

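# charuzhi() is the author's own helper (not shown) that returns a fill value for a department
for i in data.columns:
    col = data.columns.get_loc(i)  # Integer position of column i
    for j in range(len(data)):
        if data[i].isnull().iloc[j]:
            bumen = data.iloc[j, 0]  # Department (column position 0) of the row with the missing value
            data.iloc[j, col] = charuzhi(bumen)  # Assign by position to avoid chained assignment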

The principle is simple: walk through all the data, then use iloc from pandas to address cells by position. In the iloc call above, j is the row position and 0 is the position of the column that holds the value you want to fetch.
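The nested loop can also be written without explicit iteration. The sketch below is one possible alternative, not the author's method: because charuzhi is not shown in the article, it assumes that a missing salary should be filled with the mean salary of the same department. The table contents are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing salary
data = pd.DataFrame(
    {
        "department": ["A", "B", "A", "B"],
        "salary": [3000.0, 6600.0, np.nan, 5400.0],
    },
    index=["Zhang San", "Li Si", "Wang Wu", "Zhao Liu"],
)

# Assumption: fill each gap with the mean salary of that row's department.
# transform("mean") returns one value per row, aligned with the original index.
dept_mean = data.groupby("department")["salary"].transform("mean")
filled = data.assign(salary=data["salary"].fillna(dept_mean))

# Positional lookup of the repaired cell: row position 2, column position 1
print(filled.iloc[2, 1])
```

Using assign returns a new DataFrame and leaves the original untouched, which makes it easy to compare the table before and after filling.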

3. Find rows that match a condition

For example, to find the names and salaries of all members of department A, or of everyone whose salary is less than 3000:

The code is as follows:

"""Query a row of data based on conditions"""
import pandas as pd #Import pandas library
 
excel_file = './try.xlsx' #import file
data = pd.read_excel(excel_file) #Read in data
 
print(data.loc[data['department'] == 'A', ['name', 'salary']]) #The department is A, print the name and salary
print(data.loc[data['salary'] < 3000, ['name','salary']]) #Find people with salary less than 3000

The result is as follows:
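To try the same kind of query without the Excel file, here is a minimal sketch on a made-up table (the rows are hypothetical). It also combines the two conditions into a single query with `|` (OR) and `&` (AND).

```python
import pandas as pd

# Hypothetical stand-in for try.xlsx
data = pd.DataFrame(
    {
        "name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu"],
        "department": ["A", "B", "A", "C"],
        "salary": [2800, 6600, 7200, 2500],
    }
)

# Each condition is a boolean Series with one entry per row
in_dept_a = data["department"] == "A"
low_paid = data["salary"] < 3000

# "|" means OR and "&" means AND; conditions written inline need their own parentheses
either = data.loc[in_dept_a | low_paid, ["name", "salary"]]
both = data.loc[in_dept_a & low_paid, ["name", "salary"]]

print(either)
print(both)
```

Python's `or` and `and` keywords do not work on a boolean Series; pandas raises a ValueError about ambiguous truth values, which is why the element-wise operators are used.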

To save the query results as a separate Excel or CSV file, add the following code:

"""Export to excel or csv file"""
#single condition
dataframe_1 = data.loc[data['department'] == 'A', ['name', 'salary']]
#single condition
dataframe_2 = data.loc[data['salary'] < 3000, ['name', 'salary']]
#Multi-condition
dataframe_3 = data.loc[(data['department'] == 'A')&(data['salary'] < 3000), ['name', 'salary']]
#Export to excel
dataframe_1.to_excel('dataframe_1.xlsx')
dataframe_2.to_excel('dataframe_2.xlsx')
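The code above only writes Excel files. For CSV, a minimal sketch with to_csv might look like the following; the table contents and the file name dataframe_3.csv are assumptions for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for try.xlsx
data = pd.DataFrame(
    {
        "name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu"],
        "department": ["A", "B", "A", "C"],
        "salary": [2800, 6600, 7200, 2500],
    }
)

selected = data.loc[
    (data["department"] == "A") & (data["salary"] < 3000), ["name", "salary"]
]

# With no path given, to_csv returns the CSV text instead of writing a file.
# index=False leaves the row labels out of the output.
csv_text = selected.to_csv(index=False)
print(csv_text)

# Writing to disk would be: selected.to_csv("dataframe_3.csv", index=False)

# Reading the text back shows that the round trip preserves the table
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```

to_excel accepts the same index=False argument if the row labels are not wanted in the spreadsheet.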

4. Find the specified column

data['salary']  # Replace 'salary' with the field name you need
# Note: the field name cannot be the name of the index, because the index is not a regular column
# To get the index, use data.index
data.salary  # Attribute access returns the same column (data.columns itself lists the column labels)
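As a small illustration of the difference between columns and the index, here is a sketch on a made-up table that uses the name field as the index.

```python
import pandas as pd

# Hypothetical stand-in for try.xlsx, with "name" as the index
data = pd.DataFrame(
    {
        "name": ["Zhang San", "Li Si", "Wang Wu"],
        "department": ["A", "B", "A"],
        "salary": [2800, 6600, 7200],
    }
).set_index("name")

salary = data["salary"]                # one field name returns a Series
pair = data[["department", "salary"]]  # a list of field names returns a DataFrame

print(list(data.columns))  # column labels; the index name is not among them
print(list(data.index))    # index labels

# reset_index turns the index back into a regular column
flat = data.reset_index()
print(flat["name"].tolist())
```

Here data["name"] would raise a KeyError because name is the index, while flat["name"] works after reset_index.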

Libraries used in the whole process above:

pandas, xlrd (for .xls files), openpyxl (for .xlsx files)

5. Find out the specified row and the specified column

This mainly uses iloc.

data.iloc[:, :2]  # All rows, first two columns

The part before the comma selects rows and the part after the comma selects columns. Both are given as integer positions.
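A few more positional selections, sketched on a made-up table, show how the row and column parts combine.

```python
import pandas as pd

# Hypothetical stand-in for try.xlsx
data = pd.DataFrame(
    {
        "name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu"],
        "department": ["A", "B", "A", "C"],
        "salary": [2800, 6600, 7200, 2500],
    }
)

first_two_cols = data.iloc[:, :2]     # all rows, columns at positions 0 and 1
middle_rows = data.iloc[1:3, [0, 2]]  # rows at positions 1 and 2, columns 0 and 2
last_cell = data.iloc[-1, -1]         # negative positions count from the end

print(first_two_cols)
print(middle_rows)
print(last_cell)
```

As with ordinary Python slices, the end position of an iloc slice is excluded, so 1:3 covers positions 1 and 2 only.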

6. Find eligible data within the specified range

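first_ten = data.iloc[:10, :]  # The first 10 rows (positions 0 to 9)
first_ten[first_ten.salary > 6000]

This finds everyone whose salary is greater than 6000 among the first 10 rows. Building the condition from the sliced rows keeps the boolean mask aligned with the rows being filtered.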
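The following sketch applies the same idea to a small made-up table, restricting the search to the first three rows so the effect of the range is easy to see.

```python
import pandas as pd

# Hypothetical stand-in for try.xlsx; the last row has a high salary but lies outside the range
data = pd.DataFrame(
    {
        "name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu", "Sun Qi"],
        "department": ["A", "B", "A", "C", "B"],
        "salary": [2800, 6600, 7200, 2500, 9000],
    }
)

# Step 1: restrict to the first three rows (positions 0, 1, 2)
first_rows = data.iloc[:3]

# Step 2: filter inside that range, building the mask from the sliced rows
high = first_rows[first_rows["salary"] > 6000]
print(high)
```

Sun Qi earns more than 6000 but is not in the result, because the row sits outside the selected range.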

This concludes the introduction to reading specified rows or columns from Excel/CSV files with the Python pandas library.

Origin blog.csdn.net/yaxuan88521/article/details/123518635