14 common operations for processing Excel tables in Python

Table of contents

1. Install dependent libraries

2. Import library

3. Read Excel file

4. Write Excel file

5. Create a worksheet

6. Access worksheets

7. Read cell data

8. Write cell data

9. Get the number of rows and columns

10. Filter data

11. Sort data

12. Add a new row

13. Delete row or column

14. Computing Summary Statistics

Summary


Whether you are a data analyst, a financial specialist, or a researcher, Excel is probably part of your daily work. Python can make Excel data processing considerably faster and more flexible. This article walks through 14 commonly used Excel operations, using `pandas` for table-level work and `openpyxl` for sheet-level and cell-level work.

 

1. Install dependent libraries

Use `pip` on the command line to install `pandas` and `openpyxl`. `pandas` provides the DataFrame used to process tabular data, while `openpyxl` reads and writes `.xlsx` files and is the engine `pandas` relies on for that format.


   pip install pandas openpyxl

2. Import library

Import the `pandas` and `openpyxl` libraries in the Python script.


   import pandas as pd
   from openpyxl import Workbook, load_workbook

3. Read Excel file

Use the `read_excel()` function to read an Excel file. It returns a DataFrame object containing the data, and by default it reads the first worksheet in the file.


   data = pd.read_excel('filename.xlsx')

   Here `filename.xlsx` is the path to the Excel file.
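`read_excel()` also accepts parameters that narrow down what is loaded. The following is a minimal sketch, assuming a sheet named "Sales" and made-up column names; it first writes a small file to a temporary folder so that it can run on its own.

```python
import os
import tempfile

import pandas as pd

# Create a small demo file first (hypothetical file, sheet, and column names).
path = os.path.join(tempfile.mkdtemp(), "sales_demo.xlsx")
demo = pd.DataFrame(
    {"Name": ["A", "B", "C"], "Amount": [10, 20, 30], "Note": ["x", "y", "z"]}
)
demo.to_excel(path, sheet_name="Sales", index=False)

# Read a named sheet, keep only two columns, and stop after two data rows.
sales = pd.read_excel(path, sheet_name="Sales", usecols=["Name", "Amount"], nrows=2)
print(sales)
```

`sheet_name` also accepts a zero-based position such as `0`, which is handy when the sheet name is not known in advance.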

4. Write Excel file

Use the `to_excel()` method to write the contents of a DataFrame object to the specified Excel file.


   data.to_excel('new_filename.xlsx', index=False)

   `index=False` leaves the DataFrame index out of the output file.
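A single `to_excel()` call with a file name writes one sheet. To place several DataFrames in one file, a `pd.ExcelWriter` can be used as sketched below; the DataFrames, sheet names, and file name are invented for illustration.

```python
import os
import tempfile

import pandas as pd

orders = pd.DataFrame({"OrderID": [1, 2], "Total": [9.5, 20.0]})
customers = pd.DataFrame({"Customer": ["Ann", "Bob"]})

path = os.path.join(tempfile.mkdtemp(), "report_demo.xlsx")  # hypothetical file name

# Each to_excel() call inside the with-block adds one sheet;
# the file is saved when the block exits.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    orders.to_excel(writer, sheet_name="Orders", index=False)
    customers.to_excel(writer, sheet_name="Customers", index=False)

# sheet_name=None reads every sheet into a dict of {sheet name: DataFrame}.
sheets = pd.read_excel(path, sheet_name=None)
print(list(sheets))
```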

5. Create a worksheet

Create a new worksheet using the `create_sheet()` function.

   workbook = Workbook()
   worksheet = workbook.create_sheet('Sheet1')

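   In this example, we created a new sheet called 'Sheet1'. Note that a new `Workbook()` already contains one default sheet named 'Sheet', so the workbook now holds two sheets.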
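`create_sheet()` also takes an optional position, and `sheetnames` lists the sheets in order. A short sketch with made-up sheet names:

```python
from openpyxl import Workbook

workbook = Workbook()  # starts with one default sheet named "Sheet"

summary = workbook.create_sheet("Summary")  # appended after the existing sheets
raw = workbook.create_sheet("RawData", 0)   # inserted at position 0 (first)

print(workbook.sheetnames)
```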

6. Access worksheets

Use the `active` attribute to access the active sheet, or index the workbook by sheet name to access a specific sheet. The older `get_sheet_by_name()` method is deprecated in openpyxl in favor of `workbook['Sheet1']`.

   worksheet = workbook.active
   # or
   worksheet = workbook['Sheet1']

   The `active` attribute returns the currently active sheet, while `workbook['Sheet1']` returns the sheet with the specified name and raises a `KeyError` if no such sheet exists.
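For an existing file, the workbook is usually opened with `load_workbook()` first. The sketch below saves a small workbook to a temporary folder and opens it again; the file and sheet names are assumptions made for the example.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

path = os.path.join(tempfile.mkdtemp(), "access_demo.xlsx")  # hypothetical file name

# Build and save a workbook with two sheets.
wb = Workbook()
wb.active.title = "Main"  # rename the default sheet
wb.create_sheet("Extra")
wb.save(path)

# Reopen the file and access its sheets.
loaded = load_workbook(path)
main = loaded["Main"]            # access by name
titles = [ws.title for ws in loaded.worksheets]  # iterate over all sheets
print(titles, loaded.active.title)
```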

7. Read cell data

Use the `cell()` method to get the value of a specific cell. You need to provide the row number and the column number, both of which start at 1.


   cell_value = worksheet.cell(row=1, column=1).value

   In this example, we read the cell data in the first row and first column.
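Cells can also be addressed with Excel-style coordinates, and `iter_rows()` reads many cells at once. A minimal sketch on an in-memory sheet with invented data:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["Name", "Score"])  # row 1: header
ws.append(["Ann", 90])        # row 2
ws.append(["Bob", 85])        # row 3

header = ws["A1"].value                      # coordinate-style access
first_score = ws.cell(row=2, column=2).value # row/column access

# values_only=True yields plain tuples of values instead of Cell objects.
rows = list(ws.iter_rows(min_row=2, values_only=True))
print(header, first_score, rows)
```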

8. Write cell data

Use the `cell()` method with a `value` argument to write to a specific cell. As with reading, you need to provide the row and column numbers.

   
   worksheet.cell(row=1, column=1, value='Hello')

   In this example, the string 'Hello' is written to the cell in the first row and first column. The change exists only in memory until the workbook is saved with `workbook.save()`.
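Written values only reach disk once the workbook is saved. The following sketch writes a few cells in different ways, saves to a temporary file (the file name is an assumption), and reads the result back to confirm it.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

wb = Workbook()
ws = wb.active

ws["A1"] = "Item"                       # coordinate-style assignment
ws.cell(row=1, column=2, value="Qty")   # row/column assignment
ws.append(["Pen", 3])                   # adds a full row below the existing content

path = os.path.join(tempfile.mkdtemp(), "write_demo.xlsx")  # hypothetical file name
wb.save(path)  # nothing is written to disk before this call

check = load_workbook(path).active
print(check["A2"].value, check["B2"].value)
```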

9. Get the number of rows and columns

Use the `shape` attribute of a DataFrame to get the number of rows and columns of the data table.

   num_rows = data.shape[0]
   num_cols = data.shape[1]

   The `shape` attribute returns a tuple of the form (rows, columns). The header row is not counted as a data row.
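When working with an openpyxl worksheet instead of a DataFrame, the comparable attributes are `max_row` and `max_column`. A small sketch with invented data that contrasts the two:

```python
import pandas as pd
from openpyxl import Workbook

data = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
num_rows, num_cols = data.shape  # unpack the (rows, columns) tuple

# Copy the same table into a worksheet, header included.
ws = Workbook().active
ws.append(list(data.columns))
for row in data.values.tolist():
    ws.append(row)

# The worksheet counts the header as a row, so max_row is one larger.
print(num_rows, num_cols, ws.max_row, ws.max_column)
```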

10. Filter data

Filter rows with a boolean condition, for example by keeping only the rows where a column's value is greater than a given threshold.


    filtered_data = data[data['Column'] > 10]

    In this example, we keep only the rows whose 'Column' value is greater than 10.
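Conditions can be combined with `&` (and) and `|` (or), with each condition wrapped in parentheses. The sketch below uses made-up city and sales data to show a combined condition, `isin()`, and the equivalent `query()` form.

```python
import pandas as pd

data = pd.DataFrame(
    {"City": ["Oslo", "Rome", "Oslo", "Lima"], "Sales": [5, 12, 30, 18]}
)

# Both conditions must hold; the parentheses are required.
high_oslo = data[(data["Sales"] > 10) & (data["City"] == "Oslo")]

# Keep rows whose City is one of several values.
selected = data[data["City"].isin(["Rome", "Lima"])]

# The same combined filter written as a query string.
via_query = data.query("Sales > 10 and City == 'Oslo'")

print(high_oslo)
```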

11. Sort data

Use the `sort_values()` function to sort the data by the specified columns.


    sorted_data = data.sort_values(by='Column')

    In this example, we sort the data in ascending order by the column 'Column'.
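`sort_values()` returns a new DataFrame and also supports descending order and several sort keys. A brief sketch with hypothetical department and score columns:

```python
import pandas as pd

data = pd.DataFrame(
    {"Dept": ["B", "A", "B", "A"], "Score": [70, 85, 90, 60]}
)

# Highest score first.
by_score_desc = data.sort_values(by="Score", ascending=False)

# Dept ascending, then Score descending within each Dept;
# reset_index(drop=True) renumbers the rows after sorting.
by_dept_then_score = data.sort_values(
    by=["Dept", "Score"], ascending=[True, False]
).reset_index(drop=True)

print(by_dept_then_score)
```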

12. Add a new row

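Use the `pd.concat()` function to add new rows of data to a DataFrame object. The older `DataFrame.append()` method was deprecated in pandas 1.4 and removed in pandas 2.0, so it no longer works on current versions.

    new_data = pd.DataFrame({'A': [1], 'B': [2], 'C': [3]})
    data = pd.concat([data, new_data], ignore_index=True)

    In this example, we added a new row with the values 1, 2 and 3 in columns 'A', 'B' and 'C'. `ignore_index=True` renumbers the index of the combined result.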
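Since `DataFrame.append()` is not available in pandas 2.0 and later, rows are typically added with `.loc` or `pd.concat()`. The sketch below uses invented values and assumes the DataFrame has the default 0, 1, 2, ... index, which is what makes `len(data)` a safe new label.

```python
import pandas as pd

data = pd.DataFrame({"A": [1], "B": [2], "C": [3]})

# Single row: assign to the next integer label (assumes a default RangeIndex).
data.loc[len(data)] = [4, 5, 6]

# Several rows: build them as one DataFrame and concatenate once,
# which is cheaper than growing the table row by row.
more = pd.DataFrame([{"A": 7, "B": 8, "C": 9}, {"A": 10, "B": 11, "C": 12}])
data = pd.concat([data, more], ignore_index=True)

print(data)
```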

13. Delete row or column

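Use the `drop()` method to remove specific rows or columns. Rows are selected by index label and columns by name.

    data = data.drop(index=0)  # delete the row labeled 0 (the first row under the default index)
    data = data.drop(columns=['Column1', 'Column2'])  # delete the specified columns

    In this example, we delete the first row and the columns named 'Column1' and 'Column2'.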
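Because `drop()` works on labels, `index=0` only removes the first row when the index starts at 0. The sketch below uses a made-up DataFrame with a non-default index to show a position-based alternative and the `errors="ignore"` option.

```python
import pandas as pd

data = pd.DataFrame(
    {"Name": ["a", "b", "c"], "Tmp": [0, 0, 0], "Value": [3, 1, 2]},
    index=[10, 20, 30],  # labels deliberately do not start at 0
)

# Drop the first row by position: look up its label, then drop that label.
without_first = data.drop(index=data.index[0])

# Drop a column by name.
without_tmp = data.drop(columns=["Tmp"])

# errors="ignore" skips labels that do not exist instead of raising KeyError.
unchanged = data.drop(columns=["Missing"], errors="ignore")

print(without_first)
```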

14. Computing Summary Statistics

Use the `describe()` function to calculate basic statistics about the data, such as mean, standard deviation, etc.


    summary_stats = data.describe()

    In this example, we calculated basic statistics on the data.
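By default `describe()` summarizes only the numeric columns. Individual statistics are available as separate methods, as in this sketch with invented price and quantity data:

```python
import pandas as pd

data = pd.DataFrame(
    {"Price": [10.0, 20.0, 30.0], "Qty": [1, 2, 3], "Item": ["x", "y", "z"]}
)

summary_stats = data.describe()      # count, mean, std, min, quartiles, max
avg_price = data["Price"].mean()     # one statistic for one column
totals = data[["Price", "Qty"]].sum()  # column-wise totals

# agg() computes a chosen set of statistics in one call.
custom = data[["Price", "Qty"]].agg(["min", "max", "mean"])

print(summary_stats)
```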

These are common operations when working with Excel in Python. Depending on your needs, you can combine one or more of them to process and manipulate Excel files.

Summary

From reading and writing Excel files, through creating and accessing worksheets and reading and writing cell data, to filtering, sorting, and summary statistics, these operations cover the key steps of a typical data processing workflow. Processing Excel files with Python not only saves time but also makes the workflow easier to customize and repeat.

Keep in mind that this is only the tip of the iceberg. Python offers other libraries for Excel work, such as xlrd and xlwt (for the legacy `.xls` format) and xlsxwriter (for writing `.xlsx` files). Choosing among these tools according to the actual task will improve both the efficiency and the quality of your data processing.

Origin blog.csdn.net/weixin_43856625/article/details/132311647