Which Excel library is best for Python?

Life is short, so as a Python programmer, how can you operate Excel gracefully? Python has as many as seven commonly used libraries for operating Excel. Which one is better and more convenient to use? First, here is an overview of what each library does:

xlrd is a library for reading data and formatting information from Excel files. Versions before 2.0 read both .xls and .xlsx files; from 2.0 on, xlrd reads only .xls files. Official documentation: http://xlrd.readthedocs.io/en/latest/

xlwt is a library for writing data and formatting information to old Excel files (.xls). Official documentation: https://xlwt.readthedocs.io/en/latest/

xlutils is a library for processing Excel files that depends on xlrd and xlwt. It only supports operations on .xls files. Official documentation: http://xlutils.readthedocs.io/en/latest/

xlwings is simple, powerful, easy to use, and can replace VBA. It can read .xls files and read and write .xlsx files. Official documentation: http://docs.xlwings.org/en/stable/index.html

XlsxWriter is a module for writing the .xlsx file format. It cannot read or modify existing Excel files. Official documentation: https://xlsxwriter.readthedocs.io/

openpyxl is a library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files. Official documentation: https://openpyxl.readthedocs.io/en/stable/

pandas is a powerful module for data processing and analysis that can sometimes be used to automate Excel processing. Official documentation: http://pandas.pydata.org/

In addition, there is win32com. As the name suggests, it is tied to the Windows system. It ships with pywin32 and can read, write, and process Excel files. My computer is a Mac, so I won't go into it here. Reference: http://pythonexcels.com/python-excel-mini-cookbook/

DataNitro is, strictly speaking, an Excel plug-in, and it has to be downloaded separately from the official website. It also only supports Windows. Official documentation: https://datanitro.com/

For more details, you can also refer to: http://www.python-excel.org

Environment installation


All seven modules are third-party (non-standard) libraries, so each one needs to be installed from the command line with pip (or pip3, if that is what the command is called on your system):

pip install xlrd
pip install xlwt
pip install xlutils
pip install xlwings
pip install XlsxWriter
pip install openpyxl
pip install pandas

Notes: xlutils only supports .xls files, that is, Excel 2003 and earlier. If, after xlwings has installed successfully, running it reports the error "ImportError: no module named win32api", install the pypiwin32 or pywin32 package.

Module import

These modules are imported like any other module, with import. If a name is relatively long, you can use as to create an alias.

import xlrd
import xlwt
import xlwings as xw
import xlsxwriter
import openpyxl
import pandas as pd

The xlutils module is a bridge between xlrd and xlwt. Its core function is to copy the .xls object that xlrd has read into memory, so that the copy can then be modified through xlwt. In other words, xlutils converts an xlrd Book object into an xlwt Workbook object. In practice, the copy submodule is usually what gets imported:

import xlutils.copy

Document operation

Because of their different designs, the libraries differ in basic functions such as creating, modifying, and saving files. For example, xlsxwriter does not support opening or modifying existing files, and xlwings does not support naming a new file when it is created. Overall, xlwings and openpyxl are the two libraries that support the most Excel operations.

The xlutils library deserves a separate explanation. xlrd, xlwt, and xlutils are each limited on their own, but the three complement each other and together cover Excel file operations, especially for .xls files. xlwt can generate .xls files, xlrd can read existing .xls files, and xlutils connects the two so that users can read and write the same .xls file. Simply put, xlrd is responsible for reading, xlwt is responsible for writing, and xlutils is responsible for connecting the two.


Performance comparison

Basic write and read tests were run on several of the libraries. Each library was used to write and then read 5000 rows * 800 columns of data, the elapsed time was recorded, and the operations were repeated to obtain an average. The results will vary with computer configuration and environment, so the data is for reference only.
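The measurement procedure described above (repeat an operation several times and average the elapsed time) can be sketched roughly as follows. This is not the script used for the original test. The helper names average_seconds and build_rows are hypothetical, and the stand-in workload only builds a grid in memory. To time a real library, pass a function that performs that library's write or read instead.

```python
import time
from statistics import mean


def average_seconds(operation, repeats=3):
    """Run operation() `repeats` times and return the mean wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return mean(timings)


def build_rows(n_rows=5000, n_cols=800):
    """Stand-in workload: build an n_rows x n_cols grid of integers in memory."""
    return [[r * n_cols + c for c in range(n_cols)] for r in range(n_rows)]


if __name__ == "__main__":
    # A smaller grid is used here so the sketch finishes quickly.
    elapsed = average_seconds(lambda: build_rows(500, 80), repeats=3)
    print(f"average over 3 runs: {elapsed:.4f} s")
```

Using time.perf_counter rather than time.time is a deliberate choice here, since it is a monotonic, high-resolution clock intended for measuring short durations.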


Although openpyxl has powerful features for operating Excel, its default read and write performance is poor, and writing large tables in particular takes up a lot of memory. Enabling the read_only and write_only modes improves its performance greatly, especially for reading, where the time spent becomes almost negligible. pandas treats Excel as a container for reading and writing data in service of its data analysis features, so its read and write performance is only average, but its compatibility with Excel files is the best: it reads and writes both .xls and .xlsx files and can read a single sheet from a workbook. xlrd also supports this, but it only reads, cannot write, and its performance is not outstanding; it has to be combined with xlutils to modify Excel files. xlsxwriter has a single purpose, creating .xlsx files, and its write performance is moderate. All things considered, xlwings performs best. As its slogan says: xlwings, Make Excel Fly!

With the analysis above, you should now have a basic understanding of these libraries and can choose the Python-Excel module that fits your own needs and production environment. Some common code follows.

xlwings basic code

import xlwings as xw

# Connect to the Excel file
workbook = xw.Book('path of your excel file')
# Select the target cell on the sheet
data_range = workbook.sheets['Sheet1'].range('A1')
# Write data (a flat list fills cells A1:C1)
data_range.value = ['a', 'b', 'c']
# Save
workbook.save()

xlsxwriter basic code

import xlsxwriter as xw
# Create a new Excel file
workbook = xw.Workbook('path to your excel file')
# Add a worksheet
worksheet = workbook.add_worksheet()
# Write data
worksheet.write('A1', 'a')
# Close and save
workbook.close()

xlutils basic code

import xlrd  # read data
import xlwt  # write data
import xlutils.copy  # copy an xlrd Book into a writable xlwt Workbook

Read data through xlrd

#Open excel file
workbook = xlrd.open_workbook('path of your excel file')
#Get the first worksheet
worksheet = workbook.sheet_by_index(0)
#Read data
data = worksheet.cell_value(0,0)

Write data through xlwt

# Create a new Excel workbook
wb = xlwt.Workbook()
# Add a worksheet
sh = wb.add_sheet('Sheet1')
# Write data to row 0, column 0
sh.write(0, 0, 'abc')
# Save the file
wb.save('myexcel.xls')

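Modify data through xlutils

# Open the existing .xls file
book = xlrd.open_workbook('path of your excel file')
# Copy the xlrd Book into a writable xlwt Workbook
new_book = xlutils.copy.copy(book)
# Get the first worksheet
worksheet = new_book.get_sheet(0)
# Write data
worksheet.write(0, 0, 'mydata')
# Save (xlwt requires a target file name)
new_book.save('path to save your excel file')

openpyxl basic code

import openpyxl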

# Create a new file
workbook = openpyxl.Workbook()
# Write to the file
sheet = workbook.active
sheet['A1'] = 'A1'
# Save the file
workbook.save('your excel save path')
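The read_only and write_only modes mentioned in the performance comparison can be tried with a sketch like the one below. It assumes openpyxl is installed. The function names write_streaming and read_streaming, the sheet title, and the temporary file name are illustrative choices, not part of the original test.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def write_streaming(path, n_rows, n_cols):
    """Write rows one at a time using openpyxl's write-only mode."""
    workbook = Workbook(write_only=True)
    # A write-only workbook starts without sheets, so create one explicitly.
    sheet = workbook.create_sheet("Sheet1")
    for r in range(n_rows):
        sheet.append([r * n_cols + c for c in range(n_cols)])
    workbook.save(path)


def read_streaming(path):
    """Read all cell values back using openpyxl's read-only mode."""
    workbook = load_workbook(path, read_only=True)
    sheet = workbook["Sheet1"]
    rows = [list(row) for row in sheet.iter_rows(values_only=True)]
    workbook.close()  # read-only workbooks keep the file open until closed
    return rows


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
    write_streaming(path, n_rows=100, n_cols=10)
    rows = read_streaming(path)
    print(len(rows), len(rows[0]))
```

In write-only mode rows can only be appended in order, and in read-only mode cells cannot be modified, so these modes suit bulk export and import rather than editing an existing sheet in place.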

Origin blog.csdn.net/weixin_43214644/article/details/127815790