For convenience, I've collected more than 20 Python automation modules in one go.

As we all know, one of Python's greatest strengths is its huge number of third-party modules.

You can find a module for almost anything you want to do, and the number keeps growing at a dizzying pace!

Today I've roughly sorted out the commonly used Python automation modules, hoping to help my friends who work on automation and to make them easier to look up later.

First up is the processing of common office documents such as Excel, Word, PPT and PDF. If these operations can be automated, you can imagine how happy that would make you!

1. Python office automation modules

Excel automation

xlwings: xlwings is a versatile, efficient module that scores well across the board.

openpyxl: openpyxl is better suited to working with cell-level content and formatting, including formulas, pictures, comments and so on.

xlrd: xlrd is a Python extension for Excel that can only read data, not write it. Note that xlrd 2.0 and later read only the legacy .xls format.

xlwt: xlwt is the counterpart of xlrd. The difference is that xlwt can only write, not read.

xlutils: xlutils handles Excel data processing such as filtering. It needs to be used together with xlrd and xlwt.

xlsxwriter: xlsxwriter writes .xlsx files, supports the same data types as Excel, and has good compatibility. It cannot read or modify existing files.

Pandas: pandas is arguably the simplest and easiest-to-use Python module for processing Excel data. It is often used together with NumPy for data analysis and with charting libraries for visualization.

Marmir: Marmir converts Python data structures into spreadsheets and can produce them with minimal configuration.
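To make the openpyxl entry above a bit more concrete, here is a minimal sketch of the kind of cell-level work it describes: a bold header, a formula and a comment. The function name build_report, the sheet title and the sample rows are made up for illustration, and the workbook is kept in memory so nothing is written to disk.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook
from openpyxl.comments import Comment
from openpyxl.styles import Font


def build_report(rows):
    """Return an in-memory .xlsx file built from (item, quantity) rows.

    build_report is a hypothetical helper for this post, not part of openpyxl.
    """
    wb = Workbook()
    ws = wb.active
    ws.title = "Report"

    # Header row, formatted in bold.
    ws.append(["Item", "Quantity"])
    for cell in ws[1]:
        cell.font = Font(bold=True)

    for item, quantity in rows:
        ws.append([item, quantity])

    # openpyxl only stores the formula text; Excel computes it on open.
    last_row = ws.max_row
    total = ws.cell(row=last_row + 1, column=2, value=f"=SUM(B2:B{last_row})")
    total.comment = Comment("Sum of all quantities", "report-script")

    buffer = BytesIO()
    wb.save(buffer)
    buffer.seek(0)
    return buffer


report = build_report([("apples", 3), ("pears", 5)])
sheet = load_workbook(report)["Report"]
print(sheet["A1"].value, sheet["B4"].value)
```

To produce a real file, pass a path such as "report.xlsx" to wb.save() instead of the buffer.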

Word automation

python-docx: python-docx is a third-party library for reading and writing Word (.docx) files from Python, and it comes with official API documentation.

textract: textract extracts the content of many document types as plain text.

PDF automation

PyPDF2: PyPDF2 is a third-party module dedicated to working with PDF documents in Python. It makes it easy to read and write PDF files, encrypt and decrypt them, and add watermarks.

ReportLab: ReportLab is a powerful open source engine written in Python that can create complex PDF documents and vector graphics.

PDFminer: PDFminer is better suited to extracting and analyzing text, and can pinpoint the text of a particular line on a particular page.

PPT automation

python-pptx: python-pptx is a Python library for creating and updating PowerPoint (.pptx) files. A typical use is generating presentation-ready project status reports from database content.

2. Python peripheral automation modules

win32com: win32com lets Python call the underlying components of Microsoft Office on Windows, so it works only on Windows systems.

unoconv: unoconv is a command line tool for batch-converting documents, for example to PDF or Word format.

Tablib: Tablib is a format-independent tabular dataset library that supports tagging/filtering and seamless format import/export.

SnowNLP: SnowNLP is a relatively simple natural language processing module for Chinese text. It is often combined with the jieba Chinese word segmentation module for better results.

TextBlob: TextBlob is an open source text processing module that can perform many natural language processing tasks. It is less friendly for Chinese text because it mainly supports English, so jieba is the more reliable choice there.

TextGrocery: TextGrocery is a short-text classification tool based on the SVM algorithm. It has jieba word segmentation built in, which makes text classification easy.

NumPy: NumPy (Numerical Python) is an open source numerical computing extension for Python. It can store and process large matrices far more efficiently than Python's built-in list.
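To show what the NumPy entry above means in practice, here is a small sketch that computes column means twice: once with nested Python lists and once with a NumPy array. The function names and the sample data are made up for illustration, and no timing is claimed here; the point is only that the array version is a single vectorized call on a compact block of memory.

```python
import numpy as np


def column_means_list(rows):
    """Column means of a list of equal-length rows, in pure Python."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def column_means_numpy(matrix):
    """The same computation as one vectorized NumPy call."""
    return matrix.mean(axis=0)


rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
matrix = np.array(rows)

print(column_means_list(rows))     # [2.5, 3.5, 4.5]
print(column_means_numpy(matrix))  # same values, as a NumPy array
print(matrix.nbytes)               # 48: six float64 values, 8 bytes each
```

On large matrices the NumPy version avoids the per-element Python overhead of the list version, which is where the efficiency gain comes from.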


Origin blog.csdn.net/chengxuyuan_110/article/details/128924547