Python's xhtml2pdf: Detailed explanation of HTML to PDF tool example

        This article introduces XHTML2PDF, a powerful HTML-to-PDF tool in Python, and provides detailed examples to help readers get started quickly. Through this article, readers will learn how to install and configure XHTML2PDF, and how to use the tool to convert HTML files into high-quality PDF documents.

        In modern application development, we often need to convert HTML content into PDF documents. This requirement is very common in scenarios such as printing e-commerce receipts, generating reports, and exporting e-books. There are many powerful tool libraries in Python that can be used to achieve HTML to PDF conversion, and XHTML2PDF is one of the very popular choices. This article will introduce how to use XHTML2PDF in detail, and provide sample codes to help readers get started quickly.

1. Install the XHTML2PDF library

Before using XHTML2PDF, we first need to install this library. It can be installed by using pip, executing the following command on the command line:

$ pip install xhtml2pdf

2. Import XHTML2PDF library

After the installation is complete, we need to import the XHTML2PDF library in the Python script in order to use its functions. The import syntax is as follows:

from xhtml2pdf import pisa

3. Convert HTML to PDF

Next, we will learn how to convert HTML files to PDF documents using XHTML2PDF. Here is a simple sample code:

from io import BytesIO
from xhtml2pdf import pisa

def convert_html_to_pdf(html_string, output_path):
    pdf_file = open(output_path, "wb")
    pisa_status = pisa.CreatePDF(html_string, dest=pdf_file)
    pdf_file.close()

    if pisa_status.err:
        print(f"Error occurred while converting HTML to PDF: {pisa_status.err}")
    else:
        print("HTML converted to PDF successfully!")

# 读取HTML文件内容
with open("input.html", "r") as file:
    html_content = file.read()

# 调用转换函数
convert_html_to_pdf(html_content, "output.pdf")

In the above code, we define a convert_html_to_pdf function, which accepts two parameters: the content of the HTML file and the path of the output PDF file. In the function, we first open a PDF file in binary writing mode, and then realize the conversion from HTML to PDF by calling the pisa.CreatePDF function. After the conversion is complete, close the file.

4. Working with CSS styles

The XHTML2PDF library supports the parsing and rendering of CSS styles, which allows us to easily apply styles from HTML files to generated PDF documents. Here is a sample code:

def convert_html_to_pdf(html_string, output_path):
    pdf_file = open(output_path, "wb")
    pisa_status = pisa.CreatePDF(html_string, dest=pdf_file, encoding="UTF-8",
                                css=open("style.css", "r").read())

    ...

# 调用转换函数
convert_html_to_pdf(html_content, "output.pdf")

We can apply styles to a PDF document by passing the content of the CSS file as an argument to the css parameter of the pisa.CreatePDF function. In the example above, we passed the contents of the style.css file to the function as CSS styles.

5. Work with images and fonts

The XHTML2PDF library also supports adding images and fonts to generated PDF documents. Here is a sample code:

def convert_html_to_pdf(html_string, output_path):
    pdf_file = open(output_path, "wb")
    pisa_status = pisa.CreatePDF(html_string, dest=pdf_file, encoding="UTF-8",
                                css=open("style.css", "r").read(),
                                images_path="images/",
                                fonts_path="fonts/")

    ...

# 调用转换函数
convert_html_to_pdf(html_content, "output.pdf")

In the sample code above, we specified the paths of image files and font files by passing the images_path and fonts_path parameters to the pisa.CreatePDF function. In this example, we assume that image files are stored in the "images/" directory and font files are stored in the "fonts/" directory. In this way, you can easily insert images and use custom fonts in the generated PDF documents.

6. Error handling

When using XHTML2PDF, we need to pay attention to error handling. The XHTML2PDF library provides a PMLCreationError class to represent errors that may occur when converting HTML to PDF. Here is a sample code:

def convert_html_to_pdf(html_string, output_path):
    pdf_file = open(output_path, "wb")
    pisa_status = pisa.CreatePDF(html_string, dest=pdf_file)

    if pisa_status.err:
        print(f"Error occurred while converting HTML to PDF: {pisa_status.err}")
    else:
        print("HTML converted to PDF successfully!")

# 调用转换函数
convert_html_to_pdf(html_content, "output.pdf")

In the above code, we check the err property in the return value of the pisa.CreatePDF function. If the err attribute is not empty, it means that an error occurred during the conversion, and we can print the error message for debugging.

in conclusion:

Through this article, we introduced the XHTML2PDF library and how to use it in detail, and how to convert HTML files into high-quality PDF documents. We learned the steps to install and configure XHTML2PDF, and provided sample code to help readers get started quickly. Along the way, we also discussed how to work with CSS styles, insert images, and use custom fonts. Hope this article can help readers to convert HTML to PDF using XHTML2PDF in Python.

 

Guess you like

Origin blog.csdn.net/naer_chongya/article/details/131771930