Generating PDFs dynamically in Java with PDFBox

Apache PDFBox is a Java library for working with PDF documents. It provides many functions and methods to read, create, manipulate and extract the content of PDF documents.

Add the Maven dependency

<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.24</version>
</dependency>

Example: generating a PDF with PDFBox

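import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

// try-with-resources closes the document even if an exception is thrown
try (PDDocument document = new PDDocument()) {
    // Create an A4 page and add it to the document
    PDPage page = new PDPage(PDRectangle.A4);
    document.addPage(page);
    // Open a content stream for the page; it must be closed before saving
    try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
        // Set the font and font size
        contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);
        // Draw the text on the page
        contentStream.beginText();
        contentStream.newLineAtOffset(100, 700);
        contentStream.showText("Hello, World!");
        contentStream.endText();
    }
    // Save the PDF document
    document.save("output.pdf");
    System.out.println("PDF generated successfully!");
} catch (IOException e) {
    e.printStackTrace();
}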

Common classes and methods

PDDocument class

The PDFBox source code describes the PDDocument class as follows:

This is the in-memory representation of the PDF document.

In other words, in a Java program a PDDocument object is the PDF document, and every subsequent operation on the object is an operation on that document.

To create a brand-new PDF document (it contains no pages yet):

PDDocument document = new PDDocument();

To fill an existing PDF template with dynamic data, load the prepared template with PDDocument.load(). The examples here use Spring's ClassPathResource to locate the template on the classpath:

PDDocument document = PDDocument.load(new ClassPathResource("/static/reportTemplate.pdf").getInputStream());

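You can also load the template as a File, but the input stream is recommended: ClassPathResource.getFile() fails when the resource is packaged inside a JAR, whereas getInputStream() works in both cases.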

PDDocument document = PDDocument.load(new ClassPathResource("/static/reportTemplate.pdf").getFile());

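To open a password-protected PDF, use the PDDocument.load(InputStream input, String password) overload, which takes the password needed to decrypt the document (123456 in the example below). Note that this overload does not encrypt the generated PDF; to encrypt the output, apply a StandardProtectionPolicy with document.protect() before saving.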

PDDocument document = PDDocument.load(new ClassPathResource("/static/reportTemplate.pdf").getInputStream(),"123456");

PDDocument.load() has many other overloads, which are not listed here; see the PDFBox source code if you are interested. Once the document is complete, save it either to an output stream or to a file:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
document.save(baos); // save the document to an output stream

document.save("output.pdf"); // save the document to a file

After saving to a stream, you often need to send the file to the front end for download. The following example uses Spring's ResponseEntity:

// Convert the PDF to a byte array
byte[] pdfBytes = baos.toByteArray();

// Wrap the bytes in an InputStreamResource
ByteArrayInputStream bis = new ByteArrayInputStream(pdfBytes);
InputStreamResource resource = new InputStreamResource(bis);

// Set the HTTP response headers
HttpHeaders headers = new HttpHeaders();
headers.add(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=output.pdf");
headers.add(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_PDF_VALUE);
// Return a response entity carrying the PDF content
return ResponseEntity.ok()
        .headers(headers)
        .body(resource);

When you have finished working with the document, be sure to call document.close() to close it and release its resources.

document.close();

PDPage class

A PDPage represents a single page of the PDF document. To get the number of pages in the document:

int pageCount = document.getNumberOfPages();

To get a specific page:

PDPage page = document.getPage(0);

When working with a PDF template, use document.getPage(index) to obtain a specific page of the document and operate on it (the index starts at 0). You can also create a brand-new page with new PDPage():

PDPage newPage = new PDPage(PDRectangle.A4);

A page created with new PDPage() is not part of any document until you add it:

document.addPage(newPage);

However, addPage() always appends the page to the end of the document. To insert a page at a specific position, use the page tree instead:

PDPage page = document.getPage(1); // the second page (the index starts at 0)
PDPage newPage = new PDPage(PDRectangle.A4);
PDPageTree pages = document.getPages();
// Use one of the two calls below, not both, or the page is inserted twice
pages.insertAfter(newPage, page);     // insert after the second page
// pages.insertBefore(newPage, page); // or: insert before the second page

The total width and height of the page are needed later for positioning text. The origin of the page coordinate system is the lower-left corner, with y increasing upwards. So if you want an element to have a left margin of 10 and a top margin of 10, its coordinates are (10, pageHeight - 10).

float pageWidth = page.getMediaBox().getWidth();
float pageHeight = page.getMediaBox().getHeight();
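
To make the coordinate conversion concrete, here is a minimal sketch in plain Java (it uses no PDFBox types, so it can be run on its own). The class name PageCoordinates and the method fromTopLeft are hypothetical names chosen for illustration. The A4 height is derived from 297 mm at 72 units per inch, which should correspond to what page.getMediaBox().getHeight() reports for PDRectangle.A4.

```java
/** Hypothetical helper: converts a distance from the top-left corner into PDF page coordinates. */
final class PageCoordinates {
    // 1 inch = 25.4 mm = 72 PDF user-space units (points)
    static final float POINTS_PER_MM = 72f / 25.4f;

    /** Returns {x, y} in PDF coordinates, where the origin is the lower-left corner of the page. */
    static float[] fromTopLeft(float marginLeft, float marginTop, float pageHeight) {
        return new float[] { marginLeft, pageHeight - marginTop };
    }

    public static void main(String[] args) {
        float pageHeight = 297 * POINTS_PER_MM; // A4 height, roughly 841.89 units
        float[] point = fromTopLeft(10, 10, pageHeight);
        System.out.println("x = " + point[0] + ", y = " + point[1]);
    }
}
```

In real code the pageHeight argument would come from page.getMediaBox().getHeight() rather than from a constant.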

PDPageContentStream class

The PDPageContentStream class writes to a page's content stream. It is bound to a PDF document and to one specific page, so creating it is equivalent to opening the content stream of that page.

PDPageContentStream contentStream = new PDPageContentStream(document, page);

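If no PDPageContentStream.AppendMode is specified, the stream is opened in overwrite mode by default, so anything you add replaces the page's existing content streams. When working on a template whose content must be kept, pass the append mode explicitly (the last argument enables compression of the stream):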

PDPageContentStream contentStream = new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, true);

| Mode constant | Mode | Note |
| --- | --- | --- |
| PDPageContentStream.AppendMode.OVERWRITE | overwrite mode | Overwrites the existing page content streams |
| PDPageContentStream.AppendMode.APPEND | append mode | Appends the content stream after all existing page content streams |
| PDPageContentStream.AppendMode.PREPEND | prepend mode | Inserts the content stream before all other page content streams |

After the operation on the contentStream is completed, the content stream needs to be closed.

contentStream.close();

Writing content to the PDF

About fonts

In Apache PDFBox, font-related classes are mainly located under the org.apache.pdfbox.pdmodel.font package. Here are some commonly used font classes:

  1. PDType1Font: This class represents a Type 1 font, an outline-based font format. The standard PDF fonts, such as Helvetica, Times Roman, and Courier, are available through this class.

Example:

PDType1Font font = PDType1Font.HELVETICA_BOLD;

The standard 14 fonts are declared as constants in the PDType1Font source code:

public static final PDType1Font TIMES_ROMAN = new PDType1Font("Times-Roman");
public static final PDType1Font TIMES_BOLD = new PDType1Font("Times-Bold");
public static final PDType1Font TIMES_ITALIC = new PDType1Font("Times-Italic");
public static final PDType1Font TIMES_BOLD_ITALIC = new PDType1Font("Times-BoldItalic");
public static final PDType1Font HELVETICA = new PDType1Font("Helvetica");
public static final PDType1Font HELVETICA_BOLD = new PDType1Font("Helvetica-Bold");
public static final PDType1Font HELVETICA_OBLIQUE = new PDType1Font("Helvetica-Oblique");
public static final PDType1Font HELVETICA_BOLD_OBLIQUE = new PDType1Font("Helvetica-BoldOblique");
public static final PDType1Font COURIER = new PDType1Font("Courier");
public static final PDType1Font COURIER_BOLD = new PDType1Font("Courier-Bold");
public static final PDType1Font COURIER_OBLIQUE = new PDType1Font("Courier-Oblique");
public static final PDType1Font COURIER_BOLD_OBLIQUE = new PDType1Font("Courier-BoldOblique");
public static final PDType1Font SYMBOL = new PDType1Font("Symbol");
public static final PDType1Font ZAPF_DINGBATS = new PDType1Font("ZapfDingbats");

  2. PDTrueTypeFont: This class represents a TrueType font, which is also an outline-based font format. TrueType fonts are also common in PDFs.

// "path/to/font.ttf" is a placeholder; replace it with a real TrueType font file
PDTrueTypeFont font = PDTrueTypeFont.load(document, new File("path/to/font.ttf"), WinAnsiEncoding.INSTANCE);

  3. PDType0Font: This class represents a Type 0 font, a composite font format that can contain multiple subfonts. Type 0 fonts are usually used to support multiple languages and complex glyph requirements, and you can use this class to load your own font files. The standard fonts listed above cannot encode characters such as Chinese, so text in those scripts needs a font loaded this way.

PDType0Font font = PDType0Font.load(document, new ClassPathResource("/static/wryhRegular.ttf").getInputStream());

Write a single line of text

contentStream.setFont(PDType1Font.COURIER_BOLD_OBLIQUE, 16);
contentStream.beginText();
contentStream.newLineAtOffset(50, pageHeight - 50);
contentStream.showText("Test text");
contentStream.endText();

Before writing text, set the font and font size with contentStream.setFont(PDFont font, float fontSize). Then begin a text block with beginText() and position the text with newLineAtOffset(x, y). The offset is relative to the start of the current line, which is the page origin immediately after beginText(), so (50, pageHeight - 50) places the text near the upper-left corner, 50 units from the top and 50 units from the left. Finally, draw the text with showText(String text) and end the text block with endText().

Write multiple lines of text

contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);

// Starting coordinates of the text
float startX = 50;
float startY = page.getMediaBox().getHeight() - 50;

// Line spacing (leading)
float leading = 15;

// Lines of text to write
String[] lines = {
    "First line of text",
    "Second line of text",
    "Third line of text"
};

contentStream.beginText();
contentStream.newLineAtOffset(startX, startY);

for (String line : lines) {
    contentStream.showText(line);
    // Move down by one line, relative to the start of the current line
    contentStream.newLineAtOffset(0, -leading);
}

contentStream.endText();

Writing multiple lines works much like writing a single line: set the font and font size first, then determine the starting coordinates. The difference is that showText() and newLineAtOffset() are called repeatedly between beginText() and endText(), with each pass of the loop writing one line and moving the text position down by the leading.
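
The vertical positions produced by such a loop can be checked without PDFBox. The following plain-Java sketch computes the baseline y-coordinate of each line from a start position and a leading, and counts how many lines fit above a bottom margin before a new page would be needed. TextLayout, baselineY, and linesThatFit are hypothetical names introduced here for illustration; they are not part of PDFBox.

```java
/** Hypothetical helper for planning multi-line text; independent of PDFBox. */
final class TextLayout {
    /** Baseline y-coordinate of the line with the given zero-based index. */
    static float baselineY(float startY, float leading, int lineIndex) {
        return startY - lineIndex * leading;
    }

    /** Number of lines whose baseline stays at or above bottomMargin. */
    static int linesThatFit(float startY, float leading, float bottomMargin) {
        if (leading <= 0) {
            throw new IllegalArgumentException("leading must be positive");
        }
        if (startY < bottomMargin) {
            return 0;
        }
        // The first line sits at startY; each further line needs one more leading
        return (int) Math.floor((startY - bottomMargin) / leading) + 1;
    }

    public static void main(String[] args) {
        float startY = 791.89f; // for example, an A4 page height minus a 50-unit top margin
        float leading = 15f;
        for (int i = 0; i < 3; i++) {
            System.out.println("line " + i + " baseline y = " + baselineY(startY, leading, i));
        }
        System.out.println("lines that fit: " + linesThatFit(startY, leading, 50f));
    }
}
```

When the number of lines to write exceeds linesThatFit, one option is to close the current content stream, add a new page, and continue writing there.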

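Insert an image

// "path/to/image.jpg" is a placeholder for the real image file
PDImageXObject image = PDImageXObject.createFromFileByExtension(new File("path/to/image.jpg"), document);
// Native size of the image in pixels
float imageWidth = image.getWidth();
float imageHeight = image.getHeight();
// Lower-left corner of the image on the page (example values)
float x = 50;
float y = pageHeight - 50 - imageHeight;

try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
    contentStream.drawImage(image, x, y, imageWidth, imageHeight);
}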

PDImageXObject.createFromFileByExtension() loads the image file and creates a PDImageXObject; replace "path/to/image.jpg" with the path of the actual image file. This example draws the image at its native width and height, but you can pass any width and height you like. The drawImage(image, x, y, imageWidth, imageHeight) call writes the image to the page: x and y are the coordinates of the image's lower-left corner, and imageWidth and imageHeight are the width and height at which it is drawn.
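
An image's pixel dimensions are often larger than the page, so in practice the width and height usually need to be scaled before drawing. As a rough sketch in plain Java (no PDFBox types), the helper below shrinks a width and height to fit inside a bounding box while keeping the aspect ratio. ImageScaler and fitWithin are hypothetical names used only for this illustration.

```java
/** Hypothetical helper: scales image dimensions to fit a bounding box, keeping the aspect ratio. */
final class ImageScaler {
    /** Returns {width, height} scaled down to fit maxWidth x maxHeight; never scales up. */
    static float[] fitWithin(float imageWidth, float imageHeight, float maxWidth, float maxHeight) {
        if (imageWidth <= 0 || imageHeight <= 0) {
            throw new IllegalArgumentException("image dimensions must be positive");
        }
        // Use the smaller of the two ratios so that both sides fit; cap at 1 to avoid enlarging
        float scale = Math.min(1f, Math.min(maxWidth / imageWidth, maxHeight / imageHeight));
        return new float[] { imageWidth * scale, imageHeight * scale };
    }

    public static void main(String[] args) {
        // A 1200 x 800 pixel image placed in a 600 x 600 unit area
        float[] size = fitWithin(1200f, 800f, 600f, 600f);
        System.out.println("draw at " + size[0] + " x " + size[1]);
    }
}
```

The two returned values would then be passed to drawImage() in place of the native imageWidth and imageHeight.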

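Add a rectangle

// Set the border color
contentStream.setStrokingColor(new Color(213, 213, 213));
// Set the border width to 1
contentStream.setLineWidth(1);
// Add a 100 x 100 rectangle to the page content stream; (x, y) is its lower-left corner,
// so y = pageHeight - 150 puts the top edge 50 units below the top of the page
contentStream.addRect(50, pageHeight - 150, 100, 100);
// Draw the border of the rectangle
contentStream.stroke();
// Restore the stroking color to black so that later drawing operations are not affected
contentStream.setStrokingColor(Color.BLACK);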

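Common helper methods for text coordinate calculation

    /**
     * Returns the height of the font's bounding box at the given font size.
     */
    float getFontHeight(PDType0Font customFont, float fontSize) {
        // Font metrics are expressed in 1/1000 of a text-space unit
        return customFont.getFontDescriptor().getFontBoundingBox().getHeight() / 1000 * fontSize;
    }

    /**
     * Returns the width of the text at the given font size.
     * A rough estimate of fontSize * text.length() is only accurate when every glyph is
     * one em wide (as with full-width CJK characters), so the widths are read from the font.
     */
    float getTextWidth(PDFont font, String text, float fontSize) throws IOException {
        return font.getStringWidth(text) / 1000 * fontSize;
    }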
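
Given a text width from a helper such as getTextWidth, the x-coordinate for centered or right-aligned text is simple arithmetic. The sketch below shows that step in plain Java; TextAlignment, centeredX, and rightAlignedX are hypothetical names for illustration, and the page and text widths in main are made-up sample values.

```java
/** Hypothetical helper: computes the starting x-coordinate for aligned text. */
final class TextAlignment {
    /** x-coordinate at which text of the given width is horizontally centered on the page. */
    static float centeredX(float pageWidth, float textWidth) {
        return (pageWidth - textWidth) / 2f;
    }

    /** x-coordinate at which text of the given width ends marginRight units from the right edge. */
    static float rightAlignedX(float pageWidth, float textWidth, float marginRight) {
        return pageWidth - marginRight - textWidth;
    }

    public static void main(String[] args) {
        float pageWidth = 600f; // sample value; use page.getMediaBox().getWidth() in real code
        float textWidth = 100f; // sample value; measure the actual text in real code
        System.out.println("centered x = " + centeredX(pageWidth, textWidth));
        System.out.println("right-aligned x = " + rightAlignedX(pageWidth, textWidth, 50f));
    }
}
```

The result is what you would pass as the x argument of newLineAtOffset(x, y) right after beginText().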

Appendix

PDFBox Official Documentation (2.0.24)

Origin blog.csdn.net/weixin_44220970/article/details/131575826