Apache PDFBox Quick Development Guide

Author: chszs. Please credit the source when reprinting. Blog homepage: http://blog.csdn.net/chszs

1. Introduction

Apache PDFBox is an open source Java library for working with PDF documents. It can be used to create new PDF documents, modify existing ones, and extract content from them. Apache PDFBox also includes several command-line tools.
At the time of writing, the latest release of Apache PDFBox is version 1.8.2.

2. Features

Apache PDFBox mainly has the following features:
1) Text extraction: Extract text from PDF documents.
2) Merge & Split: You can merge multiple PDF documents into a single one, or split a single PDF into multiple PDF documents.
3) Form filling: You can extract data from PDF forms, or fill PDF forms.
4) PDF/A Verification: Verifies whether the PDF document meets the PDF/A ISO standard.
5) PDF printing: Send PDF documents to a printer using Java's printing API.
6) PDF conversion: PDF documents can be converted into image files.
7) PDF Creation: A new PDF document can be created from scratch.
8) Lucene integration: PDF documents can be indexed with the Lucene search engine.

3. Development practice

Since Apache PDFBox is a PDF library, the most representative example is creating a PDF document. The steps are as follows.

1. Create a Java project

In Eclipse, create a Java project named PDFboxDemo.

2. Download the PDFBox packages

1) pdfbox-1.8.2.jar
Address: http://archive.apache.org/dist/pdfbox/1.8.2/pdfbox-1.8.2.jar
Description: Covers general PDF operations.
2) pdfbox-app-1.8.2.jar
Address: http://archive.apache.org/dist/pdfbox/1.8.2/pdfbox-app-1.8.2.jar
Description: Bundles the PDFBox command-line tools.
3) fontbox-1.8.2.jar
Address: http://archive.apache.org/dist/pdfbox/1.8.2/fontbox-1.8.2.jar
Description: The font library used by PDFBox.
This example only needs items 1) and 3).
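
Once the jars are added to the project's build path, a quick way to confirm they are actually visible is to probe for one class from each. The sketch below uses only the JDK. The class name ClasspathCheck and the helper isOnClasspath are hypothetical, and org.apache.fontbox.ttf.TTFParser is assumed here as a representative FontBox class.

```java
// Sketch: confirm that the PDFBox and FontBox jars are visible on the classpath.
class ClasspathCheck {

    /** Returns true if the named class can be located (it is not initialized). */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // PDDocument is one of the classes imported by the example in this guide.
        System.out.println("pdfbox:  "
                + isOnClasspath("org.apache.pdfbox.pdmodel.PDDocument"));
        // Assumption: TTFParser is used only as a representative FontBox class.
        System.out.println("fontbox: "
                + isOnClasspath("org.apache.fontbox.ttf.TTFParser"));
    }
}
```

If either line prints false, the corresponding jar is probably missing from the build path.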

3. Create the class file

First create the chszs.pdf source package, and create the class file CreatePDF.java in this package.

package chszs.pdf;

// import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.exceptions.COSVisitorException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
//import org.apache.pdfbox.pdmodel.font.PDTrueTypeFont;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

public class CreatePDF {
    public static void main(String[] args) throws IOException {
        // Create an empty document and add one blank page to it
        PDDocument document = new PDDocument();
        PDPage page = new PDPage();
        document.addPage(page);

        // Disabled attempt to load a Chinese TrueType font
//      PDFont font = PDTrueTypeFont.loadTTF(document, new File("SIMSUN.TTC"));
        PDFont font = PDType1Font.HELVETICA_BOLD;

        // Write one line of text onto the page
        PDPageContentStream contentStream = new PDPageContentStream(document, page);
        contentStream.beginText();
        contentStream.setFont(font, 14);
        // Text position: x = 100, y = 700
        contentStream.moveTextPositionByAmount(100, 700);
        contentStream.drawString("Hello World");
//      contentStream.drawString("中文");
        contentStream.endText();

        // The content stream must be closed before the document is saved
        contentStream.close();

        try {
            document.save("E:/test.pdf");
        } catch (COSVisitorException e) {
            e.printStackTrace();
        }
        document.close();
    }
}
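
The x and y offsets passed to moveTextPositionByAmount (100 and 700 in this example) are in PDF user-space units, which by default are points (1/72 inch) measured from the bottom-left corner of the page. The sketch below shows the arithmetic for placing text relative to the top edge or in millimetres. The class PageCoords and its helpers are hypothetical, and the 612 x 792 page size assumes a US Letter page, which is what new PDPage() is believed to default to in 1.8.x.

```java
// Sketch: coordinate arithmetic for positioning text on a PDF page.
class PageCoords {
    static final float POINTS_PER_INCH = 72f;
    // Assumption: US Letter page, 8.5 x 11 inches expressed in points.
    static final float LETTER_WIDTH = 8.5f * POINTS_PER_INCH;   // 612
    static final float LETTER_HEIGHT = 11f * POINTS_PER_INCH;   // 792

    /** Converts millimetres to points (1 inch = 25.4 mm = 72 points). */
    static float mmToPoints(float mm) {
        return mm / 25.4f * POINTS_PER_INCH;
    }

    /** Converts a distance from the top edge into a PDF y coordinate. */
    static float yFromTop(float pageHeight, float offsetFromTop) {
        return pageHeight - offsetFromTop;
    }

    public static void main(String[] args) {
        // y = 700 on a 792-point page is 92 points below the top edge.
        System.out.println(yFromTop(LETTER_HEIGHT, 92f));
        // A 20 mm left margin expressed in points.
        System.out.println(mmToPoints(20f));
    }
}
```

Because the origin is at the bottom-left, a larger y value moves the text up the page, which is the opposite of most screen coordinate systems.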


Run the program; it creates the file test.pdf on drive E:.
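
The hard-coded E:/test.pdf path only works on a Windows machine that has an E: drive. As a minimal sketch, the output location could instead be derived from the user's home directory. The class OutputPath and the method inUserHome are hypothetical names, and passing the resulting path string to document.save(String) is the same call the example already uses.

```java
import java.io.File;

// Sketch: build an output path that does not depend on a specific drive letter.
class OutputPath {

    /** Returns a file with the given name inside the current user's home directory. */
    static File inUserHome(String fileName) {
        return new File(System.getProperty("user.home"), fileName);
    }

    public static void main(String[] args) {
        File target = inUserHome("test.pdf");
        // In the example above this would be used as: document.save(target.getPath());
        System.out.println(target.getPath());
    }
}
```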

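Summary: as of version 1.8.2, Apache PDFBox still does not support creating PDFs that contain Chinese text, which makes it much weaker than iText in this respect.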
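
Given this limitation, it can help to check a string before passing it to drawString with one of the built-in fonts such as PDType1Font.HELVETICA_BOLD. The sketch below is a rough pre-check using only the JDK. It treats ISO-8859-1 as a stand-in for the single-byte Latin encoding of the built-in fonts, which is an assumption, since the encoding PDFBox actually applies may differ in detail. The class LatinTextCheck and the method fitsLatin1 are hypothetical.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Sketch: rough test of whether text stays within a single-byte Latin encoding.
class LatinTextCheck {

    /** Returns true if every character of the text can be encoded as ISO-8859-1. */
    static boolean fitsLatin1(String text) {
        // Assumption: ISO-8859-1 approximates the built-in fonts' encoding.
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(fitsLatin1("Hello World"));
        // "\u4e2d\u6587" is the Chinese string from the commented-out line in the example.
        System.out.println(fitsLatin1("\u4e2d\u6587"));
    }
}
```

Text that fails this check would need a font and encoding that cover the characters, which is the part the summary above reports as unsupported in 1.8.2.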
