[PDFBox] Working with pictures in PDF documents: adding local pictures, adding network pictures, scaling pictures to fit the page, and centering pictures horizontally and vertically

This article shows how to use PDFBox to work with pictures in PDF documents: adding local pictures, adding network pictures, scaling a picture's width and height to fit the page, centering pictures horizontally and vertically, and reading pictures back out of a PDF.

Table of contents

1. PDFBox operation picture

1.1. Add local pictures

(1) Case code

(2) Operation effect

(3) Method introduction

1.2. Add network pictures

(1) Case code

(2) Operation effect

1.3. Image width and height adaptive (image scaling)

(1) Image scaling code

(2) Operation effect

1.4. Read pictures

(1) Case code

(2) Operation effect


1. PDFBox operation picture

PDFBox can add image objects to PDF documents. An image is represented by a PDImageXObject, and every operation on the content of a page goes through a PDPageContentStream (page content stream) object. PDFBox treats text, images, forms, and other content as streams, so adding, deleting, and modifying content is done through streams. This section first shows how to use PDFBox to add an image object to a PDF document.

1.1. Add local pictures

(1) Case code

To add a local picture, read the picture from the local disk and then draw it into the PDPageContentStream page content stream. The case code is as follows:

package pdfbox.demo.image;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

/**
 * @version 1.0.0
 * @Date: 2023/7/15 14:51
 * @Author ZhuYouBin
 * @Description: Working with pictures in PDFBox
 */
public class PDFBoxImageUtil {

    /**
     * Writes the picture at the given path into a new PDF file.
     * @param imgPath path of the picture
     * @param destPdf path of the PDF file to generate
     * @return the path of the generated PDF file
     */
    public static String generateImageToPdf(String imgPath, String destPdf) {
        // 1. Create the PDF document; try-with-resources closes it even if an exception is thrown
        try (PDDocument doc = new PDDocument()) {
            // 2. Create the page and add it to the document
            PDPage page = new PDPage(PDRectangle.A4);
            doc.addPage(page);
            // 3. Create the image object
            PDImageXObject image = PDImageXObject.createFromFile(imgPath, doc);
            // 4. Create the page content stream, naming the document and the page to draw on
            try (PDPageContentStream stream = new PDPageContentStream(doc, page)) {
                stream.drawImage(image, 10, 10); // draw the picture onto the PDF page
            } // the content stream is closed here, before the document is saved
            doc.save(destPdf); // save the PDF document
        } catch (Exception e) {
            e.printStackTrace();
        }
        return destPdf;
    }

    public static void main(String[] args) {
        String imgPath = "E:\\demo\\001.jpg";
        String destPdf = "E:\\demo\\img.pdf";
        generateImageToPdf(imgPath, destPdf);
    }
}

(2) Operation effect

(3) Method introduction

The PDImageXObject class provides several methods; the common ones are as follows:

  • createFromFile(imagePath, doc) method (static): creates an image object from a file on the local disk.
    • imagePath parameter: the path of the image.
    • doc parameter: the PDF document object.
  • getImage() method (instance): returns the image as a BufferedImage object.
  • getSuffix() method (instance): returns the suffix of the image format, for example jpg or png.

1.2. Add network pictures

PDFBox does not provide a method that reads a picture directly from a network address, but the same result can be achieved in two steps:

  • Step 1: Use the URL object to download the network image to the local disk.
  • Step 2: Use the createFromFile() method to read the network image just downloaded from the local disk.
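As an alternative to the temporary file used in the steps above, the downloaded bytes can be kept in memory. The following is only a minimal sketch, not the method used in this article: ImageBytes and readAll are hypothetical names, and the example reads from an in-memory stream so that it runs without network access. With a recent PDFBox 2.x release, the resulting byte array could then be handed to PDImageXObject.createFromByteArray(doc, bytes, name), which determines the image type from the file content.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ImageBytes {

    /** Reads the whole stream into a byte array (also works on Java 8). */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int len;
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for new URL(imageUrl).openStream(); these are the first four bytes of a PNG file
        byte[] fake = {(byte) 0x89, 'P', 'N', 'G'};
        try (InputStream in = new ByteArrayInputStream(fake)) {
            byte[] bytes = readAll(in);
            System.out.println(bytes.length);
            // Assumption: PDFBox 2.x on the classpath and an open PDDocument named doc:
            // PDImageXObject image = PDImageXObject.createFromByteArray(doc, bytes, "image");
        }
    }
}
```

Keeping the bytes in memory avoids having to clean up a temporary file, at the cost of holding the whole picture in the heap.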

(1) Case code

package pdfbox.demo.image;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.UUID;

/**
 * @version 1.0.0
 * @Date: 2023/7/15 15:01
 * @Author ZhuYouBin
 * @Description: Working with pictures in PDFBox: adding a network picture to a PDF document
 */
public class PDFBoxImageUtil {

    /**
     * Writes the picture at the given path or URL into a new PDF file.
     * @param imgPath local path or http(s) URL of the picture
     * @param destPdf path of the PDF file to generate
     * @return the path of the generated PDF file
     */
    public static String generateImageToPdf(String imgPath, String destPdf) {
        String tempPath = null; // set only when a network picture was downloaded
        // 1. Create the PDF document; try-with-resources closes it even if an exception is thrown
        try (PDDocument doc = new PDDocument()) {
            // 2. Create the page and add it to the document
            PDPage page = new PDPage(PDRectangle.A4);
            doc.addPage(page);
            // 3. Create the image object
            PDImageXObject image;
            if (imgPath.startsWith("http://") || imgPath.startsWith("https://")) {
                tempPath = downloadImage(imgPath, null);
                image = PDImageXObject.createFromFile(tempPath, doc);
            } else {
                image = PDImageXObject.createFromFile(imgPath, doc);
            }
            // 4. Create the page content stream, naming the document and the page to draw on
            try (PDPageContentStream stream = new PDPageContentStream(doc, page)) {
                stream.drawImage(image, 10, 10); // draw the picture onto the PDF page
            }
            doc.save(destPdf); // save the PDF document
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Delete the local temporary file once the picture is no longer needed
            if (tempPath != null) {
                new File(tempPath).delete();
            }
        }
        return destPdf;
    }

    /**
     * Downloads a network picture to the local disk.
     * @param imgPath URL of the network picture
     * @param fileName file name without extension; a random name is used if null or blank
     * @return the temporary local path of the downloaded picture
     * @throws IOException if the directory cannot be created or the download fails
     */
    public static String downloadImage(String imgPath, String fileName) throws IOException {
        URLConnection conn = new URL(imgPath).openConnection();
        String contentType = conn.getContentType(); // null if the server sends no Content-Type
        // Create a temporary directory to hold the picture
        File dir = new File("temp");
        if (!dir.exists() && !dir.mkdirs()) {
            throw new IOException("Failed to create the temporary directory");
        }
        if (fileName == null || fileName.trim().isEmpty()) {
            fileName = UUID.randomUUID().toString();
        }
        // Choose the file extension from the content type (switching on null would throw)
        switch (contentType == null ? "" : contentType) {
            case "image/jpeg": fileName += ".jpeg"; break;
            case "image/gif": fileName += ".gif"; break;
            case "image/webp": // kept from the original; the JDK has no built-in WebP reader, so such files may fail to load
            case "image/png": fileName += ".png"; break;
        }
        fileName = dir.getAbsolutePath() + File.separator + fileName;
        // Download the file into the local temporary directory; both streams are closed automatically
        try (InputStream is = conn.getInputStream();
             FileOutputStream fos = new FileOutputStream(fileName)) {
            byte[] data = new byte[1024];
            int len;
            while ((len = is.read(data)) != -1) {
                fos.write(data, 0, len);
            }
        }
        return fileName;
    }

    public static void main(String[] args) {
        String imgPath = "https://www.toopic.cn/public/uploads/small/1658043938262165804393852.jpg";
        String destPdf = "E:\\demo\\img.pdf";
        generateImageToPdf(imgPath, destPdf);
    }
}
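
The extension chosen in downloadImage depends on the Content-Type header, which may be missing or may carry parameters such as a charset. A more defensive mapping could look like the following minimal sketch. ContentTypes and extensionFor are hypothetical names, and returning an empty extension for unknown types is an assumption of this sketch rather than something PDFBox requires.

```java
import java.util.Locale;

final class ContentTypes {

    /** Maps a Content-Type header value to a file extension, or "" when the type is unknown. */
    static String extensionFor(String contentType) {
        if (contentType == null) {
            return ""; // the server sent no Content-Type header
        }
        // Drop parameters such as "; charset=UTF-8" and normalize the case
        int semicolon = contentType.indexOf(';');
        String mime = (semicolon >= 0 ? contentType.substring(0, semicolon) : contentType)
                .trim().toLowerCase(Locale.ROOT);
        switch (mime) {
            case "image/jpeg": return ".jpg";
            case "image/png":  return ".png";
            case "image/gif":  return ".gif";
            case "image/bmp":  return ".bmp";
            default:           return "";
        }
    }

    public static void main(String[] args) {
        System.out.println(extensionFor("image/jpeg"));
        System.out.println(extensionFor("IMAGE/PNG; charset=UTF-8"));
        System.out.println(extensionFor(null).isEmpty());
    }
}
```

In this sketch image/webp is deliberately left unmapped instead of being renamed to .png, because a renamed file still contains WebP data.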

(2) Operation effect

1.3. Image width and height adaptive (image scaling)

The previous examples add a picture at its original size, so when the picture is larger than the page, the part that extends beyond the page is cut off. This can be solved by scaling the picture, as follows:

  • Step 1: Obtain the actual width and height of the image. The JDK reports them in pixels [px], which need to be converted to points [pt]. Conversion rule: 1 px = 3/4 pt (that is, 1 pt = 4/3 px), assuming the common resolution of 96 pixels per inch.
  • Step 2: Obtain the width and height of the PDF document (the width and height obtained in PDFBox use [pt] as the unit).
  • Step 3: Compare the actual width and height of the image with the width and height of the PDF document to calculate the zoom ratio.
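
The arithmetic behind these three steps can be sketched on its own, without PDFBox. ImageFit, pxToPt, and fit are hypothetical names, the 96 pixels per inch in the conversion is an assumption, and the A4 size is rounded to 595 x 842 pt for readability.

```java
final class ImageFit {

    /** Converts pixels to points, assuming 96 pixels per inch (1 px = 0.75 pt). */
    static float pxToPt(int px) {
        return px * 0.75f;
    }

    /**
     * Scales (width, height) to fit inside (maxWidth, maxHeight) while keeping the aspect ratio.
     * Returns {scaledWidth, scaledHeight}.
     */
    static float[] fit(float width, float height, float maxWidth, float maxHeight) {
        // The smaller of the two ratios guarantees that neither side exceeds its limit
        float scale = Math.min(maxWidth / width, maxHeight / height);
        return new float[]{width * scale, height * scale};
    }

    public static void main(String[] args) {
        // A 1600 x 1200 px picture on a page of about 595 x 842 pt, leaving 30 pt free on each side
        float[] size = fit(pxToPt(1600), pxToPt(1200), 595f - 60, 842f - 60);
        System.out.println(size[0] + " x " + size[1]);
    }
}
```

Taking the minimum of the width ratio and the height ratio handles wide and tall pictures with the same formula, so no separate branch per orientation is needed.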

(1) Image scaling code

package pdfbox.demo.image;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.UUID;

/**
 * @version 1.0.0
 * @Date: 2023/7/15 15:11
 * @Author ZhuYouBin
 * @Description: Working with pictures in PDFBox: scaling the picture width and height automatically
 */
public class PDFBoxImageUtil {

    /**
     * Writes the picture at the given path or URL into a new PDF file, scaled to fit the page.
     *
     * @param imgPath local path or http(s) URL of the picture
     * @param destPdf path of the PDF file to generate
     * @return true if the PDF file was generated, false otherwise
     */
    public static boolean generateImageToPdf(String imgPath, String destPdf) {
        String tempPath = null; // set only when a network picture was downloaded
        // 1. Create the PDF document; try-with-resources closes it even if an exception is thrown
        try (PDDocument doc = new PDDocument()) {
            // 2. Create the page and add it to the document
            PDPage page = new PDPage(PDRectangle.A4);
            doc.addPage(page);
            // 3. Create the image object, downloading network pictures first
            String localPath = imgPath;
            if (imgPath.startsWith("http://") || imgPath.startsWith("https://")) {
                tempPath = downloadImage(imgPath, null);
                localPath = tempPath;
            }
            PDImageXObject image = PDImageXObject.createFromFile(localPath, doc);
            // Compute the position and the scaled width and height of the picture
            float[] imageWH = getImageWH(localPath, page.getMediaBox());
            // 4. Create the page content stream, naming the document and the page to draw on
            try (PDPageContentStream stream = new PDPageContentStream(doc, page)) {
                stream.drawImage(image, imageWH[0], imageWH[1], imageWH[2], imageWH[3]); // draw the picture onto the PDF page
            }
            doc.save(destPdf); // save the PDF document
            return true;
        } catch (Exception e) {
            e.printStackTrace();
            return false;
        } finally {
            // Delete the local temporary file once the picture is no longer needed
            if (tempPath != null) {
                new File(tempPath).delete();
            }
        }
    }

    /**
     * Computes the position and the scaled width and height of a picture, in points [pt].
     *
     * @param imgPath local path or http(s) URL of the picture
     * @param box     rectangle of the PDF page, which provides the page width and height
     * @return {x, y, width, height} of the scaled picture, or all zeros on failure
     */
    public static float[] getImageWH(String imgPath, PDRectangle box) {
        // Decide whether this is a network picture before opening any stream
        boolean isUrl = imgPath.startsWith("http://") || imgPath.startsWith("https://");
        try (InputStream is = isUrl ? new URL(imgPath).openStream() : new FileInputStream(imgPath)) {
            BufferedImage bi = ImageIO.read(is);
            // Convert px to pt: 1 px = 3/4 pt at 96 pixels per inch
            float width = bi.getWidth() * 3f / 4;
            float height = bi.getHeight() * 3f / 4;
            float pw = box.getWidth() - 60; // the 60 is optional; it only leaves some blank margin
            float ph = box.getHeight() - 60; // the 60 is optional; it only leaves some blank margin
            // Scale ratio: the smaller of the two ratios keeps both sides inside the page
            float scale = Math.min(pw / width, ph / height);
            width = width * scale;
            height = height * scale;
            // Compute where the picture is shown on the X and Y axes
            float xAxis = (box.getWidth() - width) / 2; // centered horizontally
            // float yAxis = box.getHeight() - height - 10; // alternative: 10 pt below the top of the page
            float yAxis = (box.getHeight() - height) / 2; // centered vertically
            return new float[]{xAxis, yAxis, width, height};
        } catch (Exception e) {
            e.printStackTrace();
        }
        return new float[]{0, 0, 0, 0};
    }

    /**
     * Downloads a network picture to the local disk.
     * @param imgPath URL of the network picture
     * @param fileName file name without extension; a random name is used if null or blank
     * @return the temporary local path of the downloaded picture
     * @throws IOException if the directory cannot be created or the download fails
     */
    public static String downloadImage(String imgPath, String fileName) throws IOException {
        URLConnection conn = new URL(imgPath).openConnection();
        String contentType = conn.getContentType(); // null if the server sends no Content-Type
        // Create a temporary directory to hold the picture
        File dir = new File("temp");
        if (!dir.exists() && !dir.mkdirs()) {
            throw new IOException("Failed to create the temporary directory");
        }
        if (fileName == null || fileName.trim().isEmpty()) {
            fileName = UUID.randomUUID().toString().replaceAll("-", "");
        }
        // Choose the file extension from the content type (switching on null would throw)
        switch (contentType == null ? "" : contentType) {
            case "image/jpeg": fileName += ".jpeg"; break;
            case "image/gif": fileName += ".gif"; break;
            case "image/webp": // kept from the original; the JDK has no built-in WebP reader, so such files may fail to load
            case "image/png": fileName += ".png"; break;
        }
        fileName = dir.getAbsolutePath() + File.separator + fileName;
        // Download the file into the local temporary directory; both streams are closed automatically
        try (InputStream is = conn.getInputStream();
             FileOutputStream fos = new FileOutputStream(fileName)) {
            byte[] data = new byte[1024];
            int len;
            while ((len = is.read(data)) != -1) {
                fos.write(data, 0, len);
            }
        }
        return fileName;
    }

    public static void main(String[] args) {
        String imgPath = "https://www.toopic.cn/public/uploads/small/1658043938262165804393852.jpg";
        String destPdf = "E:\\demo\\img.pdf";
        generateImageToPdf(imgPath, destPdf);
    }
}
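
The x and y values passed to drawImage give the lower-left corner of the picture, measured from the lower-left corner of the page, which is why top alignment subtracts the picture height from the page height. The position arithmetic from getImageWH can be sketched separately; ImagePlacement, centered, and topAligned are hypothetical names, and the page size is again rounded to 595 x 842 pt.

```java
final class ImagePlacement {

    /** Returns {x, y} that centers a picture of the given size on the page. */
    static float[] centered(float pageW, float pageH, float imgW, float imgH) {
        return new float[]{(pageW - imgW) / 2, (pageH - imgH) / 2};
    }

    /** Returns {x, y} that centers the picture horizontally and keeps topMargin points free above it. */
    static float[] topAligned(float pageW, float pageH, float imgW, float imgH, float topMargin) {
        return new float[]{(pageW - imgW) / 2, pageH - imgH - topMargin};
    }

    public static void main(String[] args) {
        float[] center = centered(595f, 842f, 535f, 401.25f);
        float[] top = topAligned(595f, 842f, 535f, 401.25f, 10f);
        System.out.println(center[0] + ", " + center[1]);
        System.out.println(top[0] + ", " + top[1]);
    }
}
```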

(2) Operation effect

1.4. Read pictures

PDFBox can also read pictures from a PDF document and save them to the local disk. To save a picture, use the ImageIO class provided by the JDK: its write() method writes an image object to a File.
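
Two details are worth considering when saving extracted pictures: a picture may have an alpha channel, which the JPEG format cannot store, and a page may contain more than one picture. The following minimal sketch picks the output format from the picture itself and builds a distinct file name per picture. ImageFormats, formatFor, and fileNameFor are hypothetical names, and the naming scheme is only an illustration.

```java
import java.awt.image.BufferedImage;

final class ImageFormats {

    /** Chooses "png" for pictures with an alpha channel and "jpg" otherwise. */
    static String formatFor(BufferedImage image) {
        return image.getColorModel().hasAlpha() ? "png" : "jpg";
    }

    /** Builds a file name such as "page0-image2.png" (hypothetical naming scheme). */
    static String fileNameFor(int pageIndex, int imageIndex, BufferedImage image) {
        return "page" + pageIndex + "-image" + imageIndex + "." + formatFor(image);
    }

    public static void main(String[] args) {
        BufferedImage opaque = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage translucent = new BufferedImage(4, 4, BufferedImage.TYPE_INT_ARGB);
        System.out.println(fileNameFor(0, 0, opaque));
        System.out.println(fileNameFor(0, 1, translucent));
    }
}
```

The chosen format string can be passed as the second argument of ImageIO.write(image, format, file), whose boolean return value reports whether a suitable writer was found.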

(1) Case code

package pdfbox.demo.image;

import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;

/**
 * @version 1.0.0
 * @Date: 2023/7/15 15:11
 * @Author ZhuYouBin
 * @Description: Working with pictures in PDFBox: reading pictures from a PDF document and saving them locally
 */
public class PDFBoxImageUtil {

    /**
     * Reads the pictures on one page of the given PDF document and saves them to the local disk.
     * <p>
     *     Pictures in a PDF document are stored as image XObject streams in the page resources.
     *     PDFBox decodes such a stream into a BufferedImage, which can then be written to a file.
     * </p>
     * @param pdfPath path of the PDF document
     * @param imagePath path and name of the picture file to generate
     * @param pageNum zero-based index of the page to read pictures from
     * @return the local path of the extracted picture
     */
    public static String readerImageFromPdf(String pdfPath, String imagePath, int pageNum) {
        // 1. Load the PDF document (PDFBox 2.x API; PDFBox 3.x uses Loader.loadPDF instead)
        try (PDDocument doc = PDDocument.load(new File(pdfPath))) {
            // 2. Find the requested page; do nothing if the index is out of range
            if (pageNum < 0 || pageNum >= doc.getNumberOfPages()) {
                return imagePath;
            }
            PDPage page = doc.getPage(pageNum);
            // Get the resources of the page; a page may have none
            PDResources resources = page.getResources();
            if (resources == null) {
                return imagePath;
            }
            // Walk through the XObjects of the page and pick out the pictures
            for (COSName cosName : resources.getXObjectNames()) {
                PDXObject pdxObject = resources.getXObject(cosName);
                // Check whether this XObject is a picture
                if (pdxObject instanceof PDImageXObject) {
                    // Decode the picture
                    BufferedImage image = ((PDImageXObject) pdxObject).getImage();
                    // Save it to the local disk. Every picture goes to the same path, so the last one wins.
                    // JPEG cannot store an alpha channel, so writing such pictures may fail.
                    ImageIO.write(image, "JPEG", new File(imagePath));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return imagePath;
    }

    public static void main(String[] args) {
        String imgPath = "E:\\img\\002.jpg";
        String destPdf = "E:\\demo\\img.pdf";
        readerImageFromPdf(destPdf, imgPath, 0);
    }
}

(2) Operation effect

This concludes the introduction to working with pictures in PDFBox. The article covered adding local pictures, adding network pictures, scaling the picture width and height to fit the page, centering pictures horizontally and vertically, and reading pictures from a PDF document.

Origin blog.csdn.net/qq_39826207/article/details/131739142