Using pdf-lib to Process PDFs in Node.js

The pdf-lib npm module is a great tool for creating and editing PDFs with Node.js. Puppeteer is excellent at generating PDFs from HTML, but unfortunately, in my experience, browser support for CSS print layout is not very good. The pdf-lib module gives you fine-grained control over PDFs. You can use it to merge PDFs, add page numbers, add watermarks, split PDFs, and do just about anything else you might otherwise use the ILovePDF API for.

Getting Started

Let's use pdf-lib to create a simple PDF document. It has a single page, with the Mastering JS logo displayed in the center of the page.

const { PDFDocument } = require('pdf-lib');
const fs = require('fs');

run().catch(err => console.log(err));

async function run() {
  // Create a new document and add a new page
  const doc = await PDFDocument.create();
  const page = doc.addPage();

  // Load the image and store it as a Node.js buffer in memory
  let img = fs.readFileSync('./logo.png');
  img = await doc.embedPng(img);

  // Draw the image on the center of the page
  const { width, height } = img.scale(1);
  page.drawImage(img, {
    x: page.getWidth() / 2 - width / 2,
    y: page.getHeight() / 2 - height / 2
  });

  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await doc.save());
}

Running the above script generates the following PDF. Using pdf-lib is fairly straightforward, with just a few gotchas: note that PDFDocument#embedPng() and PDFDocument#save() return promises, so you need to use await with them.

A simple PDF
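The centering arithmetic above assumes the logo is smaller than the page. As a minimal sketch, assuming you want the image shrunk or grown to fit inside a margin, the helper below uses the image's scaleToFit() method before centering. The name drawImageCentered and the default margin of 50 are my own choices for illustration, not part of pdf-lib.

```javascript
// Hypothetical helper (not part of pdf-lib): scale an embedded image so it
// fits inside the page minus a margin, then draw it centered.
// `page` is a PDFPage; `img` is the value returned by doc.embedPng().
function drawImageCentered(page, img, margin = 50) {
  // scaleToFit() keeps the aspect ratio and returns { width, height }
  const { width, height } = img.scaleToFit(
    page.getWidth() - 2 * margin,
    page.getHeight() - 2 * margin
  );

  // The origin is the bottom-left corner of the page, and (x, y) is the
  // bottom-left corner of the image
  const options = {
    x: (page.getWidth() - width) / 2,
    y: (page.getHeight() - height) / 2,
    width,
    height
  };
  page.drawImage(img, options);
  return options;
}
```

In the script above, a call to drawImageCentered(page, img) would take the place of the img.scale(1) and page.drawImage() lines.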

Merge PDF

The killer feature of pdf-lib is that you can modify existing PDFs, not just create new ones. For example, suppose you have two PDFs: one contains the cover of an ebook, and the other contains the ebook's content. How do you merge the two? I used the ILovePDF API for my last ebook (Mastering Async/Await), but pdf-lib makes this task easy in Node.js.

Given two PDF files, cover.pdf and page-30-31.pdf, the following script uses pdf-lib to merge them into a single test.pdf file.

const { PDFDocument } = require('pdf-lib');
const fs = require('fs');

run().catch(err => console.log(err));

async function run() {
  // Load cover and content pdfs
  const cover = await PDFDocument.load(fs.readFileSync('./cover.pdf'));
  const content = await PDFDocument.load(fs.readFileSync('./page-30-31.pdf'));

  // Create a new document
  const doc = await PDFDocument.create();

  // Add the cover to the new doc
  const [coverPage] = await doc.copyPages(cover, [0]);
  doc.addPage(coverPage);

  // Add individual content pages
  const contentPages = await doc.copyPages(content, content.getPageIndices());
  for (const page of contentPages) {
    doc.addPage(page);
  }

  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await doc.save());
}

The effect after merging can be seen in the figure below.

Merge PDF
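The copy-then-add pattern in the merge script can be wrapped in a small helper. This is only a sketch: appendPages is a name I made up, and it takes already-loaded PDFDocument instances as arguments rather than loading files itself.

```javascript
// Hypothetical helper (not part of pdf-lib): copy pages from `source` into
// `target`. Both arguments are PDFDocument instances; `indices` defaults to
// every page in `source`.
async function appendPages(target, source, indices = source.getPageIndices()) {
  // Pages belong to the document that created them, so they must be copied
  // into `target` before they can be added to it
  const copied = await target.copyPages(source, indices);
  for (const page of copied) {
    target.addPage(page);
  }
  // Return how many pages were added
  return copied.length;
}
```

With this helper, the merge script reduces to await appendPages(doc, cover, [0]) followed by await appendPages(doc, content). Passing only some indices into a freshly created document, for example await appendPages(doc, content, [0]), is one way to split a PDF.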

Add page number

One of the biggest difficulties in using Puppeteer to generate PDFs from HTML is how painful it is to add page numbers. Although adding page numbers seems simple, CSS print layout does not handle it correctly. You can take a look at a for loop I wrote that uses hard-coded pixel offsets to make the page numbers display correctly.

For example, here is a PDF of the first four pages of Mastering Async/Await without page numbers: ./content.pdf. The following script adds a page number to every page in the PDF.

const { PDFDocument, StandardFonts, rgb } = require('pdf-lib');
const fs = require('fs');

run().catch(err => console.log(err));

async function run() {
  const content = await PDFDocument.load(fs.readFileSync('./content.pdf'));

  // Add a font to the doc
  const helveticaFont = await content.embedFont(StandardFonts.Helvetica);

  // Draw a number at the bottom of each page.
  // Note that the bottom of the page is `y = 0`, not the top
  const pages = content.getPages();
  for (const [i, page] of pages.entries()) {
    page.drawText(`${+i + 1}`, {
      x: page.getWidth() / 2,
      y: 10,
      size: 15,
      font: helveticaFont,
      color: rgb(0, 0, 0)
    });
  }

  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await content.save());
}

The effect after adding the page numbers can be seen in the figure below.

Add page number
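One detail in the script above: drawText() treats x as the left edge of the text, so x: page.getWidth() / 2 puts the number slightly right of center. A minimal sketch of one way to correct that, assuming the embedded font's widthOfTextAtSize() method is used to measure the text, is shown below. The helper name drawCenteredFooter and its option defaults are hypothetical.

```javascript
// Hypothetical helper (not part of pdf-lib): draw `label` horizontally
// centered near the bottom edge of `page`.
// `font` is the value returned by embedFont().
function drawCenteredFooter(page, font, label, { size = 15, y = 10 } = {}) {
  // Measure the rendered width, then shift left by half of it so the
  // middle of the text lands on the middle of the page
  const textWidth = font.widthOfTextAtSize(label, size);
  const x = (page.getWidth() - textWidth) / 2;

  // `y` is measured from the bottom of the page
  page.drawText(label, { x, y, size, font });
  return x;
}
```

Inside the loop, the page.drawText() call could then become drawCenteredFooter(page, helveticaFont, `${+i + 1}`).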

Moving On

The Node.js ecosystem has many excellent libraries that can solve almost any problem you can think of. The pdf-lib module lets you work with PDFs, sharp handles just about anything involving images, pkg bundles Node projects into standalone executables, and so on. Before you go looking for an online API to solve your problem, try searching npm first. You may find a better solution.

Original article: Working With PDFs in Node.js Using pdf-lib

Origin www.cnblogs.com/tianliupingzong/p/12703007.html