How to convert web pages into PDF with Vue. Implementation steps and problem solving:

Implementation steps: (If you only care about the Vue implementation, skip Method 1 and go straight to Method 2.)

Method 1:

This method uses Node.js and Puppeteer (Google's browser automation tool). Since the results of this first attempt were not ideal, I will only explain it roughly. The original code is below. Not all of the required packages are actually used; only puppeteer is necessary, and it has to be installed with npm.

const puppeteer = require('puppeteer');

const useProxy = require('puppeteer-page-proxy');

const {delay} = require("bluebird");

const Promise = require("bluebird");

const ms = require("ms");

const fs = require('fs');



(async () => {

    const browser = await puppeteer.launch();

    // const browser = await puppeteer.launch({headless:false});

    // const page = await browser.newPage();

    const page = await browser.newPage();

    await page.goto('http://127.0.0.1:5173/',

    {waitUntil:'networkidle2'}

    );

    // await delay(ms("5s"));

    await page.pdf({path: './test123.pdf' , format: 'A4',printBackground:true});

    await browser.close();

})();

The principle is very simple: Puppeteer opens the web page automatically, and its pdf method saves the page as a PDF. Note that the printBackground: true parameter controls whether backgrounds are rendered. If it is not set to true, background colors will not appear in the PDF.

Advantages: generation is simple, and the images are very sharp and stay sharp even after zooming in, without the file becoming too large. The originals of my three images are a little over 3 MB and the generated file is 3.56 MB, which is in line with expectations.

Disadvantages:

① The generated PDF always came out as two pages. I tried many ways to fix this, but I could not find a matching setting in Puppeteer's API documentation, and searching Baidu and elsewhere did not turn up a good solution either. The cause is still unclear to me: when I tried other websites, some pages were rendered completely. I won't go into more detail. I may simply have been stuck in a dead end, and after many failed attempts I switched to the Vue approach.

② This approach opens the page in a separate browser, which looks like a jump to the user. Puppeteer does have a headless mode (and puppeteer.launch() uses it by default), but I did not explore this further because ① had already put me off.
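One workaround that is often suggested for the extra page is to make the paper as large as the rendered content instead of using a fixed A4 format. The sketch below is untested against the page described above and only builds the options object. buildSinglePagePdfOptions is a hypothetical helper name, while width, height and printBackground are real options of page.pdf(). The format option must be left out, because it takes priority over width and height.

```javascript
// Build options for Puppeteer's page.pdf() so that the paper matches the
// rendered content size. buildSinglePagePdfOptions is a hypothetical helper.
function buildSinglePagePdfOptions(contentWidthPx, contentHeightPx) {
  return {
    // Round up so a fractional pixel does not spill onto a second page.
    width: `${Math.ceil(contentWidthPx)}px`,
    height: `${Math.ceil(contentHeightPx)}px`,
    printBackground: true,
  };
}

// Usage inside the async function from Method 1 (not executed here):
//   const size = await page.evaluate(() => ({
//     width: document.documentElement.scrollWidth,
//     height: document.documentElement.scrollHeight,
//   }));
//   await page.pdf({
//     path: './test123.pdf',
//     ...buildSinglePagePdfOptions(size.width, size.height),
//   });
```

Print styles can change the layout, so the measured height may still be slightly off for some pages.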

Method 2:

Reference for this method: "How to generate PDF from Vue page", Angry Little Sheep's blog on CSDN.

In the Vue project, first install the two required dependencies, html2canvas and jspdf:

①npm install --save html2canvas

②npm install jspdf --save

① html2canvas first renders the HTML we want to export onto a canvas. (canvas is an HTML element that is widely used for drawing, visual effects and even small games; I won't explain it in detail here.) Once the page is a canvas, a number of parameters let us tune what ends up in the PDF. This matters a lot, because by default the generated canvas is very blurry. The problem and its solution are discussed in detail later.

② jsPDF takes the image produced from the canvas, places it in a PDF and exports it. There is little to configure here, and it is relatively simple.

Next are the specific implementation steps. (I followed the blogger Angry Little Sheep; many thanks for sharing. I suggest starting from his code: my scenario differs from his, so my code is not a drop-in sample and may not run as-is, whereas his does. I tried it.)

<template>

<button @click="handleExport">Export</button>

<div ref="pdf" class="spec">

The structure or images to be converted to PDF

</div>

</template>

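Then define the click handler. It reads the element marked with ref="pdf" and passes it to the downloadPDF function. Because downloadPDF is fairly long, it is kept in a separate file, pdf.js. (proxy here is the component instance; how it is obtained is not shown, presumably via getCurrentInstance() in setup.)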

const handleExport =()=>{

            console.log(proxy.$refs.pdf)

            downloadPDF(proxy.$refs.pdf)

        }

Then import the function in the component. It lives in its own file because pdf.js is long.

import {downloadPDF} from "./pdf.js"

The contents of pdf.js are as follows:

import html2canvas from "html2canvas";

import jsPDF from "jspdf";

import compress from './compress.js';



function base64ToFile(dataURL) {

  var arr = dataURL?.split?.(',')

  let mime = arr[0].match(/:(.*?);/)[1]

  let bstr = atob(arr[1]), n = bstr.length, u8arr = new Uint8Array(n);

  while (n--) {

    u8arr[n] = bstr.charCodeAt(n);

  }

  let filename = new Date().getTime() + "" + Math.ceil(Math.random() * 100) + "." + mime.split("/")[1]

  return (new File([u8arr], filename, { type: mime }))

}




export const downloadPDF = page => {

  html2canvas(page,{

    allowTaint: true, // allow cross-origin images

    useCORS: true,

    scale: 2,

  }).then(function(canvas) {

    canvas2PDF(canvas);

  });

};

const canvas2PDF = canvas => {

  let contentWidth = canvas.width*0.2;

  let contentHeight = canvas.height*0.2;

  let imgHeight = contentHeight;

  let imgWidth = contentWidth;

  let pdf = new jsPDF("p", "pt");

  let sharePic

  sharePic = canvas.toDataURL("image/jpeg", 1)

  let fileba = base64ToFile(sharePic)

  compress(fileba)

    .then(res => {

      pdf.addImage(

        res.compressBase64,

        "JPEG",

        0,

        0,

        imgWidth,

        imgHeight

      );

      // console.log(pdf,999)

      pdf.save("导出.pdf");

    })

    .catch(err => {

      // Log the failure instead of swallowing it silently.

      console.error("PDF export failed:", err);

    });

};
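The base64ToFile helper above uses optional chaining on split, but the next line still reads arr[0], so an undefined input fails with an unhelpful error. A minimal sketch of a stricter parser is shown here. parseDataURL is a hypothetical name, and the sketch assumes the input is a base64 data URL such as the one returned by canvas.toDataURL.

```javascript
// Split a base64 data URL into its MIME type and raw bytes.
// parseDataURL is a hypothetical helper; it throws early on malformed input.
function parseDataURL(dataURL) {
  const match = /^data:([^;,]+);base64,(.*)$/s.exec(dataURL ?? '');
  if (!match) {
    throw new TypeError('Expected a base64 data URL such as "data:image/jpeg;base64,..."');
  }
  const [, mime, payload] = match;
  // atob yields one character per byte; copy the character codes into a byte array.
  const binary = atob(payload);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return { mime, bytes };
}
```

In the browser the result can then be wrapped with new File([bytes], filename, { type: mime }), as base64ToFile does.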

At this point, the corresponding html code can be converted into pdf. Next, I will explain the problems and solutions I encountered:

Problems and solutions:

①The printed PDF has no picture part

The first problem I encountered was that the exported PDF contained no images. With Node.js + Puppeteer the images were printed without any trouble, so I guess the two conversions work differently: Puppeteer probably behaves more like a screenshot (this is only my guess), while the Vue approach renders to a canvas first and has to download the image resources again. Those downloads run into cross-origin restrictions.

At first I wanted to solve this by configuring a proxy (I plan to write up cross-origin issues and proxy configuration separately, as a review for myself), but that did not go smoothly, possibly because of a mistake on my side. Searching further, I found that converting the image to base64 is said to avoid the cross-origin problem. The resulting request looks like http://nomad-public.oss-cn-shanghai.aliyuncs.com/size_chart/34e4debd-a8ea-4730-acd2-8a58cc7c6b5d.jpg?time=1677729826618. As far as I can tell, a timestamp is appended to the URL, and image.setAttribute("crossorigin", "anonymous") is set on the image before loading, which is said to satisfy the cross-origin policy. Here is the method I used to convert the image to base64:

const downloadImage = (imgsrc) => {// takes the image URL (the download code from an earlier requirement has been removed; this is a modified version of it)

        var image = new Image();

        // avoid tainting the canvas with a cross-origin image

        image.setAttribute("crossorigin", "anonymous");

        image.onload = function () {

        var canvas = document.createElement("canvas");

        canvas.width = image.width;

        canvas.height = image.height;

        var context = canvas.getContext("2d");

        context.drawImage(image, 0, 0, image.width, image.height);

        var url = canvas.toDataURL("image/png"); // convert the image to base64

        // console.log(url,"base64")

        };

        image.src = imgsrc + '?time=' + Date.now();  // note: this is the key part, otherwise the cross-origin problem remains

        console.log(image.src,"image.src")

        return image.src

        }

You pass a URL to this function and use the URL it returns in the page. I originally described the return value as the image converted to base64, but my understanding of base64 and canvas is weak: as written, the base64 string is only computed inside onload and then discarded, and what the function actually returns is the original URL with the timestamp appended. Either way, after this change the images show up in the exported PDF.

By the way, for a background image you can add it dynamically in the template with a template string: :style="{'margin-top':'0','background-image':`url(${downloadImage('https://nomad-public.oss-cn-shanghai.aliyuncs.com/size_chart/929c3c47-e0b6-4765-bda9-5fb8c1c05191.jpg')})`}". This sets the CSS style dynamically and solves the first problem, images missing from the PDF.
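For readers who do want the base64 string itself, the sketch below wraps the same steps in a Promise. It is not the author's code: withCacheBuster and imageToDataURL are hypothetical names, the URL is assumed to be absolute, and the image server is assumed to send CORS headers that allow the page's origin. withCacheBuster also handles URLs that already contain a query string, where appending '?time=' directly would produce a malformed URL.

```javascript
// Append a cache-busting "time" parameter, whether or not the URL already
// has a query string. Assumes an absolute URL.
function withCacheBuster(url, now = Date.now()) {
  const parsed = new URL(url);
  parsed.searchParams.set('time', String(now));
  return parsed.toString();
}

// Browser-only: load an image with CORS enabled and resolve to a base64 data URL.
function imageToDataURL(src) {
  return new Promise((resolve, reject) => {
    const image = new Image();
    // Must be set before src so the request is made in CORS mode.
    image.crossOrigin = 'anonymous';
    image.onload = () => {
      const canvas = document.createElement('canvas');
      canvas.width = image.naturalWidth;
      canvas.height = image.naturalHeight;
      canvas.getContext('2d').drawImage(image, 0, 0);
      resolve(canvas.toDataURL('image/png'));
    };
    image.onerror = () => reject(new Error(`Failed to load ${src}`));
    image.src = withCacheBuster(src);
  });
}
```

Because loading is asynchronous, the data URL would have to be stored in reactive state once the Promise resolves, rather than being returned directly from a call in the template.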

②The resolution of the generated PDF is too low

After exporting the PDF I found another problem: its resolution was too low, like a mosaic. I searched for solutions and found two.

The first is to increase scale (we see the same idea often in CSS; it simply enlarges everything proportionally). I tried it and it did work, but the image also became very large. The requirement is to fit on one A4 page. With scale: 4 the text became much sharper, but the page showed less than a quarter of the content (only the upper-left corner), so the whole page no longer fit on A4. That was clearly not good enough.

The second is a parameter called dpi. Its description matched exactly what I wanted, but it had no effect. I don't know why; it may be a version issue, because I could not find this parameter among the html2canvas options. In short, the simplest route did not work.

So, with guidance from my boss, I arrived at a different idea: set scale: 2 when converting the HTML to a canvas so that the canvas is larger, and then, after turning it into an image, shrink the image when placing it in the PDF to increase sharpness. (This is the same idea as dpi, just more work because the dpi parameter has no effect.)

html2canvas(page,{

    allowTaint: true, // allow cross-origin images

    useCORS: true,

    scale: 2,

  }).then(function(canvas) {

    canvas2PDF(canvas);

  });

Then, in canvas2PDF, shrink the image. This is the "enlarge first, then reduce" approach: enlarge the canvas (the drawing board) first, and after converting it to an image, draw that image at a smaller size. Put simply, you first draw the picture on a huge drawing board. Even if the drawing itself is not especially fine, once it is finished you shrink it down to A4 size, so many more pixels end up in the same area. Because a huge drawing board is squeezed onto a small A4 sheet, the pixel density is naturally higher. (This is my personal understanding.)

const canvas2PDF = canvas => {

  // the scaling happens below: width * 0.2

  let contentWidth = canvas.width*0.2;

  let contentHeight = canvas.height*0.2;

  let imgHeight = contentHeight;

  let imgWidth = contentWidth;

The result was still not sharp enough, so I came up with a second idea: enlarge the HTML structure itself. For example, if the content was originally laid out in a 600px box, change the box to 1200px. The idea is really the same as before. Enlarging the drawing board means more detail is drawn in the first place, so the drawing naturally comes out sharper; equivalently, when the result is shrunk back to the original 600px, the pixel density goes up. In my opinion the two ideas are very similar. With both applied, the generated image is sharp enough to meet the company's needs. Next came another problem.
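The "enlarge first, then reduce" idea can be made concrete with a little arithmetic. The sketch below replaces the fixed 0.2 factor with a computed fit to A4 and estimates the resulting pixel density. fitCanvasToA4 and effectiveDpi are hypothetical helpers, and the page size assumes jsPDF's A4 format in points, as created by new jsPDF("p", "pt").

```javascript
// A4 page size in points (1 pt = 1/72 inch), matching jsPDF's "a4" format with unit "pt".
const A4_WIDTH_PT = 595.28;
const A4_HEIGHT_PT = 841.89;

// Scale the canvas so it fits inside one A4 page while keeping its aspect ratio.
// fitCanvasToA4 is a hypothetical helper, not part of jsPDF or html2canvas.
function fitCanvasToA4(canvasWidth, canvasHeight) {
  const ratio = Math.min(A4_WIDTH_PT / canvasWidth, A4_HEIGHT_PT / canvasHeight);
  return { width: canvasWidth * ratio, height: canvasHeight * ratio };
}

// Pixels per inch once a canvas of canvasWidth pixels is drawn widthPt points wide.
function effectiveDpi(canvasWidth, widthPt) {
  return canvasWidth / (widthPt / 72);
}
```

For example, a 1200-pixel-wide canvas drawn 600 pt wide comes out at about 144 pixels per inch, and doubling the canvas width at the same printed size doubles that figure. In canvas2PDF the fitted size could be passed to addImage in place of imgWidth and imgHeight.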

③The generated PDF file is too large

Enlarging the HTML through CSS to gain sharpness has a side effect on size: the source images are enlarged too, so simply enlarging and then shrinking produces a very large file. Without any processing, the PDF generated from the JPEG was about 13 MB, while the original images were only about 3 MB. That shows how serious the problem is: transferring such a PDF wastes a lot of resources. So I started looking into compression.

First I optimized the code structure, which made very little difference to the file size, so I turned to the images. Even when the incoming image was only about 20 KB, the generated PDF was still about 3.8 MB. So in canvas.toDataURL("image/jpeg", 1) I changed the second parameter from 1 to 0.92. A value of 1 keeps the image at full quality, and 0.92 gives up some clarity in exchange for file size. The effect was significant: the file went from 3.8 MB to 1 MB, which met the size target. But I had lowered the image quality, which is robbing Peter to pay Paul. My boss also pointed out that the result was not sharp enough and asked me to keep the file at around 1 MB while preserving as much clarity as possible.

My understanding at the time was that the sharper the file, the larger it has to be. I had seen PDF compression tools on the market, but the free ones reduce file size by sacrificing resolution, and only the paid ones avoid that. So at first I did not know what to do, and I went back to the boss who mentors me. He told me directly that yes, an image can be compressed considerably without visibly reducing its clarity, and he gave me a .js compression method. I tried it and it compressed the 3 MB file to 400 KB.

Since that code was given to me by my boss, I will not publish it without permission. The code I wrote myself and the code from the reference article are all posted above. If you have a similar need, you can send me a private message and I can share it privately. Roughly, the method takes the file, converts the image to base64, processes it with a canvas, and then reduces it in ways I have not fully worked out yet. If I find public code for this online, I will add it to the article.
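As a rough illustration of that description only, and explicitly not the unpublished compress.js, a canvas-based compressor might look like the sketch below. The names, the maxSide and quality options and their defaults are all assumptions; only the shape of the result ({ compressBase64 }) is taken from how canvas2PDF uses it. Unlike the method described above, this simple version does trade some fidelity for size.

```javascript
// Pure helper: shrink (never enlarge) dimensions so the longer side is at most maxSide.
function scaledSize(width, height, maxSide) {
  const ratio = Math.min(1, maxSide / Math.max(width, height));
  return { width: Math.round(width * ratio), height: Math.round(height * ratio) };
}

// Browser-only, hypothetical stand-in for compress.js: redraw the image on a
// smaller canvas and re-encode it as JPEG at the given quality.
function compress(file, { maxSide = 2000, quality = 0.8 } = {}) {
  return new Promise((resolve, reject) => {
    const objectUrl = URL.createObjectURL(file);
    const image = new Image();
    image.onload = () => {
      URL.revokeObjectURL(objectUrl);
      const { width, height } = scaledSize(image.naturalWidth, image.naturalHeight, maxSide);
      const canvas = document.createElement('canvas');
      canvas.width = width;
      canvas.height = height;
      canvas.getContext('2d').drawImage(image, 0, 0, width, height);
      resolve({ compressBase64: canvas.toDataURL('image/jpeg', quality) });
    };
    image.onerror = () => {
      URL.revokeObjectURL(objectUrl);
      reject(new Error('Could not decode image'));
    };
    image.src = objectUrl;
  });
}
```

The right maxSide and quality depend on the content, so they would need to be tuned against the exported PDF by eye.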

Final effect display:

Special thanks to a good friend for the valuable feedback; I have moved the code into code blocks. I am inexperienced, so please bear with me, and do point out anything else that is hard to read.