During development, I needed to provide preview and download functions for invoices in PDF format. The PDF files are accessed from two kinds of clients, the H5 mobile side and the PC side, and the handling differs slightly between the two, as discussed below.
There are many articles about PDF preview, but few of them mention the problems you may run into or how to choose a solution for a specific scenario. The core of this article is therefore to look at the current implementation options against real requirements and see which one fits best. Corrections and better solutions are of course welcome in the comment area.
Basic requirements:
- Support full preview of the content of a PDF file
- Support page-by-page viewing of multi-page PDF files
- Both the PC side and the mobile side need to support download and preview
Product requirements:
- The preview on the PC side should work within the current page
- The font in the preview should be consistent with the font in the PDF file
PDF preview
Putting aside the requirements above, let's first summarize several common ways to implement PDF preview:
- Use a library and render the preview in code, such as packages based on pdfjs-dist
- Rely on the browser's built-in PDF preview, such as <iframe src="xxx"> or <embed src="xxx">
- Have the server convert the PDF file into an image
Next, let's look at how each of these solutions is implemented and whether it meets the requirements above.
Implementing preview with <embed> / <iframe>
The <embed> tag
The <embed> element embeds external content at a specified point in the document. The content is provided by an external application or another source of interactive content, such as a browser plug-in.
To put it simply, a resource displayed with <embed> relies on the display capability of the environment it runs in: if the current environment supports displaying that resource, it is shown normally; if not, it cannot be shown.
It is also very simple to use:
<embed type="application/pdf" :src="pdfUrl" width="800" height="600" />
Most modern browsers have deprecated and removed support for browser plug-ins, so relying on <embed> is no longer recommended. Tags such as <img>, <iframe>, <video> and <audio> can be used instead.
The <iframe> tag
The <iframe> approach is similar to the one above and the overall effect is the same, so it is not shown here:
<iframe :src="pdfUrl" width="800" height="600" />
It is worth noting that even if you use <iframe>, expanding its inner structure reveals that what sits inside is still an <embed> tag. What is going on here? Wasn't <embed> supposed to be avoided?
First, let's check the compatibility on caniuse.
Next, let's find a browser that does not support <embed>, for example IE, and try the effect there. Then try <iframe> instead.
Obviously, <embed> cannot be displayed at all in an incompatible environment. <iframe> itself is recognized normally, but the resource it loads cannot be handled by IE. In other words, the root cause is that IE does not support PDF preview. Something similar happens when you type http://127.0.0.1:3000/src/assets/2.pdf directly into the address bar.
Therefore, under normal circumstances, when previewing a PDF you should also provide a fallback link to the PDF so that it can be downloaded instead, and this is exactly what pdfobject does. Its source code is relatively simple. At its core, PDFObject detects whether the browser supports inline/embedded PDFs: if embedding is supported, it embeds the PDF; if not, it does not embed the PDF and provides a fallback link to it instead, which is how it behaves in IE.
In fact, this only saves us from writing some compatibility code, and it does not necessarily fit most scenarios. It is mentioned here only because of its connection with <embed>.
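The detect-then-fall-back idea can be sketched in a few lines. This is a minimal illustration, not PDFObject's actual source: the helper names buildPdfMarkup and detectInlinePdfSupport are hypothetical, and the support flag is assumed to come from the standard navigator.pdfViewerEnabled property in browsers that expose it.

```javascript
// Hypothetical helper (not part of PDFObject): returns the markup to inject,
// depending on whether the environment can display PDFs inline.
function buildPdfMarkup(pdfUrl, supportsInlinePdf) {
  const safeUrl = encodeURI(pdfUrl)
  if (supportsInlinePdf) {
    return `<embed type="application/pdf" src="${safeUrl}" width="800" height="600" />`
  }
  // Fallback: a plain link so the user can still download the file
  return `<p>This browser cannot preview PDFs. <a href="${safeUrl}">Download the PDF</a></p>`
}

// Assumption: navigator.pdfViewerEnabled is available. Browsers that do not
// expose it (IE included) leave it undefined, which is treated as "not supported".
function detectInlinePdfSupport(nav) {
  return Boolean(nav && nav.pdfViewerEnabled)
}
```

In a page, this could be wired up as container.innerHTML = buildPdfMarkup(pdfUrl, detectInlinePdfSupport(navigator)), where container is whatever element hosts the preview.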
Implementing preview with vue3-pdfjs
Why not use pdfjs-dist directly?
Several obvious complaints about pdf.js:
- The package name is inconsistent: the package on the npm website is called pdfjs-dist, but the Readme calls it pdf.js
- There is no clear documentation to serve as a guide, only the contents of the examples directory as a reference
- The official examples are not friendly enough, for example there are no vue/react examples
- Using it directly requires introducing a lot of things that the documentation does not specify
- Sometimes the pdf text is blurry or parts of it are missing
- …
Therefore, since packages that wrap it for vue/react already exist, one of them is used directly for the demonstration here.
Specific use
For installation and usage, refer to vue3-pdfjs. The Vue3 sample code is as follows:
<script setup lang="ts">
import { onMounted, ref } from 'vue'
import { VuePdf, createLoadingTask } from 'vue3-pdfjs/esm'
import type { VuePdfPropsType } from 'vue3-pdfjs/components/vue-pdf/vue-pdf-props' // Prop type definitions can also be imported
import type { PDFDocumentProxy } from 'pdfjs-dist/types/src/display/api'
import pdfUrl from './assets/You-Dont-Know-JS.pdf'

const pdfSrc = ref<VuePdfPropsType['src']>(pdfUrl)
const numOfPages = ref(0)

onMounted(() => {
  const loadingTask = createLoadingTask(pdfSrc.value)
  loadingTask.promise.then((pdf: PDFDocumentProxy) => {
    numOfPages.value = pdf.numPages
  })
})
</script>

<template>
  <VuePdf v-for="page in numOfPages" :key="page" :src="pdfSrc" :page="page" />
</template>

<style>
@import '@/assets/base.css';
</style>
The effect is as follows:
Existing problems
Loading an ordinary PDF document seems to work, with no major problems. Now let's try loading a PDF invoice. Because a real invoice contains sensitive information, the original invoice is not posted here, and we go straight to the previewed invoice content:
- Obviously, a lot of content is missing. Although most of the content of some invoices is displayed, parts such as the invoice header and the seal may not be displayed correctly.
[Note] The complete content cannot be displayed because pdf.js needs the support of certain font libraries. If some fonts in the original PDF file have no match in the font library, they are not displayed by pdf.js. The font library is placed under the cmaps folder.
- In addition, the previewed font is inconsistent with the actual font. Because of the special nature of invoices, font consistency matters a great deal: if the same invoice is shown in different fonts, it loses its standardization and legality (`the argument given when consistent fonts were required`)
Common solution: "Solve the problem that pdf.js cannot fully display the content of the pdf file". In practice this still comes down to analyzing the error messages of the execution environment, and it requires forcibly modifying the source code.
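One configuration-level thing worth trying before patching the source is to tell pdf.js where its character maps live. The sketch below only builds the options object: cMapUrl and cMapPacked are pdf.js getDocument parameters, while the helper name buildPdfLoadOptions and the /pdfjs/ asset path are assumptions for illustration (the cmaps folder from pdfjs-dist has to be served from wherever that path points). Whether this alone fixes a particular invoice depends on what the file actually contains.

```javascript
// Hypothetical helper: builds the argument for pdfjsLib.getDocument().
// assetBase is an assumed URL under which the pdfjs-dist `cmaps` folder is hosted.
function buildPdfLoadOptions(url, assetBase = '/pdfjs/') {
  const base = assetBase.endsWith('/') ? assetBase : `${assetBase}/`
  return {
    url,
    cMapUrl: `${base}cmaps/`, // location of the predefined character maps
    cMapPacked: true, // the bundled cmaps are shipped in packed (.bcmap) form
  }
}
```

With pdf.js itself this would be used as pdfjsLib.getDocument(buildPdfLoadOptions(pdfUrl)).promise. How a wrapper such as vue3-pdfjs forwards these options is not covered here.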
Mozilla Firefox (Firefox browser)
The built-in PDF reader of Mozilla Firefox is in fact pdf.js, so it can preview PDF files directly.
Most libraries that wrap pdf.js, such as vue-pdf and vue3-pdfjs, usually cannot display the complete content when previewing a PDF invoice and need more or less modification of the source code, whereas the pdf.js built into Firefox displays the content of the same PDF file completely.
Implementing preview by converting PDF to images
This approach needs little explanation. The core idea is that when the server responds with a PDF file, it first converts it into an image and returns that instead, so the front end can display the image directly.
Implementation
The following simulates it with node:
const pdf = require('pdf-poppler')
const path = require('path')
const Koa = require('koa')
const koaStatic = require('koa-static')
const cors = require('koa-cors')
const app = new Koa()

// Allow cross-origin requests
app.use(cors())
// Serve static resources from ./server
app.use(koaStatic('./server'))

function getFileName(filePath) {
  return filePath.split('/').pop().replace(/\.[^/.]+$/, '')
}

function pdf2png(filePath) {
  // Get the file name without its extension
  const fileName = getFileName(filePath);
  const dir = path.dirname(filePath);
  // Conversion options
  const options = {
    format: 'png',
    out_dir: dir,
    out_prefix: fileName,
    page: null,
  }
  // Convert the pdf to png and return the URL of the first page
  return pdf.convert(filePath, options).then((res) => {
    console.log('Successfully converted !')
    return `http://127.0.0.1:4000${dir.replace('./server', '')}/${fileName}-1.png`
  }).catch((error) => {
    console.error(error)
  })
}

// Response
app.use(async (ctx) => {
  if (ctx.path.endsWith('/getPdf')) {
    const url = await pdf2png('./server/pdf/2.pdf')
    ctx.body = { url }
  } else {
    ctx.body = 'hello world!'
  }
})

app.listen(4000)
Pits to avoid
Pit 1: pdf-image is not recommended
When the server converts PDF files into images, it has to rely on third-party packages. At first the pdf-image package was used, but many errors occurred during actual conversion. Tracing the errors into the source code showed that it relies on additional external tools internally, because it needs to run commands such as pdfinfo xxx. Its issue list contains some similar problems, but after trying all of the suggestions there, it still failed!
Therefore, pdf-poppler is recommended instead. It uses the pdftocairo program that ships with it to convert PDF to images, but its current version only supports Windows and Mac OS.
Pit 2: path.basename is not a function
In the code above we need to get the name of the file. This can be done simply with path.basename(path[, suffix]) from the Node API.
However, the following exception occurred. The corresponding code is as follows:
// Conversion options
const options = {
  format: 'png',
  out_dir: dir,
  out_prefix: path.baseName(filePath, path.extname(filePath)), // the exception is thrown here
  page: null,
}
At first I could not find the reason for this, so I simply implemented a getFileName method to get the name of the file.
Reason for the error: I relied too much on the editor's auto-completion and typed basename as baseName. Yes, the only difference is n versus N.
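With the method name spelled correctly, the hand-written getFileName helper is no longer needed. A minimal sketch of the corrected call, using the same sample path as the server code (the wrapper name fileNameWithoutExt is only for illustration):

```javascript
const path = require('node:path')

// path.basename(p, suffix) strips the directory part and, when the file
// name ends with `suffix`, that suffix as well.
function fileNameWithoutExt(filePath) {
  return path.basename(filePath, path.extname(filePath))
}

console.log(fileNameWithoutExt('./server/pdf/2.pdf')) // prints "2"
```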
Pit 3: Details
The code above uses koa to simulate the business service. Since the ports of the business service (http://127.0.0.1:4000) and the application service (http://127.0.0.1:3000) differ, cross-origin requests occur, which can be solved with koa-cors. It is worth noting that sometimes, after the business server restarts, koa-cors may not take effect.
Since the response is returned directly in a general koa middleware, if the business service also needs to serve static resources, you can use koa-static. Note that when you specify the static directory with koa-static, for example app.use(koaStatic('./static')), accessing http://127.0.0.1:4000/static/pdf/xxx.png directly returns a 404 Not Found error. The reason is that koa-static uses /static/ as the root path, so the correct access path is http://127.0.0.1:4000/pdf/xxx.png.
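The path mapping can be made explicit with a small helper. This is a sketch under the assumption that the directory passed to koaStatic() acts as the URL root; the function name toPublicUrl is hypothetical.

```javascript
const path = require('node:path')

// Hypothetical helper: turns a file path on disk into the URL that
// koa-static exposes, given the directory passed to koaStatic().
function toPublicUrl(origin, staticRoot, filePath) {
  const relative = path.posix.relative(staticRoot, filePath)
  if (relative.startsWith('..')) {
    throw new Error(`${filePath} is outside of ${staticRoot}`)
  }
  return `${origin}/${relative}`
}

console.log(toPublicUrl('http://127.0.0.1:4000', './static', './static/pdf/xxx.png'))
// prints "http://127.0.0.1:4000/pdf/xxx.png"
```

The static directory name itself never appears in the resulting URL, which is exactly why the /static/pdf/xxx.png request returns 404.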
Effect demonstration
The invoice content is inconvenient to show, so it is not displayed here. You only need to pay attention to the generated images and their paths.
PDF download
Download here refers not only to downloading PDFs, but to the download methods the client side supports in general. The most common ones are:
- The a tag, e.g. <a href="xxxx" download="xxx">Download</a>
- location.href, e.g. window.location.href = xxx
- window.open, e.g. window.open(xxx)
- Content-disposition, e.g. Content-disposition: attachment; filename="xxx"
Downloading with <a>
The download attribute of <a> instructs the browser to download the URL specified by href instead of navigating to it, and usually prompts the user to save it as a local file. If the download attribute has a value, that value is used as the pre-filled file name when saving. It is only a pre-filled value, mainly for the following reasons:
- The value may be changed dynamically through JavaScript
- A file name specified in Content-Disposition takes precedence over a.download
This should be the method everyone knows best, but familiar as it is, a few points are still worth noting:
- The download attribute only applies to same-origin URLs (blob: and data: URLs are also allowed)
- If the Content-Disposition HTTP response header specifies a different file name, the one in Content-Disposition is used first
- If the Content-Disposition HTTP response header is set to inline, the result depends on the browser: current Chrome and Firefox give priority to the download attribute and still download, while Firefox versions before 82 give priority to the header and display the content inline
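The same-origin restriction is easy to overlook when the file server and the page run on different ports. A small check, sketched with the standard URL API (the helper name canUseDownloadAttribute is hypothetical), shows which links can be expected to honour the download attribute:

```javascript
// Hypothetical helper: reports whether <a download> is expected to be
// honoured for fileUrl when the page itself is served from pageUrl.
function canUseDownloadAttribute(fileUrl, pageUrl) {
  const target = new URL(fileUrl, pageUrl) // also resolves relative URLs
  if (target.protocol === 'blob:' || target.protocol === 'data:') {
    return true
  }
  return target.origin === new URL(pageUrl).origin
}

console.log(canUseDownloadAttribute('/pdf/2-1.png', 'http://127.0.0.1:4000/index.html')) // prints true
console.log(canUseDownloadAttribute('http://127.0.0.1:4000/pdf/2-1.png', 'http://127.0.0.1:3000/')) // prints false
```

The second call mirrors the setup in this article: a page on port 3000 linking to a file on port 4000 counts as cross-origin, so the browser is free to ignore the download attribute and navigate instead.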
Static way:
<a href="http://127.0.0.1:4000/pdf/2-1.png" download="2-1.png">Download</a>
Dynamic way:
function download(url, filename) {
  const a = document.createElement("a"); // create the a tag
  a.href = url; // download URL
  a.download = filename; // download attribute: the file name
  a.style.display = "none"; // invisible
  document.body.appendChild(a); // mount
  a.click(); // trigger the click event
  document.body.removeChild(a); // remove
}
blob method
// reqConf, config and FileType are defined elsewhere in the request code (not shown)
if (reqConf.responseType == 'blob') {
  // File name returned by the server
  let contentDisposition = config.headers['content-disposition'];
  if (!contentDisposition) {
    contentDisposition = `;filename=${decodeURI(config.headers.filename)}`;
  }
  const fileName = window.decodeURI(contentDisposition.split(`filename=`)[1]);
  // File type: the part after the last dot
  const suffix = fileName.split('.').pop();
  // Create the blob object
  const blob = new Blob([config.data], { type: FileType[suffix] });
  const link = document.createElement('a');
  link.style.display = 'none';
  link.href = URL.createObjectURL(blob); // create the object URL
  link.download = fileName; // file name after download
  document.body.appendChild(link);
  link.click();
  document.body.removeChild(link); // remove the hidden a tag
  URL.revokeObjectURL(link.href); // release the object URL
}
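Splitting the header on filename= works for simple values, but it keeps surrounding quotes and does not understand the filename*= form. A slightly more defensive parser could look like the sketch below; both helper names are hypothetical and only the common header shapes are handled.

```javascript
// Hypothetical helper: extracts the file name from a Content-Disposition value.
// Handles filename="x.pdf", filename=x.pdf and the RFC 5987 form filename*=UTF-8''...
function parseFileName(contentDisposition) {
  if (!contentDisposition) return null
  const extended = /filename\*\s*=\s*[^']*'[^']*'([^;]+)/i.exec(contentDisposition)
  if (extended) return decodeURIComponent(extended[1].trim())
  const plain = /filename\s*=\s*(?:"([^"]*)"|([^;]+))/i.exec(contentDisposition)
  if (!plain) return null
  return (plain[1] !== undefined ? plain[1] : plain[2]).trim()
}

// Everything after the last dot, or '' when there is no dot
function extensionOf(fileName) {
  const dot = fileName.lastIndexOf('.')
  return dot === -1 ? '' : fileName.slice(dot + 1)
}
```

For a header such as attachment; filename="invoice.2023.pdf", parseFileName returns invoice.2023.pdf and extensionOf returns pdf, whereas taking index 1 of a split on dots would yield 2023.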
Downloading with Content-disposition and location.href / window.open
These look like three download methods, but they are really one, and it is still based on Content-disposition.
The Content-Disposition response header indicates how the content of the response should be presented: inline (as part of a web page, or as a page of its own), or as an attachment that is downloaded and saved locally:
- inline: the default value, indicating that the response body is displayed as part of a page or as an entire page, e.g. Content-Disposition: inline
- attachment: indicates that the response body should be downloaded locally. Most browsers present a "Save as" dialog pre-filled with the value of filename as the file name, e.g. Content-Disposition: attachment; filename="filename.jpg"
Therefore, downloading via location.href='xxx' or window.open(xxx) relies on the response carrying Content-Disposition: attachment; filename="filename.jpg", which triggers the browser's own download behavior. As long as this condition is met, the file is downloaded whether you get there through an a tag jump, location.href navigation, window.open opening a new page, or typing the URL directly into the address bar.
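On the server side, the header value can be built with a small helper. The sketch below is one possible way to do it, not something taken from the service above: the function name attachmentDisposition is hypothetical, and it emits both a plain ASCII filename and the RFC 5987 filename* form so that non-ASCII names such as Chinese invoice titles survive.

```javascript
// Hypothetical helper: builds a Content-Disposition value that asks the
// browser to download the response under the given file name.
function attachmentDisposition(fileName) {
  // ASCII-only fallback for clients that ignore filename*
  const fallback = fileName.replace(/[^\x20-\x7e]/g, '_').replace(/["\\]/g, '_')
  // RFC 5987 encoding: percent-encode the UTF-8 bytes, plus the characters
  // that encodeURIComponent leaves alone but the grammar does not allow
  const encoded = encodeURIComponent(fileName).replace(
    /['()*]/g,
    (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`
  )
  return `attachment; filename="${fallback}"; filename*=UTF-8''${encoded}`
}

console.log(attachmentDisposition('2.pdf'))
// prints: attachment; filename="2.pdf"; filename*=UTF-8''2.pdf
```

In a koa handler this could be applied with ctx.set('Content-Disposition', attachmentDisposition('2.pdf')) before assigning ctx.body. Koa also offers ctx.attachment(filename) as a built-in shortcut.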
Downloading on H5 mobile
For preview on H5, all of the methods above work. Download is different, because two scenarios have to be distinguished:
- Based on a mobile browser
- Based on WeChat's built-in browser
Downloading in a mobile browser is roughly the same as described above: in essence, as long as the client supports downloading, there is no problem. In WeChat's built-in browser, however, the conventional download methods may not work as expected:
- On Android, the normal download methods pop up a dialog asking whether to launch the mobile browser to download the resource, although on some models no dialog appears
- On iOS, none of the methods above can download. Usually a new webview is opened to provide a preview. Some models support saving by long-pressing the screen in the new page, but not all do
The root cause is that WeChat's built-in browser blocks all download links, such as APP download links and ordinary file download links.
How else can files be downloaded on H5 mobile?
Since the download function is blocked by WeChat's built-in browser environment itself, there is no point in trying (`I dare not even think about it`) to implement downloading inside WeChat's built-in browser. Instead, consider how to download indirectly:
- Detect whether the current browser is WeChat's built-in browser. If so, help the user launch the mobile browser automatically to download. Since not all models support launching it, it is best to also prompt the user to download directly in a mobile browser, and a one-tap copy function for the link can make this more convenient
- Alternatively, simply state that download is only supported on the PC side and give up the download operation on mobile
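The detection step is usually done through the user agent string, because WeChat's in-app browser includes the token MicroMessenger there. A minimal sketch follows; the flow names returned by chooseDownloadFlow are hypothetical labels for the two branches described above.

```javascript
// WeChat's in-app browser identifies itself with "MicroMessenger" in the user agent.
function isWeChatBrowser(userAgent) {
  return /MicroMessenger/i.test(userAgent || '')
}

// Hypothetical decision helper: which download flow to offer.
function chooseDownloadFlow(userAgent) {
  return isWeChatBrowser(userAgent) ? 'prompt-open-in-browser' : 'direct-download'
}
```

In a page this would be called as chooseDownloadFlow(navigator.userAgent). For the one-tap copy, navigator.clipboard.writeText(url) is an option where the Clipboard API is available, which is not guaranteed in every webview.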
Finally
To sum up, there may be no perfect way to implement PDF preview in practice. For PDF files such as invoices in particular, the following problems remain:
- There is no guarantee that every H5 mobile client can download
- There is no guarantee that the previewed PDF font is consistent with the actual invoice font
Most existing preview methods are based on pdf.js. pdf.js uses PDFJs.getDocument(url/buffer) to obtain the content from a file address or a data stream, and then processes and renders the PDF file through canvas. If you are interested, you can study the pdf.js source code.
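The load-then-draw flow can be sketched as follows. This is an outline under assumptions rather than a complete viewer: the function name renderPdfPage is hypothetical, the pdf.js module is passed in as pdfjsLib so the snippet does not depend on a particular import path, and the render parameters follow the commonly documented pdf.js usage, which may differ between versions. In a browser, the pdf.js worker also has to be configured before loading.

```javascript
// Hypothetical helper: load a PDF with pdf.js and draw one page onto a canvas.
// `pdfjsLib` is the pdfjs-dist module object, supplied by the caller.
async function renderPdfPage(pdfjsLib, source, canvas, pageNumber = 1, scale = 1.5) {
  // source may be a URL string or a parameter object such as { data: buffer }
  const pdf = await pdfjsLib.getDocument(source).promise
  const page = await pdf.getPage(pageNumber)
  const viewport = page.getViewport({ scale })
  // Size the canvas to the page before drawing
  canvas.width = viewport.width
  canvas.height = viewport.height
  await page.render({ canvasContext: canvas.getContext('2d'), viewport }).promise
  return { numPages: pdf.numPages, width: viewport.width, height: viewport.height }
}
```

Rendering every page is then a loop from 1 to numPages, with one canvas per page.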
The related problem is that if the PDF file contains a font that does not exist in pdf.js, the file cannot be rendered completely, and the rendered font differs from the font of the original PDF file.
For these two points, Google Chrome's built-in PDF plug-in seems to provide good support. This means that browsers which include the Google-related plug-in (such as Edge and QQ Browser) can implement preview directly with <iframe>. For stricter font consistency, the only option is to download the source file and view it.
What if the product requirements cannot be met?
Take the solutions discussed above: they cannot actually meet some of the requirements listed at the beginning of the article. Product requirements are meant to provide a better user experience (`under normal circumstances`), but they still have to be implemented with technology, and how far the technology can support them is something we need to feed back in time (`unless your product manager has a technical background`). So as a developer you need to provide enough evidence to convince the product side, and then offer some indirect solutions (or the product side may come up with a new plan itself) to see whether they meet the fallback expectation. The core is reasonable communication + alternative solutions (everyone's situation is different, and in reality maybe ... those who know, know).
The above are some personal opinions and understandings. If anything is inappropriate, please correct it in the comment area!
Hope this article helps you!