Handling large file uploads: slice upload

This article covers a basic implementation of slice (chunked) uploading, plus some extra features built on top of it. The principle is simple and the code comments are fairly clear, so I won't belabor the basics. For each extra feature I explain how it works and post the changes to the original code. If there is any mistake, I'd appreciate it if the experts could point it out; thank you very much.

Slice upload

The idea behind slice uploading is simple: take the file, cut it into slices, build the request parameters for each slice, and then send the requests concurrently. The code is below.

HTML

<template>
  <div>
    <input type="file" @change="handleFileChange" />
    <el-button @click="handleUpload">Upload</el-button>
  </div>
</template>

JavaScript

<script>
const SIZE = 10 * 1024 * 1024; // chunk size: 10 MB

export default {
  data: () => ({
    // file information
    container: {
      file: null,
      hash: null,
      worker: null
    },
    data: [], // processed list of file chunks
    hashPercentage: 0 // hash generation progress
  }),
  methods: {
    // grab the selected file
    handleFileChange(e) {
      const [file] = e.target.files;
      if (!file) {
        this.container.file = null;
        return;
      }
      this.container.file = file;
    },

    // split the file into chunks
    createFileChunk(file, size = SIZE) {
      const fileChunkList = [];
      let cur = 0;
      while (cur < file.size) {
        fileChunkList.push({ file: file.slice(cur, cur + size) });
        cur += size;
      }
      return fileChunkList;
    },

    // compute the file hash in a web worker
    calculateHash(fileChunkList) {
      return new Promise(resolve => {
        this.container.worker = new Worker("/hash.js");
        this.container.worker.postMessage({ fileChunkList });
        this.container.worker.onmessage = e => {
          const { percentage, hash } = e.data;
          // can be used to drive a progress bar
          this.hashPercentage = percentage;
          if (hash) {
            resolve(hash);
          }
        };
      });
    },

    // pre-process the chunks before upload (attach the hash, etc.)
    async handleUpload() {
      if (!this.container.file) return;
      // split into chunks
      const fileChunkList = this.createFileChunk(this.container.file);
      // generate the file hash
      this.container.hash = await this.calculateHash(fileChunkList);
      this.data = fileChunkList.map(({ file }, index) => ({
        chunk: file,
        // the chunk hash is the file hash plus the chunk index;
        // an earlier variant used the file name plus the index instead
        hash: this.container.hash + "-" + index
      }));
      await this.uploadChunks();
    },

    // upload the chunks
    // request and merge are helper functions wrapping the upload / merge API calls (not shown)
    async uploadChunks() {
      const requestList = this.data
        // build a FormData per chunk
        .map(({ chunk, hash }) => {
          const formData = new FormData();
          formData.append("chunk", chunk);
          formData.append("hash", hash);
          formData.append("filename", this.container.file.name);
          return { formData };
        })
        // fire the upload requests
        .map(({ formData }) => request(formData));
      await Promise.all(requestList); // wait for every chunk to finish
      await merge(this.container.file.name); // ask the server to merge the chunks
    }
  }
};
</script>

Generating the hash

Both the front end and the server need a hash for the file and for each slice. Above we used the file name plus the slice index as the slice hash, but that breaks as soon as the file is renamed. As long as the file content is unchanged, the hash should stay the same, so the correct approach is to derive the hash from the file content. Let's change the hash generation rules accordingly.

Here we use the spark-md5 library, which computes a hash from the file content. For a large file, reading the content and computing the hash is time-consuming and would block the UI thread and freeze the page, so we run the hash calculation in a web worker; the user can keep interacting with the main page normally.

When instantiating a web worker, the argument is the path of a JS file, and it must not be cross-origin, so we create a separate hash.js file and put it in the public directory. DOM access is not allowed inside a worker, but the `importScripts` function is available for loading external scripts; we use it to import spark-md5.

// /public/hash.js
self.importScripts("/spark-md5.min.js"); // load spark-md5

// compute the file hash
self.onmessage = e => {
  const { fileChunkList } = e.data;
  const spark = new self.SparkMD5.ArrayBuffer();
  let percentage = 0;
  let count = 0;
  const loadNext = index => {
    // create a reader
    const reader = new FileReader();
    // read the chunk as an ArrayBuffer
    reader.readAsArrayBuffer(fileChunkList[index].file);
    // fires once the chunk has been read
    reader.onload = e => {
      count++;
      // feed the bytes to spark
      spark.append(e.target.result);
      if (count === fileChunkList.length) {
        self.postMessage({
          percentage: 100,
          // final hash
          hash: spark.end()
        });
        self.close();
      } else {
        percentage += 100 / fileChunkList.length;
        self.postMessage({ percentage });
        // recurse into the next chunk
        loadNext(count);
      }
    };
  };
  loadNext(0);
};

Summary

  1. Get the file to upload
  2. Slice the file and store the slices in an array: fileChunkList.push({ file: file.slice(cur, cur + size) })
  3. Generate the file hash (optional)
  4. Build a list of requests from the list of file slices
  5. Send the requests concurrently
  6. After all requests complete, send a merge request
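The slicing in step 2 is plain arithmetic on byte offsets. A standalone sketch of the same loop (chunkRanges is a hypothetical helper, not the component's createFileChunk; it returns ranges instead of Blob slices so it runs outside the browser):

```javascript
// Compute the [start, end) byte ranges that file.slice(cur, cur + size)
// walks through for a file of `fileSize` bytes.
function chunkRanges(fileSize, size) {
  const ranges = [];
  let cur = 0;
  while (cur < fileSize) {
    // Blob.prototype.slice clamps `end` to the file size,
    // so the last chunk may be shorter than `size`
    ranges.push({ start: cur, end: Math.min(cur + size, fileSize) });
    cur += size;
  }
  return ranges;
}

// a 25-byte "file" with 10-byte chunks yields 3 chunks: 10 + 10 + 5 bytes
const ranges = chunkRanges(25, 10);
```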

Instant upload

In truth this is a sleight of hand to make the user feel the upload finished instantly.

Principle: compute the file hash before uploading and send it to the backend for verification. If the hash already exists on the backend, the file has been uploaded before, so we simply tell the user the upload succeeded instantly.

// pre-process the chunks before upload (attach the hash, etc.)
async handleUpload() {
  if (!this.container.file) return;
  // split into chunks
  const fileChunkList = this.createFileChunk(this.container.file);
  // generate the file hash
  this.container.hash = await this.calculateHash(fileChunkList);

  // hash check (verify calls the backend verification API)
  const { haveExisetd } = await verify(this.container.hash);
  if (haveExisetd) {
    this.$message.success("Instant upload: success");
    return;
  }

  this.data = fileChunkList.map(({ file }, index) => ({
    chunk: file,
    // the chunk hash is the file hash plus the chunk index
    hash: this.container.hash + "-" + index
  }));
  await this.uploadChunks();
}
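The article never shows the verify endpoint. As a rough sketch (the storage is an assumption; a real server would persist this, and verifyHash is a hypothetical name), the backend only has to look the hash up in whatever bookkeeping it keeps for finished uploads:

```javascript
// In-memory stand-ins for the server's bookkeeping: hashes of fully
// merged files, plus the chunk hashes received so far per file hash.
const mergedFiles = new Set();
const receivedChunks = new Map();

// The shape mirrors what the front end destructures from verify():
// { haveExisetd, uploadedList }
function verifyHash(fileHash) {
  return {
    haveExisetd: mergedFiles.has(fileHash),
    uploadedList: receivedChunks.get(fileHash) || []
  };
}

// simulate one finished file and one partially uploaded file
mergedFiles.add("abc123");
receivedChunks.set("def456", ["def456-0", "def456-1"]);
```

The uploadedList part is unused here but becomes the basis of resumable upload below.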

pause upload

Principle: keep every in-flight slice request in an array and remove each one as it finishes, so the array always holds only the slices that have not finished uploading. To pause, we also need to abort those in-flight requests; axios supports aborting via AbortController.
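The "strip finished requests from the list" idea can be sketched on its own (track and pending are hypothetical names, not from the article's code):

```javascript
// Keep only in-flight tasks in `pending`; each task removes itself
// from the list once it settles, so pausing only touches live requests.
const pending = [];

function track(promise) {
  const entry = { promise };
  pending.push(entry);
  promise.finally(() => {
    // remove by identity, not by index, so concurrent removals stay correct
    const i = pending.indexOf(entry);
    if (i !== -1) pending.splice(i, 1);
  });
  return promise;
}
```

Removing by identity avoids the classic bug of splicing by the original map index, which shifts after the first removal.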

Interrupt request example

const controller = new AbortController()

axios({
  signal: controller.signal
}).then(() => {});

// cancel the request
controller.abort()
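One detail worth knowing before wiring this into Promise.all: an aborted axios request rejects, so pausing makes the pending Promise.all reject too. The signal itself can be observed directly, with no axios involved (this is roughly what axios does internally):

```javascript
const controller = new AbortController();

// any signal-aware consumer can reject when the signal fires
const task = new Promise((resolve, reject) => {
  controller.signal.addEventListener("abort", () =>
    reject(new Error("aborted"))
  );
});

controller.abort(); // fires the abort event synchronously
// controller.signal.aborted is now true, and `task` rejects
task.catch(err => console.log(err.message)); // logs "aborted"
```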

Adding the pause function

// upload the chunks
async uploadChunks() {
  // requestList must live on the instance so handlePause can reach it;
  // note: each AbortController must be created *before* its request is sent —
  // attaching a signal after the fact cannot abort anything.
  // request() is assumed to forward the signal to axios.
  this.requestList = [];
  const requests = this.data
    // build a FormData per chunk
    .map(({ chunk, hash }) => {
      const formData = new FormData();
      formData.append("chunk", chunk);
      formData.append("hash", hash);
      formData.append("filename", this.container.file.name);
      return { formData };
    })
    // fire the upload requests
    .map(({ formData }) => {
      const controller = new AbortController();
      const item = { controller };
      this.requestList.push(item);
      return request(formData, controller.signal).then(() => {
        // drop finished requests so pause only aborts in-flight ones
        this.requestList = this.requestList.filter(r => r !== item);
      });
    });
  await Promise.all(requests); // wait for every chunk to finish
  await merge(this.container.file.name); // ask the server to merge the chunks
},
// pause the upload
handlePause() {
  // abort everything still in flight
  this.requestList.forEach(({ controller }) => controller.abort());
  this.requestList = [];
}

resume upload

Principle: before uploading the slices, send a request to the backend; it returns the list of slices that have already been uploaded. Filter those out by slice hash and upload only the slices the backend does not have yet.

// pre-process the chunks before upload (attach the hash, etc.)
async handleUpload() {
  if (!this.container.file) return;
  // split into chunks
  const fileChunkList = this.createFileChunk(this.container.file);
  // generate the file hash
  this.container.hash = await this.calculateHash(fileChunkList);

  // hash check (verify calls the backend verification API)
  const { haveExisetd, uploadedList } = await verify(this.container.hash);
  if (haveExisetd) {
    this.$message.success("Instant upload: success");
    return;
  }

  this.data = fileChunkList.map(({ file }, index) => ({
    chunk: file,
    // note: this is the chunk hash — the file hash plus the chunk index
    hash: this.container.hash + "-" + index
  }));
  await this.uploadChunks(uploadedList);
}


// upload the chunks
async uploadChunks(uploadedList = []) {
  // requestList must live on the instance so handlePause can reach it
  this.requestList = [];
  const requests = this.data
    // skip chunks the server already has
    .filter(({ hash }) => !uploadedList.includes(hash))
    // build a FormData per chunk
    .map(({ chunk, hash }) => {
      const formData = new FormData();
      formData.append("chunk", chunk);
      formData.append("hash", hash);
      formData.append("filename", this.container.file.name);
      return { formData };
    })
    // fire the upload requests; request() is assumed to forward the signal to axios
    .map(({ formData }) => {
      const controller = new AbortController();
      const item = { controller };
      this.requestList.push(item);
      return request(formData, controller.signal).then(() => {
        // drop finished requests so pause only aborts in-flight ones
        this.requestList = this.requestList.filter(r => r !== item);
      });
    });
  await Promise.all(requests); // wait for every chunk to finish
  // verify that every chunk is accounted for before asking the server to merge
  if (uploadedList.length + requests.length === this.data.length) {
    await merge(this.container.file.name); // ask the server to merge the chunks
  }
},

// pause the upload
handlePause() {
  // abort everything still in flight
  this.requestList.forEach(({ controller }) => controller.abort());
  this.requestList = [];
},

// resume the upload
async handleRecovery() {
  // fetch the list of already-uploaded chunks (verify calls the backend API)
  const { uploadedList } = await verify(this.container.hash);
  await this.uploadChunks(uploadedList);
}

Summary of the added features

1. Instant upload is really just a simple check: the file hash is sent to the backend, which checks whether the file already exists and returns the result; if it does, the user is told the upload finished instantly.

2. Resumable upload has two parts: pausing and resuming. Pausing works by keeping the list of slices that have not finished uploading (completed requests are stripped from the full request list) and aborting the requests in that list. Resuming is, in essence, another check: before uploading, the file hash is sent to the backend, which returns the list of slices already uploaded; those are filtered out of the slice list by slice hash, and only the slices not yet uploaded are sent.
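The filtering step in point 2 is simple enough to test in isolation (pickMissing is a hypothetical name for the .filter inside uploadChunks):

```javascript
// Given every chunk and the list the server already has,
// keep only the chunks that still need uploading.
function pickMissing(chunks, uploadedList) {
  return chunks.filter(({ hash }) => !uploadedList.includes(hash));
}

const chunks = [
  { hash: "abc-0" },
  { hash: "abc-1" },
  { hash: "abc-2" }
];
// the server reports abc-1 as already uploaded
const missing = pickMissing(chunks, ["abc-1"]);
// → [{ hash: "abc-0" }, { hash: "abc-2" }]
```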


Origin blog.csdn.net/m0_64023259/article/details/124318137