Implementing resumable large-file uploads with front-end slicing


If a file upload is interrupted partway through (by a network failure, a server error, a client crash, and so on), the next upload of the same file can resume from the last uploaded position instead of starting over, which saves upload time.

github-demo

Implementation ideas

  • Split the file into multiple slices
  • Upload the slices
  • Merge the slices
  • Validate the file slices
  • Switch upload state (pause/resume)

1. How to slice the file at the front end

Before each upload we need to generate a unique fileHash for the file so that we can tell whether a previous upload of the same file was interrupted. If the hash matches and a breakpoint exists, the upload continues from where it stopped; if the file was already uploaded in full, there is no need to upload it again.

Use the slice method of the Blob object to cut the file. A file can be cut into chunks of a custom size and count.

Blob.slice
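
As a minimal illustration of the slicing step on its own (the 5 MB chunk size here is an assumption, not necessarily the demo's value), it boils down to calling file.slice repeatedly:

// split a File (which is a Blob) into fixed-size pieces
const SIZE = 5 * 1024 * 1024; // assumed 5 MB per chunk
const chunks = [];
for (let cur = 0; cur < file.size; cur += SIZE) {
  // slice(start, end) returns a new Blob covering the byte range [start, end)
  chunks.push(file.slice(cur, cur + SIZE));
}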

Create a createFileChunk function

// SIZE is the chunk size constant and fileHash is a ref defined elsewhere;
// SparkMd5 comes from the spark-md5 package (e.g. import SparkMd5 from "spark-md5")
const createFileChunk = async (file, size = SIZE) => {
  const fileChunkList = [];
  const spark = new SparkMd5.ArrayBuffer();
  const reader = new FileReader();

  for (let cur = 0; cur < file.size; cur += size) {
    const data = { file: file.slice(cur, cur + size) };
    fileChunkList.push(data);
    // read each chunk and feed it to spark-md5 so the final hash covers the whole file
    reader.readAsArrayBuffer(data.file);
    await new Promise((resolve) => {
      reader.onload = (e) => {
        spark.append(e.target.result);
        resolve();
      };
    });
  }
  fileHash.value = spark.end(); // hex MD5 of the whole file content
  return fileChunkList;
};

spark-md5 is an efficient MD5 hashing library; for details, see the spark-md5 documentation. Hashing the file content with spark-md5 gives us a hash value that serves as the file's unique identifier for the upload.
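
The spark-md5 usage inside createFileChunk boils down to its incremental ArrayBuffer API, sketched here on its own (the buffer variable names are illustrative):

// incremental hashing: append each chunk's ArrayBuffer, then read the digest
const spark = new SparkMd5.ArrayBuffer();
spark.append(firstChunkBuffer);  // an ArrayBuffer read from the first chunk
spark.append(secondChunkBuffer); // ...and so on for every chunk, in order
const hash = spark.end();        // hex MD5 of everything appended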

Create a handleUpload function

const handleUpload = async () => {
  if (!uploadFile.value) return;
  loading.value = true;
  const fileChunkList = await createFileChunk(uploadFile.value);
  // ask the server whether this file (or part of it) has already been uploaded
  let vertifyRes = await request({
    url: "/vertify",
    method: "post",
    data: {
      fileHash: fileHash.value,
      filename: fileHash.value + "." + getSuffix(uploadFile.value.name),
    },
  });
  if (!vertifyRes.data.shouldUpload) {
    // the complete file already exists on the server: nothing to upload
    alert(vertifyRes.msg);
    uploadPercentage.value = 100;
    loading.value = false;
    return;
  }
  // build the per-chunk upload descriptors
  data.value = fileChunkList.map(({ file }, index) => ({
    chunk: file,
    hashPrefix: fileHash.value,
    suffix: getSuffix(uploadFile.value.name),
    hash: fileHash.value + "-" + index,
    index: index,
    percentage: 0,
  }));
  await uploadChunk(vertifyRes.data.uploadList);
  loading.value = false;
};
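
handleUpload relies on some reactive state, a getSuffix helper, and a request wrapper around axios that are not shown above. Assuming Vue 3's composition API (which the .value refs suggest), a minimal sketch of what they might look like is below; these are assumptions based on how they are used, not the demo's exact code:

import { ref } from "vue";
import axios from "axios";

// reactive state referenced by the upload code (assumed shapes)
const uploadFile = ref(null);        // the File chosen by the user
const fileHash = ref("");            // MD5 of the whole file, set in createFileChunk
const data = ref([]);                // per-chunk descriptors built in handleUpload
const aborts = ref([]);              // axios cancel functions for in-flight requests
const loading = ref(false);
const uploadPercentage = ref(0);
const showStopAndResume = ref(false);

// extract the file extension, e.g. "video.mp4" -> "mp4"
const getSuffix = (filename) => filename.slice(filename.lastIndexOf(".") + 1);

// thin wrapper over axios that returns the response body;
// the demo's own wrapper may add error handling, a different base URL, etc.
const request = (config) =>
  axios({ baseURL: "http://localhost:3000", ...config }).then((r) => r.data);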

Before uploading the slices, the front end first asks the server whether a file with this hash already exists. If it does, the same file has already been uploaded and nothing more needs to be sent; only when the check says the file is missing do we actually upload. The verify interface also returns the hashes of the slices that have already been uploaded, so the front end can filter them out and send only the rest, which improves upload efficiency.
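
For reference, the response shape the front end expects from /vertify looks roughly like this, matching the backend implementation shown later:

// response when the file is missing but some chunks were already uploaded
// {
//   code: 200,
//   data: {
//     shouldUpload: true,
//     uploadList: ["<fileHash>-0", "<fileHash>-1"]  // chunk names already on the server
//   },
//   msg: "request success"
// }
// when the complete file already exists:
// { code: 200, data: { shouldUpload: false }, msg: "file has already been uploaded" }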

On the front end, all that remains is to filter out the already-uploaded slice hashes and upload the remaining slices.

// upload slices
const uploadChunk = async (uploadList = []) => {
  const requestList = data.value
    // skip chunks the server already has
    .filter(({ hash }) => !uploadList.includes(hash))
    .map(({ chunk, hash, index, hashPrefix }) => {
      const formData = new FormData();
      formData.append("chunk", chunk);
      formData.append("hash", hash);
      formData.append("filename", uploadFile.value.name);
      formData.append("index", index);
      formData.append("hashPrefix", hashPrefix);
      return { formData, index };
    })
    .map(async ({ formData, index }) => {
      return request({
        url: "/upload_chunk",
        data: formData,
        onUploadProgress: onPregress.bind(null, data.value[index]),
        // keep the cancel function so the upload can be paused later
        cancelToken: new axios.CancelToken((c) => {
          aborts.value.push(c);
        }),
        headers: {
          "Content-Type": "multipart/form-data",
        },
        method: "post",
      });
    });
  showStopAndResume.value = true;
  await Promise.all(requestList);
  // merge slices once every chunk has been uploaded
  await mergeRequest();
  aborts.value = [];
};
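
Two pieces used above are not shown: the per-chunk progress handler onPregress and the pause/resume logic built on the stored cancel functions. A minimal sketch follows; the handler bodies and the handlePause/handleResume names are assumptions, not necessarily the demo's code:

// per-chunk progress handler, bound to each chunk descriptor via onUploadProgress
const onPregress = (item, e) => {
  item.percentage = parseInt(String((e.loaded / e.total) * 100));
};

// pause: cancel every in-flight chunk request
const handlePause = () => {
  aborts.value.forEach((cancel) => cancel("upload paused"));
  aborts.value = [];
};

// resume: ask the server which chunks it already has, then upload only the rest
const handleResume = async () => {
  const vertifyRes = await request({
    url: "/vertify",
    method: "post",
    data: {
      fileHash: fileHash.value,
      filename: fileHash.value + "." + getSuffix(uploadFile.value.name),
    },
  });
  await uploadChunk(vertifyRes.data.uploadList);
};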

The mergeRequest call here tells the backend to merge the slices once all of them have been uploaded successfully.

const mergeRequest = async () => {
  await request({
    url: "/merge_chunk",
    headers: {
      "Content-Type": "application/json",
    },
    method: "post",
    data: {
      originName: uploadFile.value.name,
      filename: fileHash.value + "." + getSuffix(uploadFile.value.name),
      // size is the chunk size; the server uses it to compute each chunk's write offset
      size: SIZE,
    },
  });
};

With that, the core front-end code is basically complete.

2. How the backend handles the sliced upload

The backend is implemented with express for now.

Based on the front end's requirements, the following interfaces need to be implemented (a minimal sketch of the server scaffolding follows the list):

  • /upload_chunk
  • /merge_chunk
  • /vertify
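
Before looking at each handler, here is a minimal sketch of the scaffolding they all assume; the package choices beyond express/fs-extra/multiparty, the port, and the UPLOAD_DIR name are assumptions, and the demo's own setup may differ:

// basic express setup shared by the handlers below
const express = require("express");
const path = require("path");
const fse = require("fs-extra");
const multiparty = require("multiparty");
const cors = require("cors");

const app = express();
// directory where chunks and merged files are stored ("target" is an assumed name)
const UPLOAD_DIR = path.resolve(__dirname, "target");

app.use(cors());         // let the front-end dev server call this API
app.use(express.json()); // parse JSON bodies for /merge_chunk and /vertify

app.listen(3000, () => console.log("upload server listening on port 3000"));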

Implementation ideas

1. /upload_chunk

Use the relevant fs APIs to move the uploaded temporary chunk file into the directory for this file's hash.

app.post("/upload_chunk", (req, res) => {
  const multipart = new multiparty.Form({
    maxFieldsSize: 200 * 1024 * 1024,
  });
  multipart.parse(req, async (err, fields, files) => {
    if (err) {
      res.send({
        code: 500,
        msg: "server error",
      });
      return;
    }
    const [chunk] = files.chunk;
    const [hash] = fields.hash;
    const [hashPrefix] = fields.hashPrefix;
    // all chunks of one file go into a directory named after the file hash
    const chunkDir = path.resolve(UPLOAD_DIR, hashPrefix);
    if (!fse.existsSync(chunkDir)) {
      await fse.mkdirs(chunkDir);
    }
    // move the temporary upload into the chunk directory, named by its hash-index
    await fse.move(chunk.path, `${chunkDir}/${hash}`);
    res.send({
      code: 200,
      msg: "chunk uploaded successfully",
    });
  });
});
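
To make the on-disk result concrete, after a few chunks have been uploaded the layout under UPLOAD_DIR looks roughly like this (the names are illustrative):

// UPLOAD_DIR/
// └── <fileHash>/            chunk directory created by /upload_chunk
//     ├── <fileHash>-0
//     ├── <fileHash>-1
//     └── <fileHash>-2
// after /merge_chunk the chunks are combined into UPLOAD_DIR/<fileHash>.<suffix>
// and the chunk directory is removed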

2. /merge_chunk

Merge the uploaded chunks in order of the index in their hash-index names; the core API is fs.createReadStream().

// merge slices
const mergeFileChunk = async (filePath, filename, size) => {
  // the chunk directory is named after the file hash (the filename without its extension)
  const chunkDir = path.resolve(
    UPLOAD_DIR,
    filename.slice(0, filename.lastIndexOf("."))
  );
  const chunkPaths = await fse.readdir(chunkDir);
  // sort by slice index, otherwise the order returned by readdir may be wrong
  chunkPaths.sort((a, b) => a.split("-")[1] - b.split("-")[1]);
  const pipeP = chunkPaths.map((chunkPath, index) =>
    pipeStream(
      path.resolve(chunkDir, chunkPath),
      // each chunk is written at its own offset in the target file
      fse.createWriteStream(filePath, {
        start: index * size,
        end: (index + 1) * size,
      })
    )
  );
  await Promise.all(pipeP);
  fse.rmdirSync(chunkDir); // remove the chunk directory after merging
};
const pipeStream = (path, writeStream) => {
  return new Promise((resolve) => {
    const readStream = fse.createReadStream(path);
    readStream.on("end", () => {
      fse.unlinkSync(path); // delete the chunk file once it has been written
      resolve();
    });
    readStream.pipe(writeStream);
  });
};
app.post("/merge_chunk", async (req, res) => {
  // size is the size of each chunk, sent by the front end
  const { filename, size } = req.body;
  const filePath = path.resolve(UPLOAD_DIR, `${filename}`);
  await mergeFileChunk(filePath, filename, size);
  res.send({
    code: 200,
    msg: "file merged success",
  });
});

For each slice we create a read stream from the chunk path and pipe it into a write stream positioned at that chunk's offset in the target file. Once every slice has been written, the complete file has been merged successfully.

3. /vertify

This interface is relatively simple. It checks whether the current file already exists in the server's resource directory. If it does, it returns shouldUpload: false. If it does not, it also returns the list of slices already uploaded for this file, so the front end can skip uploading them again.

// list the chunk names already uploaded for this file hash (empty if none)
const createUploadedList = async (fileHash) =>
  fse.existsSync(path.resolve(UPLOAD_DIR, fileHash))
    ? await fse.readdir(path.resolve(UPLOAD_DIR, fileHash))
    : [];

app.post("/vertify", async (req, res) => {
  const { filename, fileHash } = req.body;
  const filePath = path.resolve(UPLOAD_DIR, filename);
  if (fse.existsSync(filePath)) {
    // the merged file already exists: tell the client not to upload anything
    res.send({
      code: 200,
      data: {
        shouldUpload: false,
      },
      msg: "file has already been uploaded",
    });
    return;
  }
  res.send({
    code: 200,
    data: {
      shouldUpload: true,
      uploadList: await createUploadedList(fileHash),
    },
    msg: "request success",
  });
});

At this point, the basic work of uploading a large file is done. The core idea is slicing: a large file is given a unique identifier and uploaded as slices, and the server caches the slices before merging them. This makes it possible to resume an interrupted upload and to upload large files reliably.


Origin: juejin.im/post/7233613362888966205