【Java】--File upload/download and storage cases

I. Introduction

Recently I had a requirement to upload, download, and store files. While implementing it, I ran into a few design questions worth thinking through.
There are two kinds of upload we usually encounter: importing a file that is already prepared, and fetching a file from a network URL. File size matters too: transferring a large file over the network can take a long time and may be interrupted midway.
The case breaks down as follows:
Scenarios:
(1) Upload by network file URL;
(2) Upload by file import;
Questions:
(1) How should large files be uploaded?
(2) How should an uploaded file be managed? For example, file type, file size, number of pages, etc.

1. Solution ideas

For a network file URL, how can a large file be loaded?
Common solutions found online include:
(a) Downloading the file in pieces, merging them, and then doing the business processing
(b) Temporarily saving the file to disk, and then doing the business processing

Method (a) opens a new connection for each piece and skips to the required byte offset. Each thread fetches only the bytes it is responsible for and saves them locally as a small file, and the main thread then splices the pieces together.
Disadvantage: every thread has to connect to the network URL separately just to fetch its own bytes, and each connection costs time, so this is not ideal. It also requires that the file can be downloaded in parts and spliced back together.
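As a rough illustration of method (a), the sketch below only computes the byte ranges that each worker thread would request. It assumes the total size is already known (for example from a `Content-Length` header) and that the server honors HTTP `Range` requests; the class and method names are hypothetical and not part of the original project.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a download of known size into inclusive byte ranges. */
final class RangeSplitter {

    /** One inclusive byte range, e.g. bytes 0 through 332. */
    static final class ByteRange {
        final long start;
        final long end;

        ByteRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        /** Value to send in the HTTP "Range" request header. */
        String toHeaderValue() {
            return "bytes=" + start + "-" + end;
        }
    }

    /** Divides [0, totalSize) into at most `parts` contiguous ranges; the last range absorbs the remainder. */
    static List<ByteRange> split(long totalSize, int parts) {
        if (totalSize <= 0 || parts <= 0) {
            throw new IllegalArgumentException("totalSize and parts must be positive");
        }
        long chunk = Math.max(1, totalSize / parts);
        List<ByteRange> ranges = new ArrayList<>();
        long start = 0;
        while (start < totalSize) {
            boolean last = ranges.size() == parts - 1;
            long end = last ? totalSize - 1 : Math.min(start + chunk, totalSize) - 1;
            ranges.add(new ByteRange(start, end));
            start = end + 1;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 1000-byte file fetched by 3 workers.
        for (ByteRange r : split(1000, 3)) {
            System.out.println(r.toHeaderValue());
        }
        // prints bytes=0-332, bytes=333-665, bytes=666-999 on separate lines
    }
}
```

Each worker would then set its header value on its own connection, for example with `HttpURLConnection.setRequestProperty("Range", value)`, and write the response to a part file. A server that supports ranges answers with status 206 (Partial Content); if it answers 200 with the full body, splitting brings no benefit.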

How should the file be managed? For example, file type, file size, number of pages, etc.
You can use components such as itextpdf to parse the different file types, extract the corresponding information, and save it.
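Before a parsing library is involved, the type and size can already be read with the JDK alone. The sketch below guesses a coarse type from the leading "magic" bytes and reads the size from the file system; `FileInspector` and its methods are hypothetical names, and `readNBytes` assumes Java 11 or later. The page count is left out because it needs a format-specific parser (for PDF, iText 5's `PdfReader.getNumberOfPages()` is one option).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: collects basic metadata without any third-party library. */
final class FileInspector {

    /** Guesses a coarse file type from the first bytes of the content. */
    static String sniffType(byte[] head) {
        if (startsWith(head, new byte[] {'%', 'P', 'D', 'F', '-'})) {
            return "pdf";
        }
        if (startsWith(head, new byte[] {'P', 'K', 3, 4})) {
            return "zip"; // .docx/.xlsx/.pptx are ZIP containers
        }
        if (startsWith(head, new byte[] {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0})) {
            return "ole2"; // legacy .doc/.xls/.ppt container
        }
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns "type,sizeInBytes" for a file on disk. */
    static String describe(Path file) throws IOException {
        byte[] head;
        try (InputStream in = Files.newInputStream(file)) {
            head = in.readNBytes(8); // Java 11+
        }
        return sniffType(head) + "," + Files.size(file);
    }
}
```

Sniffing the content rather than trusting the file extension is a deliberate choice here, since an uploaded file name can say anything.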

How can a large file be searched, and how can a hit be located to a specific page?
For a PDF file, for example, use itextpdf to read the file page by page and convert each page's stream to text, then store that text in ES (Elasticsearch) so it can be searched. [To be implemented in code...]
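The per-page idea can be sketched without ES: store one record per page, so that a hit carries its page number with it. In the sketch below an in-memory list stands in for the search index, and the page texts are assumed to be extracted already (with iText 5 this could be `PdfTextExtractor.getTextFromPage(reader, pageNumber)`). `PageIndex` and `PageDoc` are hypothetical names, not code from the original project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical in-memory stand-in for a per-page search index. */
final class PageIndex {

    /** One searchable unit: the text of a single page of a single file. */
    static final class PageDoc {
        final String fileId;
        final int pageNumber; // 1-based
        final String text;

        PageDoc(String fileId, int pageNumber, String text) {
            this.fileId = fileId;
            this.pageNumber = pageNumber;
            this.text = text;
        }
    }

    private final List<PageDoc> docs = new ArrayList<>();

    /** Adds one record per page; the list position determines the page number. */
    void addFile(String fileId, List<String> pageTexts) {
        for (int i = 0; i < pageTexts.size(); i++) {
            docs.add(new PageDoc(fileId, i + 1, pageTexts.get(i)));
        }
    }

    /** Returns the 1-based page numbers of the file whose text contains the keyword, ignoring case. */
    List<Integer> findPages(String fileId, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<Integer> pages = new ArrayList<>();
        for (PageDoc doc : docs) {
            if (doc.fileId.equals(fileId) && doc.text.toLowerCase(Locale.ROOT).contains(needle)) {
                pages.add(doc.pageNumber);
            }
        }
        return pages;
    }
}
```

In a real setup each `PageDoc` would presumably become one ES document with `fileId`, `pageNumber`, and `text` fields, and the substring scan would be replaced by a full-text query.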

What middleware should store the files so that they are easy to download?
This solution uses MongoDB for storage.

Can the text of documents such as PDF/Word files be searched?
[Solutions to this problem to be considered later...]

2. File upload/download code implementation

pom.xml dependencies

        <!-- For working with Office documents -->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>3.8</version>
        </dependency>
        <!-- For working with Word -->
        <!-- https://mvnrepository.com/artifact/net.sf.jacob-project/jacob -->
        <dependency>
            <groupId>net.sf.jacob-project</groupId>
            <artifactId>jacob</artifactId>
            <version>1.14.3</version>
        </dependency>
        <!-- For working with PDF -->
        <!-- https://mvnrepository.com/artifact/com.itextpdf/itextpdf -->
        <dependency>
            <groupId>com.itextpdf</groupId>
            <artifactId>itextpdf</artifactId>
            <version>5.5.13.2</version>
        </dependency>

Configuration in the properties file: if a file is too large, uploading it through the import method fails with an error, because the default multipart upload limit (in Spring Boot, 1MB per file and 10MB per request) is too small. Here the limit is raised to 200MB.

# Set the maximum size of uploaded files
spring.servlet.multipart.max-file-size=200MB
spring.servlet.multipart.max-request-size=200MB

1. Upload a network file URL

1.1. Download the entire file to a local temporary directory of the service

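    @RequestMapping("/uploadFileUrl.do")
    @ResponseBody
    public Object uploadFileUrl() {
        // URL of the remote file to fetch
        String filePath = "https://*******/解决方法.pdf";
        // Save into the classpath root of the running service
        String targetPath = this.getClass().getClassLoader().getResource("").getPath();
        // Drop the leading "/" (presumably for Windows paths such as "/C:/...")
        targetPath = targetPath.substring(1);
        return operatorFileService.uploadFileUrl(filePath, targetPath, "解决方法.pdf");
    }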
    // Uses RestTemplate (org.springframework.web.client), the types in org.springframework.http,
    // and org.springframework.core.io.Resource; saveFileToPath is not shown in this excerpt.
    public String uploadFileUrl(String url, String targetPath, String fileName) {
        long beginTime = System.currentTimeMillis();

        RestTemplate restTemplate = new RestTemplate();
        HttpHeaders headers = new HttpHeaders();
        MediaType type = MediaType.parseMediaType("application/json; charset=UTF-8");
        headers.setContentType(type);
        // headers.add("Authorization", "token");
        HttpEntity<Resource> httpEntity = new HttpEntity<>(headers);
        /*
         * Drawback: if the file is large, the download is slow.
         * How can it be downloaded with multiple threads?
         * TODO: think about this
         */
        ResponseEntity<byte[]> byRes = restTemplate.exchange(url, HttpMethod.GET, httpEntity, byte[].class);
        // The whole response body is held in memory as a byte array
        byte[] body = byRes.getBody();
        String pathUrl = saveFileToPath(body, targetPath, fileName);
        // Elapsed time is in milliseconds, not seconds
        long elapsedMillis = System.currentTimeMillis() - beginTime;
        log.info("uploadFileUrl elapsed ms:{}", elapsedMillis);
        return pathUrl;
    }
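
The method above reads the whole response into a `byte[]` before writing it, so a 200MB file occupies 200MB of heap. A possible alternative, sketched below with only the JDK, copies the stream straight to disk. `StreamingSaver` and `saveStream` are hypothetical names; in real use the input would be the HTTP response stream (for example from `new URL(url).openStream()`) rather than the in-memory demo stream used in `main`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: writes a stream to disk without buffering it all in memory. */
final class StreamingSaver {

    /** Copies the stream to targetDir/fileName, replacing any existing file, and returns the path. */
    static Path saveStream(InputStream in, Path targetDir, String fileName) throws IOException {
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(fileName);
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("upload-demo");
        // Demo input only; a real caller would pass the network response stream.
        try (InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8))) {
            Path saved = saveStream(in, dir, "demo.txt");
            System.out.println(saved + " (" + Files.size(saved) + " bytes)");
        }
    }
}
```

With `RestTemplate`, the same effect should be achievable through `execute(...)` with a `ResponseExtractor` that passes `response.getBody()` to a method like `saveStream`, instead of asking for `byte[].class`.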

Origin blog.csdn.net/xunmengyou1990/article/details/131031877