Use webflux to improve data export efficiency

preface

This article looks at how to use WebFlux to improve the efficiency of data export.

traditional export

    @GetMapping("/download-old")
    public ResponseEntity<Resource> downloadInOldWays(){
        return ResponseEntity.ok()
                .header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=demo.xls")
                .header("Accept-Ranges", "bytes")
                .body(new ByteArrayResource(exportBytes(1000)));
    }
    
    public byte[] exportBytes(int dataRow){
        StringBuilder output = new StringBuilder();
        output.append(ExcelUtil.startWorkbook());
        output.append(ExcelUtil.startSheet());
        output.append(ExcelUtil.startTable());
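        // Note: a HashSet does not guarantee iteration order, so the title column order is not fixed.
        output.append(ExcelUtil.writeTitleRow(Sets.newHashSet("title","content")));
        IntStream.rangeClosed(1,dataRow).forEach(i -> {
            try {
                // Simulate 100 ms of work to prepare each row.
                TimeUnit.MILLISECONDS.sleep(100);
            } catch (InterruptedException e) {
                // Restore the interrupt flag instead of swallowing the interruption.
                Thread.currentThread().interrupt();
            }
            output.append(ExcelUtil.writeDataRow(Lists.newArrayList("title"+i,"content"+i)));
        });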
        output.append(ExcelUtil.endTable());
        output.append(ExcelUtil.endSheet());
        output.append(ExcelUtil.endWorkbook());
        return output.toString().getBytes(StandardCharsets.UTF_8);
    }

This simulates waiting for all the data to be ready before exporting, which is slow: with 1000 rows at 100 ms each, it takes almost 100 seconds before the browser shows the download dialog. If a gateway sits in front of the service, the request can easily time out at the gateway.

webflux export

    @GetMapping("/download")
    public Mono<Void> downloadByWriteWith(ServerHttpResponse response) {
        response.getHeaders().set(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=demo.xls");
        response.getHeaders().add("Accept-Ranges", "bytes");
        Flux<DataBuffer> flux = excelService.export(1000);
        return response.writeWith(flux);
    }

    public Flux<DataBuffer> export(int dataRow){
        return Flux.create(sink -> {
            sink.next(stringBuffer(ExcelUtil.startWorkbook()));
            sink.next(stringBuffer(ExcelUtil.startSheet()));
            sink.next(stringBuffer(ExcelUtil.startTable()));

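            // Write the title row.
            sink.next(stringBuffer(ExcelUtil.writeTitleRow(Sets.newHashSet("title","content"))));
            // Write the data rows, emitting each one as soon as it is ready.
            IntStream.rangeClosed(1,dataRow).forEach(i -> {
                try {
                    // Simulate 100 ms of work per row; this sleep blocks the subscribing thread.
                    TimeUnit.MILLISECONDS.sleep(100);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag instead of swallowing the interruption.
                    Thread.currentThread().interrupt();
                }
                sink.next(stringBuffer(ExcelUtil.writeDataRow(Lists.newArrayList("title"+i,"content"+i))));
            });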

            sink.next(stringBuffer(ExcelUtil.endTable()));
            sink.next(stringBuffer(ExcelUtil.endSheet()));
            sink.next(stringBuffer(ExcelUtil.endWorkbook()));

            sink.complete();
        });
    }

Here, the writeWith(Publisher<? extends DataBuffer> body) method of ReactiveHttpOutputMessage is used to export the data while it is still being prepared. The download dialog appears after a little more than ten seconds. From then on the server keeps writing while the browser downloads, and the download completes in about 100 seconds.
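
The difference between the two exports can be illustrated with a small model that uses only the JDK. It is a sketch, not the article's code: `FirstChunkModel`, `RowSource`, and the `Consumer<String>` sink are hypothetical stand-ins for the controller, the data preparation, and the Flux sink. It counts how many rows must be prepared before the client can receive its first chunk, and it ignores any buffering done by the server or the network, so it does not reproduce the ten-odd seconds observed above.

```java
import java.util.function.Consumer;

final class FirstChunkModel {

    /** Hypothetical stand-in for row preparation; counts how many rows were prepared. */
    static final class RowSource {
        int prepared = 0;

        String next() {
            prepared++;
            return "title" + prepared + ",content" + prepared;
        }
    }

    /** Buffered style: prepare every row, then hand the whole body over at once. */
    static int rowsBeforeFirstChunkBuffered(int dataRow) {
        RowSource source = new RowSource();
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < dataRow; i++) {
            body.append(source.next()).append('\n');
        }
        // Only at this point could the body be sent to the client.
        return source.prepared;
    }

    /** Streaming style: hand each row to the sink as soon as it is prepared. */
    static int rowsBeforeFirstChunkStreaming(int dataRow) {
        RowSource source = new RowSource();
        int[] preparedAtFirstChunk = {-1};
        Consumer<String> sink = chunk -> {
            // Record how much work had been done when the first chunk was emitted.
            if (preparedAtFirstChunk[0] < 0) {
                preparedAtFirstChunk[0] = source.prepared;
            }
        };
        for (int i = 0; i < dataRow; i++) {
            sink.accept(source.next());
        }
        return preparedAtFirstChunk[0];
    }

    public static void main(String[] args) {
        int rows = 1000;
        System.out.println("buffered:  " + rowsBeforeFirstChunkBuffered(rows) + " rows before first chunk");
        System.out.println("streaming: " + rowsBeforeFirstChunkStreaming(rows) + " rows before first chunk");
    }
}
```

In this model the total amount of work is identical in both styles; only the point at which the first chunk becomes available changes, which is why the streaming export avoids the long silent wait.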

summary

Both approaches take about the same total time, but the WebFlux version avoids timeouts because the response starts long before all the data is ready. A similar effect can be achieved with traditional MVC by continuously writing to the response's output stream and flushing it. However, WebFlux can be combined with a reactive repository to build an end-to-end reactive stream and avoid OOM.
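
The write-and-flush alternative for traditional MVC can be sketched with plain `java.io`. This is a minimal illustration under stated assumptions: in a real Spring MVC controller the `OutputStream` would be the servlet response's stream (for example from `HttpServletResponse.getOutputStream()` or a `StreamingResponseBody`), while `FlushingExport`, `CountingStream`, and the row format below are hypothetical and exist only so the sketch runs without a servlet container.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class FlushingExport {

    /** In-memory stream that counts flush() calls, standing in for the response stream. */
    static final class CountingStream extends ByteArrayOutputStream {
        int flushes = 0;

        @Override
        public void flush() throws IOException {
            flushes++;
            super.flush();
        }
    }

    /** Writes each row as soon as it is ready and flushes it toward the client. */
    static void export(OutputStream out, int dataRow) throws IOException {
        write(out, "<table>\n");
        for (int i = 1; i <= dataRow; i++) {
            // A real export would fetch or compute the row here (the article simulates 100 ms per row).
            write(out, "<row>title" + i + ",content" + i + "</row>\n");
            out.flush();
        }
        write(out, "</table>\n");
        out.flush();
    }

    private static void write(OutputStream out, String text) throws IOException {
        out.write(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        CountingStream out = new CountingStream();
        export(out, 3);
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
        System.out.println("flushes: " + out.flushes);
    }
}
```

Flushing after every row keeps the time to first byte short, at the cost of more, smaller writes; flushing every N rows would be a reasonable compromise for large exports.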
