CompletableFuture: proper way to run a list of futures, wait for result and handle exception

AnMi :

I have legacy code that makes a dozen database calls to populate a report. It takes a noticeable amount of time, which I am trying to reduce using CompletableFuture.

I have some doubts about whether I am doing things correctly and not overusing this technology.

My code now looks like this:

  1. Start asynchronous population of the document sections, with many database calls inside each method

    CompletableFuture section1Future = CompletableFuture.supplyAsync(() -> populateSection1(arguments));
    CompletableFuture section2Future = CompletableFuture.supplyAsync(() -> populateSection2(arguments));
        ...
    CompletableFuture section10Future = CompletableFuture.supplyAsync(() -> populateSection10(arguments));
    
  2. Then I arrange the futures in a specific order in a list and join all of them, to make sure that my code continues only when all futures have finished.

    List<CompletableFuture> futures = Arrays.asList(
                section1Future,
                section2Future, ...
                section10Future);
    
    List<Object> futureResults = futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    
  3. Then I populate the PDF document itself with its pieces

    Optional.ofNullable((PdfPTable) futureResults.get(0)).ifPresent(el -> populatePdfElement(document, el));
    Optional.ofNullable((PdfPTable) futureResults.get(1)).ifPresent(el -> populatePdfElement(document, el));
        ...
    Optional.ofNullable((PdfPTable) futureResults.get(9)).ifPresent(el -> populatePdfElement(document, el));
    

    return document;

My concerns are:

1) Is it okay to create many CompletableFutures in this way? That is, to order them in the required sequence in a list, join them to make sure that they have all finished, and then get each result by casting it to a specific type?

2) Is it okay to run without specifying an executor service and rely on the common ForkJoinPool? This code runs in a web container, so in order to use JTA I probably need to use a container-provided thread pool executor obtained via JNDI?

3) If this code is surrounded by a try-catch, I should be able to catch CompletionException in the main thread, right? Or, in order to do that, should I declare each future like the following:

CompletableFuture.supplyAsync(() -> populateSection1(arguments))
    .exceptionally (ex -> {
                    throw new RuntimeException(ex.getCause());
        });

4) Is it possible to overuse CompletableFutures so that they become a performance bottleneck themselves? For example, many futures waiting for one executor before they can start running? How can that be avoided? By using a container-provided executor service? If yes, could someone please point me to some best practices on how to correctly configure an executor service, taking into account the number of processors and the amount of memory?

5) Memory impact. I read in a parallel thread that there can be a problem with OOME, as many objects are created and garbage collected. Is there a best practice for calculating the amount of memory an application requires?

Holger :

The approach is not wrong in general, but there are things to improve.

Most notably, you should not use raw types, like CompletableFuture.

When populateSection… returns a PdfPTable, you should use CompletableFuture<PdfPTable> consistently throughout the code.

I.e.

CompletableFuture<PdfPTable> section1Future = CompletableFuture.supplyAsync(()  -> populateSection1(arguments));
CompletableFuture<PdfPTable> section2Future = CompletableFuture.supplyAsync(()  -> populateSection2(arguments));
    ...
CompletableFuture<PdfPTable> section10Future = CompletableFuture.supplyAsync(() -> populateSection10(arguments));

Even if these methods do not declare the return type that you assume is always returned at runtime, you should insert the type cast at this early stage:

CompletableFuture<PdfPTable> section1Future = CompletableFuture.supplyAsync(()  -> (PdfPTable)populateSection1(arguments));
CompletableFuture<PdfPTable> section2Future = CompletableFuture.supplyAsync(()  -> (PdfPTable)populateSection2(arguments));
    ...
CompletableFuture<PdfPTable> section10Future = CompletableFuture.supplyAsync(() -> (PdfPTable)populateSection10(arguments));

Then, you can use

Stream.of(section1Future, section2Future, ..., section10Future)
    .map(CompletableFuture::join)
    .filter(Objects::nonNull)
    .forEachOrdered(el -> populatePdfElement(document, el));

By not using raw types, you already get the desired result type and you can do the 3rd step’s operations, i.e. filtering and performing the final action, right in this stream operation.

If you still need the list, you may use

List<PdfPTable> results = Stream.of(section1Future, section2Future, ..., section10Future)
    .map(CompletableFuture::join)
    .filter(Objects::nonNull)
    .collect(Collectors.toList());

results.forEach(el -> populatePdfElement(document, el));
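
For readers who want to try the pattern in isolation, here is a minimal runnable sketch. It is not the original report code: String stands in for PdfPTable, and the three populateSection… methods are hypothetical placeholders, one of which returns null to show the filter step.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TypedFuturesSketch {

    // Hypothetical stand-ins for the populateSection… methods; String replaces PdfPTable.
    static String populateSection1() { return "section 1"; }
    static String populateSection2() { return null; } // a section with nothing to show
    static String populateSection3() { return "section 3"; }

    static List<String> collectSections() {
        // Start every task before joining any of them, so they can run concurrently.
        CompletableFuture<String> f1 = CompletableFuture.supplyAsync(TypedFuturesSketch::populateSection1);
        CompletableFuture<String> f2 = CompletableFuture.supplyAsync(TypedFuturesSketch::populateSection2);
        CompletableFuture<String> f3 = CompletableFuture.supplyAsync(TypedFuturesSketch::populateSection3);

        // join() blocks until the respective future has completed;
        // the stream keeps the order in which the futures were listed.
        return Stream.of(f1, f2, f3)
                .map(CompletableFuture::join)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(collectSections()); // prints [section 1, section 3]
    }
}
```

Because the futures are typed, no cast is needed when the results are consumed, and the null result is dropped by the filter rather than by an Optional wrapper.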

That said, the parallelism depends on the thread pool used for the operation (specified to supplyAsync). When you don’t specify an executor, you get the default Fork/Join pool, which is also used by parallel streams, so in this specific case, you get the same result more simply with

Stream.<Supplier<PdfPTable>>of(
    ()  -> populateSection1(arguments),
    ()  -> populateSection2(arguments),
    ...
    () -> populateSection10(arguments))
    .parallel()
    .map(Supplier::get)
    .filter(Objects::nonNull)
    .forEachOrdered(el -> populatePdfElement(document, el));

or

List<PdfPTable> results = Stream.<Supplier<PdfPTable>>of(
    ()  -> populateSection1(arguments),
    ()  -> populateSection2(arguments),
    ...
    () -> populateSection10(arguments))
    .parallel()
    .map(Supplier::get)
    .filter(Objects::nonNull)
    .collect(Collectors.toList());

results.forEach(el -> populatePdfElement(document, el));

While both variants ensure that populatePdfElement will be called in the right order and one at a time, only the latter will perform all calls from the initiating thread.
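
If the common pool is not the right place for the work, for example because the tasks block on database calls, supplyAsync also accepts an executor as a second argument. The following is only a sketch under assumptions: it uses a plain JDK fixed thread pool with an arbitrary size of 4, and the suppliers merely report the thread they ran on. In a container, a container-managed executor would take the place of the pool created here.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ExplicitExecutorSketch {

    // Runs two placeholder tasks on the given executor instead of the common pool.
    static List<String> runOn(Executor executor) {
        CompletableFuture<String> first = CompletableFuture.supplyAsync(
                () -> "section 1 on " + Thread.currentThread().getName(), executor);
        CompletableFuture<String> second = CompletableFuture.supplyAsync(
                () -> "section 2 on " + Thread.currentThread().getName(), executor);

        return Stream.of(first, second)
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Pool size 4 is an arbitrary value chosen for illustration.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            runOn(pool).forEach(System.out::println);
        } finally {
            pool.shutdown(); // otherwise the non-daemon pool threads keep the JVM alive
        }
    }
}
```

The parallel stream variant has no comparable parameter for choosing the pool, so an explicit executor is one reason to stay with CompletableFuture.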

Regarding exception handling, you’ll get any exception thrown by a supplier wrapped in a CompletionException when you call CompletableFuture::join. Chaining something like .exceptionally(ex -> { throw new RuntimeException(ex.getCause()); }); makes no sense, because the new RuntimeException will also be wrapped in a CompletionException when you call CompletableFuture::join.
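
The wrapping can be observed with a small sketch. The failing supplier and its message are made up for illustration; the point is only what join() throws in each case.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class JoinExceptionSketch {

    // Hypothetical failing task, standing in for a populateSection… call.
    static String failingSection() {
        throw new IllegalStateException("database unavailable");
    }

    // Plain supplyAsync: join() throws CompletionException, its cause is the original exception.
    static Throwable causeOfFailure() {
        CompletableFuture<String> future =
                CompletableFuture.supplyAsync(JoinExceptionSketch::failingSection);
        try {
            future.join();
            return null; // not reached, the supplier always throws
        } catch (CompletionException e) {
            return e.getCause();
        }
    }

    // With the exceptionally(...) rethrow from the question: join() still throws
    // CompletionException, now with one more layer in between.
    static Throwable causeAfterRethrow() {
        CompletableFuture<String> future = CompletableFuture
                .supplyAsync(JoinExceptionSketch::failingSection)
                .<String>exceptionally(ex -> { throw new RuntimeException(ex.getCause()); });
        try {
            future.join();
            return null; // not reached
        } catch (CompletionException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        System.out.println(causeOfFailure().getClass().getName());
        System.out.println(causeAfterRethrow().getClass().getName());
    }
}
```

So a single try-catch for CompletionException around the join calls is enough, and getCause() gives access to the exception thrown by the supplier.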

In the Stream variant, you’ll get the exception without a wrapper. Since Supplier does not allow checked exceptions, only subtypes of RuntimeException or Error are possible.
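
A comparable sketch for the Stream variant, again with a made-up failing supplier. It only checks the type of the exception that reaches the caller; depending on the thread in which the failure happened, the instance the caller sees may not be the very same object that the supplier threw, so the sketch makes no claim about its message.

```java
import java.util.concurrent.CompletionException;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamExceptionSketch {

    // Returns the exception that the calling thread sees when one supplier fails.
    static RuntimeException failureSeenByCaller() {
        try {
            Stream.<Supplier<String>>of(
                    () -> "ok",
                    () -> { throw new IllegalStateException("database unavailable"); })
                .parallel()
                .map(Supplier::get)
                .collect(Collectors.toList());
            return null; // not reached, the second supplier always throws
        } catch (RuntimeException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        RuntimeException e = failureSeenByCaller();
        System.out.println(e instanceof IllegalStateException);  // prints true
        System.out.println(e instanceof CompletionException);    // prints false
    }
}
```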

The other questions are too broad for the Q&A.
