Is it safe to use parallelStream() to populate a Map in Java 8?

OneMoreError :

I have a list of 1 million objects that I need to load into a Map. To reduce the time this takes, I am planning to use Java 8's parallelStream() like this:

List<Person> list = new LinkedList<>();
Map<String, String> map = new HashMap<>();
list.parallelStream().forEach(person ->{
    map.put(person.getName(), person.getAge());
});

Is it safe to populate a Map like this from parallel threads? Couldn't this cause concurrency issues, with some data getting lost in the Map?

Tunaki :

It is safe to use parallelStream() to collect into a HashMap. However, it is not safe to combine parallelStream() with forEach and a consumer that adds entries to a HashMap.

HashMap is not a synchronized class, and putting elements into it concurrently will not work properly. That is exactly what forEach does here: it invokes the given consumer, which puts elements into the HashMap, from multiple threads, possibly at the same time. Here is some simple code demonstrating the issue:

List<Integer> list = IntStream.range(0, 10000).boxed().collect(Collectors.toList());
Map<Integer, Integer> map = new HashMap<>();
list.parallelStream().forEach(i -> {
    map.put(i, i);
});
System.out.println(list.size());
System.out.println(map.size());

Make sure to run it a couple of times. There's a very good chance (the joy of concurrency) that the printed map size after the operation is not 10000, which is the size of the list, but slightly less.

The solution here is not to use forEach, but to use a mutable reduction with the collect method and the built-in toMap collector:

Map<Integer, Integer> map = list.parallelStream().collect(Collectors.toMap(i -> i, i -> i));

Use that line of code in the sample code above, and the map size will always be 10000. The Stream API ensures that it is safe to collect into a non-thread-safe container, even in parallel. This also means that you don't need toConcurrentMap to be safe: that collector is needed only if you specifically want a ConcurrentMap as the result rather than a general Map. As far as thread safety with collect is concerned, you can use either.
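
Applied to the Person list from the question, the same approach might look like the sketch below. The Person class, the samplePeople helper and the ParallelMapDemo class name are hypothetical stand-ins, since the question does not show them. One assumption worth flagging: names may repeat in real data, and toMap without a merge function throws IllegalStateException on a duplicate key, so the sketch passes a merge function that keeps one of the values.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelMapDemo {

    // Hypothetical stand-in for the Person class from the question.
    static final class Person {
        private final String name;
        private final String age;

        Person(String name, String age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        String getAge() { return age; }
    }

    // Hypothetical helper: builds n people with unique names.
    static List<Person> samplePeople(int n) {
        return IntStream.range(0, n)
                .mapToObj(i -> new Person("person-" + i, String.valueOf(20 + i % 50)))
                .collect(Collectors.toList());
    }

    // Parallel collect into a general Map. The merge function keeps the
    // existing value if two people share a name; without it, toMap
    // throws IllegalStateException on a duplicate key.
    static Map<String, String> toMap(List<Person> people) {
        return people.parallelStream()
                .collect(Collectors.toMap(Person::getName, Person::getAge, (a, b) -> a));
    }

    // Same idea, but the result is specifically a ConcurrentMap.
    // This collector is unordered, so which duplicate wins is not fixed.
    static ConcurrentMap<String, String> toConcurrentMap(List<Person> people) {
        return people.parallelStream()
                .collect(Collectors.toConcurrentMap(Person::getName, Person::getAge, (a, b) -> a));
    }

    public static void main(String[] args) {
        List<Person> people = samplePeople(10000);
        System.out.println(toMap(people).size());           // 10000
        System.out.println(toConcurrentMap(people).size()); // 10000
    }
}
```

Both calls return all 10000 entries on every run, because the names in this sample are unique and the collectors handle the parallel accumulation themselves.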
