Java SE Fundamentals Review (IV): Collections

1 Collections Overview

Java has many collection classes, such as ArrayList, LinkedList, HashMap, and TreeMap. What they have in common is that they hold multiple objects. They behave like containers (Java has no official "container" term, although C++ does call them that), and you can take elements back out whenever you need them, which is very convenient. Since Java 5 introduced generics, these containers can be type-checked at compile time, which makes them safer and easier to use.

Java collection classes derive mainly from two interfaces, Collection and Map, as shown below:

i3JEbd.png

i3JZVA.png

Collection has three main sub-interfaces, Set, Queue, and List, each with multiple implementation classes. The Map interface likewise has many implementations, such as HashMap, EnumMap, and Hashtable. Below I pick a few commonly used collection classes and discuss them in detail.
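
To make the four main interfaces concrete, here is a minimal sketch that creates one instance of each. The class name CollectionOverviewDemo and the method sizes() are made up for illustration; the collection classes themselves are standard JDK classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical demo class; only the java.util types are real JDK API.
class CollectionOverviewDemo {
    // Returns the sizes of a List, a Set, a Queue and a Map after a few operations.
    static int[] sizes() {
        List<String> list = new ArrayList<>();      // ordered, duplicates allowed
        list.add("a");
        list.add("a");

        Set<String> set = new HashSet<>();          // duplicates are ignored
        set.add("a");
        set.add("a");

        Queue<String> queue = new ArrayDeque<>();   // first in, first out
        queue.offer("first");
        queue.offer("second");
        queue.poll();                               // removes "first"

        Map<String, Integer> map = new HashMap<>(); // key-value pairs
        map.put("a", 1);
        map.put("a", 2);                            // same key: value is replaced

        return new int[] { list.size(), set.size(), queue.size(), map.size() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sizes()));
    }
}
```

Declaring the variables with the interface types (List, Set, Queue, Map) keeps the calling code independent of the concrete implementation chosen.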

2 ArrayList and LinkedList

List is arguably the most commonly used collection, even more so than HashMap. In my experience, whenever code needs to store multiple elements, a List is the first thing to reach for. Whether to use ArrayList or LinkedList as the implementation depends on the specific situation.

2.1 ArrayList

ArrayList implements the List interface and extends the abstract class AbstractList. AbstractList already implements most of the methods declared in List, so ArrayList does not have to implement every one of them itself. ArrayList stores its objects in an internal array, which is where the name comes from, and operations such as get and add are all array operations. Here is the source of the add method:

public void add(int index, E element) {
    // First check that index is within a valid range
    rangeCheckForAdd(index);

    // Make sure the array has room for the new element; grow it if it does not
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // Copy the array to shift elements right; elementData is the Object array that stores the elements
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    // Put the element into the array
    elementData[index] = element;
    // Update size
    size++;
}

The comments explain each step. The get method is even simpler, so I won't spend time on it.
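
To see the shifting in isolation, here is a stripped-down sketch of an array-backed list. SimpleArrayList is a made-up class, not the JDK code: the initial capacity of 4 and the doubling growth policy are assumptions of this sketch, and modCount tracking is left out.

```java
import java.util.Arrays;

// Hypothetical class for illustration only; not the JDK implementation.
class SimpleArrayList<E> {
    private Object[] elementData = new Object[4]; // initial capacity is an assumption
    private int size;

    void add(int index, E element) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        // Grow the backing array when it is full (doubling is this sketch's policy).
        if (size == elementData.length) {
            elementData = Arrays.copyOf(elementData, size * 2);
        }
        // Shift the elements in [index, size) one slot to the right.
        System.arraycopy(elementData, index, elementData, index + 1, size - index);
        elementData[index] = element;
        size++;
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        // Random access is a single array read.
        return (E) elementData[index];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleArrayList<String> list = new SimpleArrayList<>();
        list.add(0, "b");
        list.add(0, "a"); // shifts "b" to index 1
        list.add(2, "c"); // appending shifts nothing
        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i));
        }
    }
}
```

The closer the insertion index is to the front, the more elements arraycopy has to move, which is why inserting at the head of a large ArrayList is expensive.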

2.2 LinkedList

LinkedList extends AbstractSequentialList, which in turn extends AbstractList. LinkedList of course implements the List interface, but more interestingly it also implements the Deque interface, so a LinkedList is both a List and a Queue. The following figure shows its inheritance hierarchy:

i3JrqJ.png

LinkedList implements List on top of a linked list, which is its biggest difference from ArrayList. It uses an inner class, Node, to represent a node:

private static class Node<E> {
    E item;
    Node<E> next;
    Node<E> prev;

    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev;
    }
}

This class has a next pointer and a prev pointer, which shows that the list is doubly linked. Now take a look at LinkedList's add method:

public void add(int index, E element) {
    // Check that index is valid
    checkPositionIndex(index);

    // If index equals size we are at the end, so link the new node after the last node
    if (index == size)
        linkLast(element);
    else // Otherwise insert the node before the one currently at index
        linkBefore(element, node(index));
}

The actual list manipulation happens in linkLast() and linkBefore(). linkLast() is fairly simple and linkBefore() is slightly more involved, but anyone who has studied data structures should have no trouble reading them. I won't paste their source here, because this article is not meant to be a source code walkthrough.
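
For readers who want to see the pointer updates anyway, here is a simplified sketch of the two operations. SimpleLinkedList is a made-up class written for this article, not the JDK source: it reuses the method names from the text, but node(index) here always walks from the head, and modCount and removal are left out.

```java
// Hypothetical class for illustration only; a simplified doubly linked list.
class SimpleLinkedList<E> {
    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

    private Node<E> first;
    private Node<E> last;
    private int size;

    void add(int index, E element) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        if (index == size) {
            linkLast(element);
        } else {
            linkBefore(element, node(index));
        }
    }

    // Append after the current last node.
    private void linkLast(E e) {
        Node<E> l = last;
        Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null) {
            first = newNode;   // the list was empty
        } else {
            l.next = newNode;
        }
        size++;
    }

    // Insert before the non-null node succ.
    private void linkBefore(E e, Node<E> succ) {
        Node<E> pred = succ.prev;
        Node<E> newNode = new Node<>(pred, e, succ);
        succ.prev = newNode;
        if (pred == null) {
            first = newNode;   // succ was the head
        } else {
            pred.next = newNode;
        }
        size++;
    }

    // Walk from the head to the node at index (simplification of this sketch).
    private Node<E> node(int index) {
        Node<E> x = first;
        for (int i = 0; i < index; i++) {
            x = x.next;
        }
        return x;
    }

    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return node(index).item;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleLinkedList<String> list = new SimpleLinkedList<>();
        list.add(0, "b");
        list.add(0, "a");
        list.add(2, "c");
        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i));
        }
    }
}
```

Once the target node is known, linking is a constant number of pointer assignments; the cost of a middle insertion comes from node(index) walking the list.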

2.3 Differences between ArrayList and LinkedList

I touched on these points above; to sum up:

  1. ArrayList is backed by an array and LinkedList by a linked list, so their performance characteristics differ. ArrayList offers fast random access, but inserting anywhere other than the end involves an array copy, so insertion is not efficient. LinkedList insertion can be fast or slow: inserting at the tail is very fast because the list keeps a reference to its last node, but inserting at an arbitrary position is slow because the target node has to be located first. For the same reason random access is slow, since it has to walk the nodes from one end of the list.
  2. Their inheritance hierarchies differ slightly: LinkedList also implements the Deque interface, so it offers more functionality.
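
The second point can be seen in a few lines. The sketch below uses only standard java.util classes; the class name ListChoiceDemo and the method dequeOps() are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; LinkedList, ArrayList and List are real JDK types.
class ListChoiceDemo {
    // Uses one LinkedList through both its List methods and its Deque methods.
    static List<String> dequeOps() {
        LinkedList<String> linked = new LinkedList<>();
        linked.add("b");        // List method: append
        linked.addFirst("a");   // Deque method: insert at the head
        linked.addLast("c");    // Deque method: insert at the tail

        List<String> out = new ArrayList<>();
        out.add(linked.get(1));       // List method: positional access, walks the nodes
        out.add(linked.pollFirst());  // Deque method: remove from the head
        out.add(linked.pollLast());   // Deque method: remove from the tail
        return out;
    }

    public static void main(String[] args) {
        System.out.println(dequeOps());
    }
}
```

ArrayList has no addFirst or pollLast in Java 8; if you need queue or stack behavior together with list behavior, LinkedList provides both in one object.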

3 SynchronizedList and Vector

I discuss these two together because both are thread-safe List collections. SynchronizedList is a static nested class inside the Collections utility class; it implements the List interface and extends SynchronizedCollection. Vector is the synchronized List from the early JDK. Its inheritance hierarchy is exactly the same as ArrayList's and it is also array-based, but its methods are synchronized.

3.1 SynchronizedList

SynchronizedList is a package-private static nested class of Collections, so we cannot reference it directly in our own code. The only way to use it is to pass a List to Collections.synchronizedList(), which wraps an ordinary, unsynchronized List in a SynchronizedList. The wrapper is thread-safe, and each of its operations delegates to the original List, as follows:

public void add(int index, E element) {
    synchronized (mutex) {list.add(index, element);}
}
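
A usage sketch of the wrapper follows. The class name SynchronizedListDemo and the method fill() are made up for illustration; Collections.synchronizedList is the real JDK method. Note that the wrapper does not lock during iteration, so the caller has to synchronize on the returned list while iterating, as the JDK documentation requires.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; only Collections.synchronizedList is JDK API.
class SynchronizedListDemo {
    // Several threads append to one wrapped list; returns the final element count.
    static int fill(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each add runs inside synchronized (mutex)
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        // Iteration is not locked by the wrapper, so lock on the list manually.
        int count = 0;
        synchronized (list) {
            for (Integer ignored : list) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 1000));
    }
}
```

With a plain ArrayList in place of the wrapped list, the same code could lose elements or throw, because concurrent add calls would race on the backing array.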

3.2 Vector

Vector has existed since JDK 1.0, which makes it an ancient class. Back then there were no better synchronization tools, so it was the class to use in concurrent scenarios. As Java's concurrency support has advanced and better tools have appeared, Vector has fallen into a semi-abandoned state. Why? Mainly because its synchronization is inefficient and crude: the vast majority of its methods are simply declared synchronized, and not even clone is spared:

public synchronized Object clone() {
    try {
        @SuppressWarnings("unchecked")
            Vector<E> v = (Vector<E>) super.clone();
        v.elementData = Arrays.copyOf(elementData, elementCount);
        v.modCount = 0;
        return v;
    } catch (CloneNotSupportedException e) {
        // this shouldn't happen, since we are Cloneable
        throw new InternalError(e);
    }
}

This does guarantee thread safety, but the efficiency is poor; under heavy contention it may even be slower than single-threaded code. SynchronizedList is somewhat better by comparison, locking only where it has to (although in practice it is still slow). I won't go over Vector's basic add, remove, update, and query operations, which differ little from ArrayList's.

3.3 Differences between SynchronizedList and Vector

Their biggest difference is the efficiency of their synchronization. Vector's approach is so crude that its efficiency is very low. SynchronizedList is less heavy-handed, synchronizing only where necessary, so it does better than Vector. In practice it is still not very fast, because its synchronization is also quite simple and relies on nothing but the built-in lock.

4 HashMap, Hashtable and ConcurrentHashMap

When we want to store key-value pairs or express some kind of mapping, HashMap is the class to use. Hashtable is the synchronized counterpart of HashMap: thread-safe but inefficient. ConcurrentHashMap, introduced in JDK 1.5, is the replacement for Hashtable and is far more efficient, so concurrent code generally uses ConcurrentHashMap rather than Hashtable.

Incidentally, ConcurrentHashMap lives in the JUC package, whose lead author is Doug Lea, the expert who almost single-handedly built up Java's concurrency support.

4.1 HashMap

Internally, HashMap is an array (called table) plus linked lists, which are converted to red-black trees once they reach a threshold. Each array slot holds a linked list of Node objects, defined as follows:

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}

When a key-value pair is inserted, HashMap applies a hash function to the key to obtain a hash value, then uses that value to decide which slot of the table the pair (stored as a Node) belongs in. If a hash collision occurs, that is, the slot already holds a node A, the map walks the linked list starting at A. If it finds a node with the same key, it simply replaces that node's value with the new one and creates no new Node. If it finds none, it creates a new Node and appends it to the end of the list. This is tail insertion, which JDK 1.8 uses; earlier versions inserted at the head.
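
The insertion steps can be sketched in a small hash map with separate chaining. SimpleHashMap is a made-up class for this article, not the JDK source: it has a fixed table of 16 slots, never resizes, and never converts a list to a red-black tree. The hash spreading step (XOR with the high 16 bits) and the index computation follow what JDK 1.8's HashMap does.

```java
import java.util.Objects;

// Hypothetical class for illustration only; no resizing and no red-black trees.
class SimpleHashMap<K, V> {
    private static class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16]; // fixed size in this sketch
    private int size;

    // Mix the high bits of hashCode into the low bits.
    private static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns the previous value for key, or null if there was none.
    V put(K key, V value) {
        int hash = hash(key);
        int index = (table.length - 1) & hash; // slot in the table
        Node<K, V> e = table[index];
        if (e == null) {
            table[index] = new Node<>(hash, key, value, null); // empty slot, no collision
            size++;
            return null;
        }
        while (true) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                V old = e.value;   // same key: replace the value, no new Node
                e.value = value;
                return old;
            }
            if (e.next == null) {
                break;             // reached the tail without finding the key
            }
            e = e.next;
        }
        e.next = new Node<>(hash, key, value, null); // tail insertion
        size++;
        return null;
    }

    V get(Object key) {
        int hash = hash(key);
        for (Node<K, V> e = table[(table.length - 1) & hash]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleHashMap<Integer, String> map = new SimpleHashMap<>();
        map.put(1, "a");
        map.put(17, "b"); // 1 and 17 land in the same slot of a 16-slot table
        map.put(1, "c");  // replaces the value stored for key 1
        System.out.println(map.get(1) + " " + map.get(17) + " " + map.size());
    }
}
```

Keys 1 and 17 are chosen because both map to slot 1 when the table has 16 slots, which forces the collision path.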

For more on HashMap, such as the problems caused by concurrent resizing or the effect of the load factor on performance, I suggest searching online. There are a great many articles; open any developer community and it seems every other post is about HashMap.

4.2 Hashtable

Hashtable's algorithms differ little from HashMap's, and it can be thought of simply as a thread-safe HashMap. Its approach to thread safety is crude, much like Vector's: the vast majority of its methods are simply declared synchronized, as follows:

public synchronized boolean contains(Object value) {
    if (value == null) {
        throw new NullPointerException();
    }

    Entry<?,?> tab[] = table;
    for (int i = tab.length ; i-- > 0 ;) {
        for (Entry<?,?> e = tab[i] ; e != null ; e = e.next) {
            if (e.value.equals(value)) {
                return true;
            }
        }
    }
    return false;
}

As a result its efficiency is very low and it is rarely used; ConcurrentHashMap, covered next, is used instead.

4.3 ConcurrentHashMap

This class lives in the java.util.concurrent package (JUC for short) and is an excellent concurrency utility. Its internal storage structure is almost the same as HashMap's: an array plus linked lists, which are converted to red-black trees once a threshold is reached. The difference is that ConcurrentHashMap is thread-safe, yet it does not crudely mark every method synchronized the way Hashtable does. Up to JDK 1.7 it used a technique called "segment locking": the table is divided into several segments, each with its own lock, so an operation on one segment does not affect the others and multiple threads can work on different segments at the same time. JDK 1.8 dropped the segments and locks at an even finer granularity, using CAS plus synchronized on individual buckets. Either way the result is higher throughput and much better efficiency than Hashtable. The downside is that global values such as size are harder to keep consistent.
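
From the caller's side the locking is invisible; what matters is using the atomic compound methods instead of a separate get followed by put. Here is a minimal sketch: the class name ConcurrentCountDemo and the method count() are made up for illustration, while ConcurrentHashMap.merge is real JDK API (Java 8 and later) and is performed atomically.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; ConcurrentHashMap and merge are JDK API.
class ConcurrentCountDemo {
    // Several threads increment the same counter; returns its final value.
    static int count(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // Atomic read-modify-write; get() followed by put() would race.
                    counts.merge("hits", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(count(4, 1000));
    }
}
```

Individual calls on a ConcurrentHashMap are thread-safe, but a sequence of calls is not automatically atomic, which is why merge, compute, and putIfAbsent exist.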

For more on ConcurrentHashMap, I suggest looking up material yourself; there are many excellent analyses of it online.

4.4 Differences between HashMap, Hashtable and ConcurrentHashMap

The sections above already cover most of this; to summarize:

  1. Hashtable is the synchronized version of HashMap, but its crude synchronization makes it inefficient. ConcurrentHashMap appeared in JDK 1.5 as a replacement for Hashtable. Before that, the options for a thread-safe HashMap were to use Hashtable or to wrap a HashMap with Collections.synchronizedMap, and both are relatively inefficient.
  2. The three are implemented in much the same way, and their internal storage structures are broadly similar.
  3. Hashtable is in a semi-abandoned state and is not recommended for new projects; use ConcurrentHashMap instead.

5 Java 8 Stream enhancements to the collection classes

Apart from lambda expressions, the biggest feature of Java 8 is Stream. The Stream API lets you treat a collection as a stream and the collection's elements as the items flowing through it. With this abstraction, collection operations become simple and clear. For example, to merge two collections you previously had to create a new collection and then iterate over both, adding each element to it. With the Stream API you just treat the two collections as two streams and concatenate them into one.
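
The merge described above can be sketched with Stream.concat, which is real JDK API; the class name MergeDemo and the method merge() are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; Stream.concat and Collectors.toList are JDK API.
class MergeDemo {
    // Concatenates two lists into a new list, keeping the elements of a before those of b.
    static <T> List<T> merge(List<T> a, List<T> b) {
        return Stream.concat(a.stream(), b.stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> merged = merge(Arrays.asList(1, 2), Arrays.asList(3, 4));
        System.out.println(merged);
    }
}
```

Neither input list is modified; the result is a new list built from the concatenated stream.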

The Stream API also provides many higher-order functions for operating on the elements of a stream, such as map, reduce, and filter. Here is an example of the Stream API in use:

public void streamTest() {
    Random random = new Random();
    List<Integer> integers = IntStream.generate(() -> random.nextInt(100))
            .limit(100).boxed()
            .collect(Collectors.toList());

    integers.stream().map(num -> num * 10)
            .filter(num -> num % 2 == 0)
            .forEach(System.out::println);
}

These few lines, which could really be squeezed into three, generate random elements into a list, apply a map operation and a filter operation, and traverse the List along the way. With the old approach you would have to write:

public void originTest() {
    Random random = new Random();
    List<Integer> integers = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
        integers.add(random.nextInt(100));
    }
    for (int i = 0; i < 100; i++) {
        integers.set(i, integers.get(i) * 10); //map
    }
    for (int i = 0; i < 100; i++) {
        if (integers.get(i) % 2 == 0)  //filter
            System.out.println(integers.get(i)); //foreach
    }
}

Three for loops really do look ugly. That is the advantage of the Stream API: concise, convenient, and highly abstract. Readability can suffer, though, and someone seeing lambdas and Stream for the first time may find the code hard going (even though it is actually very simple).

Does that mean every collection operation should use the Stream API from now on? Don't go to extremes. The Stream API is concise, but it can be hard to read and hard to debug, often leaving you to trace the logic by hand, and its performance may be lower or higher than the traditional approach. So my advice is: before adopting it, test and compare the two approaches, then choose based on the results.

6 Summary

Collections are something every Java developer must master, and reading the source code gives a deeper understanding than reading articles about it. This article covers only a few popular collections; many others, such as TreeMap, HashSet, TreeSet, and Stack, are not touched on. That is not because they are unimportant but because space is limited and I cannot cover them one by one. I hope readers will take a serious look at the source of these classes. You don't need to read it from start to finish: pick a few common methods such as get and put and follow them step by step, using a debugger to trace the code.


Origin juejin.im/post/5d6e80905188250d9432b467