Analysis of frequently asked Java Collections Framework interview questions

What are the differences between List, Set, and Map?

  • List (the workhorse for ordered data):  stores an ordered collection of non-unique elements (several elements may reference the same object).
  • Set (cares about uniqueness):  a collection that does not allow duplicates; no two elements ever reference the same object.
  • Map (the expert at lookup by key):  used to store key-value pairs. A Map maintains the association between each key and its value. Two keys may reference the same value object, but keys themselves cannot repeat. Keys are typically of type String, but any object may be used.

What is the difference between ArrayList and LinkedList?

  • 1. Thread safety:  neither ArrayList nor LinkedList is synchronized, so neither guarantees thread safety;

  • 2. Underlying data structure:  ArrayList is backed by an  Object array ; LinkedList is backed by a  doubly linked list  (a circular doubly linked list before JDK 1.6; JDK 1.7 removed the circularity. Note the distinction between a doubly linked list and a circular doubly linked list, described below!)

  • 3. Whether insertion and deletion cost depends on element position:  ①  ArrayList uses array storage, so the cost of inserting and deleting an element depends on its position.  For example, when executing the add(E e) method, ArrayList appends the element at the end of the list by default, with time complexity O(1). But inserting or deleting at a specified position i ( add(int index, E element) ) has time complexity O(n-i), because every element from the i-th onward (n-i elements in total) must be shifted one position backward or forward.  ②  LinkedList uses linked storage, so inserting or deleting with the add(E e) method is roughly O(1) regardless of position; inserting or deleting at a specified position i ( add(int index, E element) ) is approximately O(n), because the list must first be traversed to that position before the insertion happens.

  • 4. Whether fast random access is supported:  LinkedList does not support efficient random access to elements, while ArrayList does. Fast random access means retrieving an element by its index in constant time (corresponding to the get(int index) method).

  • 5. Memory footprint:  ArrayList's wasted space mainly lies in the spare capacity reserved at the end of its backing array, while LinkedList's cost lies in each element consuming more space than in an ArrayList (each node must also store references to its predecessor and successor).
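
The cost difference in point 3 can be seen through the shared List API. A minimal sketch (class and method names are my own): head insertion shifts every existing element in an ArrayList but only relinks nodes in a LinkedList, yet both produce the same result.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListInsertDemo {
    // Inserts 0..count-1, each at index 0 of the given list, and returns it.
    static List<Integer> headInsert(List<Integer> list, int count) {
        for (int i = 0; i < count; i++) {
            list.add(0, i); // O(n) per call for ArrayList, O(1) for LinkedList
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> a = headInsert(new ArrayList<>(), 5);
        List<Integer> l = headInsert(new LinkedList<>(), 5);
        System.out.println(a); // [4, 3, 2, 1, 0]
        System.out.println(l); // same contents, different cost profile
    }
}
```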

 

Supplement: RandomAccess Interface

public interface RandomAccess {
}

Looking at the source, we find that the RandomAccess interface defines nothing at all. So, in my opinion, RandomAccess is just a marker. Marking what? It marks that classes implementing this interface have fast random access capability.

In the binarySearch() method, the JDK checks whether the incoming list is a RandomAccess instance; if so, it calls the indexedBinarySearch() method, otherwise the iteratorBinarySearch() method:

    public static <T>
    int binarySearch(List<? extends Comparable<? super T>> list, T key) {
        if (list instanceof RandomAccess || list.size()<BINARYSEARCH_THRESHOLD)
            return Collections.indexedBinarySearch(list, key);
        else
            return Collections.iteratorBinarySearch(list, key);
    }

ArrayList implements the RandomAccess interface, but LinkedList does not. Why? I think it comes down to the underlying data structures! ArrayList is backed by an array, and LinkedList by a linked list. An array naturally supports random access with time complexity O(1), which is what "fast random access" means. A linked list must be traversed to reach a given position, with time complexity O(n), so it does not support fast random access. ArrayList implements RandomAccess to signal that it offers fast random access. RandomAccess is merely a marker interface: it is not that implementing RandomAccess gives ArrayList its fast random access capability, but the other way round!

To summarize how to choose a traversal strategy for lists:

  • For lists that implement the RandomAccess interface, prefer an ordinary for loop, then foreach;
  • For lists that do not implement RandomAccess, prefer iterator traversal (foreach is implemented on top of an iterator under the hood); with large data sizes, never use an ordinary for loop
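
The selection rule above can be sketched as a helper that picks the traversal based on the RandomAccess marker (class and method names below are hypothetical):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

public class TraversalDemo {
    // Sums the list, choosing the traversal style by the RandomAccess marker.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) { // O(1) get per element
                total += list.get(i);
            }
        } else {
            for (int x : list) { // foreach: iterator under the hood
                total += x;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> array = new ArrayList<>(List.of(1, 2, 3));
        List<Integer> linked = new LinkedList<>(List.of(1, 2, 3));
        System.out.println(sum(array) + " " + sum(linked)); // 6 6
    }
}
```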

Supplement: doubly linked lists and circular doubly linked lists

Doubly linked list:  each node contains two pointers, prev pointing to the previous node and next pointing to the succeeding node.

Circular doubly linked list:  the last node's next points to the head, and the head's prev points to the last node, forming a ring.
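
A minimal sketch of the ring shape, using a hypothetical Node class with prev/next pointers (names are my own, not JDK internals):

```java
public class DoublyLinkedDemo {
    static class Node {
        int value;
        Node prev, next;
        Node(int value) { this.value = value; }
    }

    // Links three nodes into a circular doubly linked list, returns the head.
    static Node circular(Node head, Node mid, Node tail) {
        head.next = mid;  mid.prev = head;
        mid.next = tail;  tail.prev = mid;
        tail.next = head; // ring: the tail's next is the head...
        head.prev = tail; // ...and the head's prev is the tail
        return head;
    }

    public static void main(String[] args) {
        Node head = circular(new Node(1), new Node(2), new Node(3));
        System.out.println(head.prev.value); // 3: head.prev reaches the tail
    }
}
```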


What is the difference between Vector and ArrayList? Why has ArrayList replaced Vector?

 

All methods of the Vector class are synchronized. A Vector object can be accessed safely by two threads, but a single thread accessing it still spends a great deal of time on that synchronization.

ArrayList is not synchronized; when thread safety is not required, ArrayList is recommended.

The difference between HashMap and Hashtable

  1. Thread safety:  HashMap is not thread-safe; Hashtable is, since essentially all of Hashtable's methods are marked synchronized. (If you need thread safety, use ConcurrentHashMap instead!);
  2. Efficiency:  because of the synchronization overhead, HashMap is somewhat more efficient than Hashtable. Moreover, Hashtable is essentially obsolete; avoid it in new code;
  3. Support for null keys and values:  in HashMap, null can be used as a key (at most one such key), and any number of keys may map to null values. Hashtable throws a NullPointerException as soon as a null key or value is put in.
  4. Different initial capacities and growth policies:  ① if no initial capacity is specified at creation, Hashtable defaults to a size of 11 and grows to 2n+1 on each expansion; HashMap defaults to an initial size of 16 and doubles its capacity on each expansion. ② If an initial capacity is specified at creation, Hashtable uses the given size directly, while HashMap expands it to a power of 2 (guaranteed by HashMap's tableSizeFor() method, whose source is given below). That is, HashMap always uses a power of 2 as the size of its hash table; why it must be a power of two is explained later.
  5. Underlying data structure:  since JDK 1.8, HashMap has changed significantly in how it resolves hash collisions: when a chain grows beyond a threshold (8 by default), the linked list is converted into a red-black tree to reduce search time. Hashtable has no such mechanism.
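
Point 3 can be verified directly (the class name below is my own):

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class NullKeyDemo {
    // Returns true if Hashtable throws NullPointerException for a null key.
    static boolean hashtableRejectsNull() {
        try {
            new Hashtable<String, String>().put(null, "x");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put(null, "first");
        map.put(null, "second"); // same null key: the value is overwritten
        map.put("k", null);      // null values are fine too
        System.out.println(map.size());             // 2
        System.out.println(hashtableRejectsNull()); // true
    }
}
```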

The HashMap constructors that take an initial capacity:

    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }
     public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

The following method guarantees that HashMap always uses a power of 2 as the size of its hash table.

    /**
     * Returns a power of two size for the given target capacity.
     */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
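
To check this behaviour outside the JDK class, here is a standalone copy of the rounding logic, assuming MAXIMUM_CAPACITY is 1 << 30 as in HashMap:

```java
public class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Returns the smallest power of two >= cap (clamped to MAXIMUM_CAPACITY).
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16: already a power of two
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

Note the `cap - 1` at the start: without it, an exact power of two like 16 would be rounded up to 32.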

HashMap and HashSet difference

If you look at the HashSet source, you will see that HashSet is implemented on top of HashMap. (The HashSet source is very, very small: apart from clone(), writeObject(), and readObject(), which HashSet must implement itself, every other method simply delegates to a HashMap method.)

| HashMap | HashSet |
| --- | --- |
| Implements the Map interface | Implements the Set interface |
| Stores key-value pairs | Stores only objects |
| Call put() to add an element to the map | Call add() to add an element to the set |
| Uses the key to compute the hashcode | Uses the member object to compute the hashcode; two objects may share a hashcode, so the equals() method decides whether they are actually equal |

How HashSet checks for duplicates

When you add an object to a HashSet, the HashSet first computes the object's hashcode to determine where to place it, and compares that hashcode with those of the objects already added. If no hashcode matches, HashSet assumes the object is not a duplicate. But if an object with the same hashcode is found, the equals() method is called to check whether the two objects are really equal. If they are, the add operation is not allowed to succeed. (Adapted from my Java primer, "Head First Java", 2nd edition)
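
The check described above can be observed with a small value class that overrides both equals() and hashCode() (the Point class below is hypothetical):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class HashSetDedupDemo {
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }
        // equals() is overridden, so hashCode() must be overridden too.
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    // Adds the same logical point twice; returns the resulting set size.
    static int distinctCount() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        boolean addedAgain = set.add(new Point(1, 2)); // false: duplicate
        return addedAgain ? -1 : set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // 1
    }
}
```

Note that add() returns false for the duplicate rather than throwing.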

Rules relating hashCode() and equals():

  1. If two objects are equal, their hashcodes must also be equal
  2. If two objects are equal, calling equals() on either of them returns true
  3. Two objects with the same hashcode are not necessarily equal
  4. Therefore, whenever the equals method is overridden, the hashCode method must be overridden as well
  5. The default hashCode() produces a distinct value for each object on the heap. If hashCode() is not overridden, no two distinct instances of the class will ever compare equal (even if they hold the same data).

The difference between == and equals

  1. == checks whether two variables or instances point to the same memory location; equals checks whether the contents of the memory they point to are the same
  2. == compares memory addresses; equals() compares contents (for strings, the characters)
  3. == asks whether the references are identical; equals() asks whether the values are identical
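
A quick illustration with strings, contrasting objects created via new with interned literals:

```java
public class EqualsDemo {
    public static void main(String[] args) {
        String a = new String("abc"); // new heap object, not the interned literal
        String b = new String("abc");
        System.out.println(a == b);      // false: different objects
        System.out.println(a.equals(b)); // true: same contents

        String c = "abc";
        String d = "abc";
        System.out.println(c == d);      // true: both refer to the interned literal
    }
}
```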

The underlying implementation of HashMap

Before JDK1.8

Before JDK 1.8, the bottom layer of HashMap was  an array plus linked lists  used together, i.e. a  chained hash table . HashMap takes the key's hashCode, runs it through a perturbation function to obtain the hash value, and then uses (n - 1) & hash to determine where the current element is stored (n here being the length of the array). If an element already occupies that position, HashMap checks whether its hash value and key match those of the element being stored; if so, it is overwritten directly, otherwise the collision is resolved by the zipper method.

The so-called perturbation function is simply HashMap's hash method. The perturbation function exists to guard against poorly implemented hashCode() methods: after perturbation, collisions are reduced.

The source of the hash method in JDK 1.8's HashMap:

The JDK 1.8 hash method is more simplified than that of JDK 1.7, but the principle is the same.

    static final int hash(Object key) {
        int h;
        // key.hashCode(): returns the hash value, i.e. the hashcode
        // ^  : bitwise XOR
        // >>>: unsigned right shift (sign bit ignored, vacated bits zero-filled)
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

Compare this with the source of the hash method in JDK 1.7's HashMap:

    static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).

        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

Compared to the JDK 1.8 hash method, the JDK 1.7 version performs slightly worse, since after all it perturbs the hash four times.

The so-called  "zipper method"  (separate chaining) combines linked lists with an array: an array of linked lists is created, each cell of the array being a linked list. When a hash collision occurs, the value is simply appended to the list at the colliding index.

Since JDK 1.8

Compared with earlier versions, JDK 1.8 changed hash-collision resolution significantly: when a chain grows beyond a threshold (8 by default), the linked list is converted into a red-black tree to reduce search time.

TreeMap, TreeSet, and HashMap (since JDK 1.8) all use red-black trees. The red-black tree exists to fix a defect of the binary search tree, which in some cases degenerates into a linear structure.


Why HashMap's length is a power of 2

For HashMap to be accessed efficiently with as few collisions as possible, its data should be distributed as evenly as possible. As mentioned above, hash values range from -2147483648 to 2147483647, roughly 4 billion possible values; as long as the hash function maps reasonably evenly and loosely, collisions are hard to produce in an ordinary application. The problem is that an array 4 billion entries long does not fit in memory, so the hash value cannot be used directly. It must first be reduced modulo the array length, and the remainder is the array index at which to store the entry. This array index is computed as " (n - 1) & hash " (n being the array length). This also explains why HashMap's length is a power of two.

How should this algorithm be designed?

Our first thought might be to use the modulo operator %. But here is the key point: "if the divisor is a power of 2, the modulo (%) operation is equivalent to an AND (&) with the divisor minus one (i.e. hash % length == hash & (length - 1), provided that length is a power of 2)."  And the binary & operation is more efficient than %, which explains why HashMap's length is a power of two.
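
The identity can be checked numerically; the helper below (my own naming) compares the two operations over a range of hash values:

```java
public class IndexDemo {
    // True if hash % length == hash & (length - 1) for all tested hashes.
    static boolean identityHolds(int length) {
        for (int hash = 0; hash < 1_000; hash++) {
            if (hash % length != (hash & (length - 1))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(identityHolds(16)); // true: 16 is a power of two
        System.out.println(identityHolds(12)); // false: 12 is not
    }
}
```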

The infinite-loop problem of HashMap under multi-threaded operation

The main cause is that a concurrent rehash can form a cycle among the elements of a linked list. JDK 1.8 fixed that particular problem, but using HashMap from multiple threads is still discouraged, because it can still cause other problems such as data loss. In concurrent environments, ConcurrentHashMap is recommended.

For details, see: https://coolshell.cn/articles/9606.html

The difference between ConcurrentHashMap and Hashtable

The difference between ConcurrentHashMap and Hashtable lies mainly in how they achieve thread safety.

  • Underlying data structure:  in JDK 1.7, ConcurrentHashMap was implemented as an  array of segments plus linked lists ; in JDK 1.8 it uses the same structure as JDK 1.8's HashMap, an array plus linked lists / red-black trees. Hashtable's underlying structure resembles the pre-JDK-1.8 HashMap, an  array plus linked lists , where the array is the body of the table and the lists exist mainly to resolve hash collisions;
  • Way of achieving thread safety (important):  ①  In JDK 1.7, ConcurrentHashMap used  segment locking : the whole bucket array was divided into segments (Segment), each lock guarding only part of the container's data, so threads accessing data in different segments never contend for the same lock, which raises the concurrency rate. By JDK 1.8 the Segment concept had been abandoned: concurrency is controlled directly over an array of Node entries plus linked lists / red-black trees, using synchronized and CAS operations. (JDK 1.6 and later heavily optimized the synchronized lock.)  The whole thing looks like an optimized, thread-safe HashMap; the Segment data structure is still visible in JDK 1.8, but its role has been simplified, existing only for compatibility with older versions. ②  Hashtable (one lock for everything) : uses synchronized to ensure thread safety, which is very inefficient. While one thread is in a synchronized method, other threads calling synchronized methods cannot proceed; they may block or enter a polling state. For example, while one thread uses put to add an element, no other thread can put another element, nor even call get, so the fiercer the competition, the lower the efficiency.
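
As a usage sketch (not the internals), the class below has several threads update a shared counter through ConcurrentHashMap's atomic merge(), which avoids the lost updates a plain HashMap could suffer:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentDemo {
    // Runs `threads` threads, each incrementing the "hits" counter
    // `perThread` times; returns the final count.
    static long count(int threads, int perThread) {
        Map<String, Long> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counts.merge("hits", 1L, Long::sum); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counts.getOrDefault("hits", 0L);
    }

    public static void main(String[] args) {
        System.out.println(count(4, 10_000)); // 40000: no lost updates
    }
}
```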

A summary of the underlying data structures of the collections framework

Collection

1. List

  • ArrayList:  Object array
  • Vector:  Object array
  • LinkedList:  doubly linked list (a circular linked list before JDK 1.6; JDK 1.7 removed the circularity)

2. Set

  • HashSet (unordered, unique):  based on HashMap; uses a HashMap underneath to store its elements
  • LinkedHashSet:  LinkedHashSet inherits from HashSet and is internally implemented via LinkedHashMap. This is somewhat like what we said earlier about LinkedHashMap being internally based on HashMap, with small differences
  • TreeSet (sorted, unique):  red-black tree (a self-balancing sorted binary tree)

Map

  • HashMap:  before JDK 1.8, HashMap consisted of an array plus linked lists, the array being the body and the lists existing mainly to resolve hash collisions (conflicts resolved by the "zipper method"). Since JDK 1.8 the collision handling has changed significantly: when a chain grows beyond a threshold (8 by default), the list is converted into a red-black tree to reduce search time
  • LinkedHashMap:  LinkedHashMap inherits from HashMap, so its bottom layer is still the chained hash structure of an array plus linked lists or red-black trees. In addition, LinkedHashMap adds a doubly linked list on top of that structure, which preserves the insertion order of keys; access order can also be maintained through the corresponding linked-list operations. Details: "LinkedHashMap detailed source code analysis (JDK1.8)"
  • Hashtable:  array plus linked lists, the array being the body and the lists existing mainly to resolve hash collisions
  • TreeMap:  red-black tree (a self-balancing sorted binary tree)

How to choose a collection?

Choose mainly by the characteristics of the collection. For example, when we need to fetch element values by key, choose a collection implementing the Map interface: TreeMap when sorting is needed, HashMap when it is not, and ConcurrentHashMap when thread safety must be guaranteed. When we only need to store element values, choose a collection implementing the Collection interface: a Set implementation such as TreeSet or HashSet when element uniqueness must be guaranteed, a List implementation such as ArrayList or LinkedList when it need not be, and then pick among the implementations according to their individual characteristics.




Origin blog.csdn.net/qq_41345773/article/details/104989982