When the Alibaba interviewer asked me about HashMap, I knew my chance had come

In day-to-day back-end development, collections are among the most frequently used tools, and HashMap is a reliable helper for business logic. For the same reason, its underlying implementation and principles come up constantly in interviews.

I have summarized the classic HashMap questions asked in Java interviews at major companies, and I hope they help you!


1. What is the data structure of HashMap?

A: It is a hash table implemented with separate chaining (array + linked list), combining the advantages of arrays and linked lists. Since JDK 1.8, when the length of a linked list exceeds 8 (and the array has at least 64 buckets), the linked list is converted to a red-black tree.

transient Node<K,V>[] table;

2. How does HashMap work?

The bottom layer of HashMap is a hash array plus singly linked lists. Each slot of the array holds the head of a linked list, whose nodes are instances of the internal class Node (which implements the Map.Entry interface). Data is stored and retrieved through the put() and get() methods.

When storing an object, pass the K/V pair to the put() method:

① Call the hash(K) method to calculate the hash value of K, then combine it with the length of the array to calculate the array subscript;

② Adjust the size of the array if necessary (when the number of elements in the container is greater than capacity * loadFactor, the container resizes to twice its length, 2n);

③ i. If the bucket at that subscript is empty, the node is inserted directly; if it is occupied, a collision has occurred;

ii. If a node in the bucket has the same hash value as K and equals() returns true for the two keys, the value of that key-value pair is updated;

iii. If the hash values collide but equals() returns false for every key in the bucket, the new node is inserted at the end of the linked list (tail insertion) or into the red-black tree (using the tree's own insertion). (Head insertion was used in JDK 1.7 and earlier; tail insertion is used in JDK 1.8.) (Note: when collisions make the linked list longer than TREEIFY_THRESHOLD = 8, the linked list is converted to a red-black tree.)

When getting an object, pass K to the get() method:

① Call the hash(K) method (calculate the hash value of K) to obtain the array subscript of the linked list where the key is located;

② Traverse the linked list sequentially, using the equals() method to find the node whose key equals K, and return its V.

In short, hashCode() locates where the entry is stored; equals() decides whether two keys are actually equal.
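The put and get steps above can be sketched as a tiny chained map. This is only an illustration under simplifying assumptions: the class name TinyChainedMap is hypothetical, the table is fixed at 16 buckets, and resizing and treeification are left out entirely, so it is not how java.util.HashMap is actually written.

```java
import java.util.Objects;

// Simplified sketch: fixed 16 buckets, no resize, no red-black trees.
class TinyChainedMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // Spread the high 16 bits into the low 16 bits, as described in the article.
    private static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int h = hash(key);
        int i = (table.length - 1) & h;          // array subscript
        Node<K, V> node = table[i];
        if (node == null) {                      // empty bucket: insert directly
            table[i] = new Node<>(h, key, value);
            return null;
        }
        while (true) {
            if (node.hash == h && Objects.equals(node.key, key)) {
                V old = node.value;              // same key: update the value
                node.value = value;
                return old;
            }
            if (node.next == null) {             // end of chain: tail insertion
                node.next = new Node<>(h, key, value);
                return null;
            }
            node = node.next;
        }
    }

    V get(Object key) {
        int h = hash(key);
        for (Node<K, V> n = table[(table.length - 1) & h]; n != null; n = n.next) {
            if (n.hash == h && Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyChainedMap<String, Integer> m = new TinyChainedMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.put("a", 3);                           // existing key: value replaced
        System.out.println(m.get("a") + " " + m.get("b") + " " + m.get("zz")); // 3 2 null
    }
}
```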

3. What happens when the hashCode of two objects is the same?

Two objects with the same hashCode are not necessarily equal (that is decided by the equals method). Because their hashCode is the same, they map to the same array subscript, and a "collision" occurs. Since HashMap stores the objects of one bucket in a linked list, the new Node is simply added to that linked list.
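A small check makes this concrete. The strings "Aa" and "BB" are a well-known pair with the same hashCode(); the class name SameHashCodeDemo is just a name chosen for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class SameHashCodeDemo {
    public static void main(String[] args) {
        // Equal hash codes, but the keys are not equal.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println("Aa".equals("BB"));                  // false

        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);   // collides with "Aa" but is stored as a separate node

        System.out.println(map.size());    // 2
        System.out.println(map.get("Aa")); // 1
        System.out.println(map.get("BB")); // 2
    }
}
```

Both entries survive the collision because get() compares keys with equals() after locating the bucket.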

4. Do you know how hash() is implemented? Why is it implemented this way?

In JDK 1.8, hash() XORs the high 16 bits of hashCode() into the low 16 bits: (h = k.hashCode()) ^ (h >>> 16). The design balances speed, efficiency and quality: one shift and one XOR keep the overhead low, and because the high bits now take part in the subscript calculation, keys whose hash codes differ only in the high bits no longer collide when the array is small.

5. Why use the XOR operator?

XOR guarantees that as long as one bit of the object's 32-bit hashCode changes, the return value of hash() changes as well, which keeps collisions to a minimum.
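To see the effect of the spreading step, here is a minimal sketch. The helper names spread and index are chosen for this example only; the sample values 0x10000 and 0x20000 are picked because they differ only in their high bits.

```java
class HashSpreadDemo {
    // The spreading step quoted above: fold the high 16 bits into the low 16 bits.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Array subscript for a table whose length is a power of two.
    static int index(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int a = 0x10000;
        int b = 0x20000;

        // Without spreading, only the low 4 bits are used and both land in bucket 0.
        System.out.println(index(a, n) + " " + index(b, n));                 // 0 0

        // With spreading, the high bits influence the subscript.
        System.out.println(index(spread(a), n) + " " + index(spread(b), n)); // 1 2
    }
}
```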

6. How is the capacity of the HashMap table determined? What is loadFactor? How does the capacity change? What problems does this change bring?

① The size of the table array is determined by the capacity parameter. The default is 16; a value can also be passed in at construction (it is rounded up to a power of two), and the maximum is 1<<30;

② loadFactor is the load factor, whose main purpose is to decide whether the table array needs to be expanded dynamically. The default value is 0.75. For example, when the table array size is 16 and the load factor is 0.75, the threshold is 12, and when the number of entries stored in the table exceeds 12, the table needs dynamic expansion;

③ When expanding, the resize() method is called to double the length of the table (note: the length of the table, not the threshold);

④ If the amount of data is very large, expansion causes a performance loss. In places with high performance requirements, this loss may be fatal.
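The arithmetic above can be worked through in a few lines. This is a sketch, not JDK source: tableSizeFor follows the usual "round up to a power of two" idea, and capacityFor is a hypothetical helper showing one way to pick an initial capacity that avoids expansion for a known number of entries.

```java
class CapacityDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two that is >= cap, capped at MAXIMUM_CAPACITY.
    static int tableSizeFor(int cap) {
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Number of entries after which the table is expanded.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Hypothetical helper: a capacity large enough that `expected` entries
    // never exceed the threshold, so no resize happens while filling the map.
    static int capacityFor(int expected, float loadFactor) {
        return tableSizeFor((int) Math.ceil(expected / (double) loadFactor));
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(17));         // 32
        System.out.println(threshold(16, 0.75f));     // 12
        System.out.println(capacityFor(1000, 0.75f)); // 2048
    }
}
```

When the final size is known in advance, passing such a capacity to the HashMap constructor is a common way to sidestep the expansion cost mentioned in ④.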

7. What is the process of the put method in HashMap?

Answer: Call the hash function to obtain the hash value of the key, then calculate the array subscript;

If there is no hash conflict, put the node directly into the array; if there is a hash conflict, append it to the end of the linked list in that bucket;

If the length of the linked list exceeds the threshold (TREEIFY_THRESHOLD = 8), the linked list is converted to a red-black tree; when a tree shrinks to 6 nodes (UNTREEIFY_THRESHOLD) or fewer during a resize, it is converted back to a linked list;

If the key of the node already exists, just replace its value;

If the number of key-value pairs in the collection exceeds the threshold (12 with the default capacity and load factor), call the resize method to expand the array.

8. What is the process of array expansion?

Create a new array with twice the capacity of the old array, and recalculate the storage location of each node in the old array. A node can end up in only two positions in the new array: its original subscript, or its original subscript + the size of the old array.
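The "only two positions" rule follows from the fact that doubling the array adds exactly one more bit to the subscript mask. A minimal sketch, with newIndex as a helper name chosen for this example:

```java
class ResizeSplitDemo {
    // New subscript after the table grows from oldCap to 2 * oldCap.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        // The only extra bit examined after doubling is the oldCap bit:
        // 0 means the node stays put, 1 means it moves by oldCap.
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // All four hashes share old subscript 5.
        for (int hash : new int[] {5, 21, 37, 53}) {
            System.out.println(hash + ": " + (hash & (oldCap - 1)) + " -> " + newIndex(hash, oldCap));
        }
        // 5: 5 -> 5
        // 21: 5 -> 21
        // 37: 5 -> 5
        // 53: 5 -> 21
    }
}
```

The result always agrees with computing hash & (2 * oldCap - 1) from scratch, so no hash needs to be recomputed during expansion.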

9. Separate chaining can produce excessively deep linked lists. Why use a red-black tree rather than a plain binary search tree? And why not always use red-black trees?

The red-black tree is chosen to avoid the weakness of the binary search tree. In special circumstances (for example, keys inserted in sorted order), a binary search tree degenerates into a linear structure, which has the same depth problem as the original linked list, and lookups become very slow.

A red-black tree is an approximately balanced binary search tree, so it finds data quickly and solves the query-depth problem of the linked list. Maintaining "balance" has a price: after inserting new data the tree may need left rotations, right rotations and recoloring. For a long list this costs less than traversing the list linearly, so the red-black tree is used when the length is greater than 8. If the linked list is very short, there is no need to introduce a red-black tree at all; doing so would only make things slower.
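The degeneration of a plain binary search tree is easy to reproduce. The sketch below uses a deliberately naive, unbalanced BST (the class and method names are made up for this example) and compares its height with the known upper bound 2 * log2(n + 1) on the height of a red-black tree holding n keys.

```java
class DegenerateBstDemo {
    private static final class Node {
        final int key;
        Node left, right;

        Node(int key) {
            this.key = key;
        }
    }

    // Inserts the keys into an unbalanced BST and returns the tree height
    // (number of nodes on the longest root-to-leaf path).
    static int heightAfterInserting(int[] keys) {
        Node root = null;
        int height = 0;
        for (int key : keys) {
            int depth = 1;
            if (root == null) {
                root = new Node(key);
            } else {
                Node cur = root;
                while (true) {
                    depth++;
                    if (key < cur.key) {
                        if (cur.left == null) { cur.left = new Node(key); break; }
                        cur = cur.left;
                    } else {
                        if (cur.right == null) { cur.right = new Node(key); break; }
                        cur = cur.right;
                    }
                }
            }
            height = Math.max(height, depth);
        }
        return height;
    }

    // Upper bound on the height of a red-black tree with n keys.
    static int redBlackHeightBound(int n) {
        return (int) Math.floor(2 * (Math.log(n + 1) / Math.log(2)));
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) sorted[i] = i + 1;

        System.out.println(heightAfterInserting(sorted)); // 1000: the tree is a linked list
        System.out.println(redBlackHeightBound(n));       // 19
    }
}
```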

10. What do you know about red-black trees?

Each node is either red or black

The root node is always black

If a node is red, its child nodes must be black (not necessarily vice versa)

Each leaf node is a black empty node (NIL node)

Every path from a node down to its descendant leaf (NIL) nodes must contain the same number of black nodes (that is, the same black height)
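These properties can be turned into a small checker. The sketch below is an assumption-laden illustration: the Node class is hand-built and immutable, null stands for the black NIL leaf, and the checker verifies only the color rules above, not the binary-search ordering of the keys.

```java
class RedBlackCheckDemo {
    static final class Node {
        final int key;
        final boolean red;
        final Node left, right;

        Node(int key, boolean red, Node left, Node right) {
            this.key = key;
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the black height of the subtree (a null NIL leaf counts as 1),
    // or -1 if a red node has a red child or the black heights differ.
    static int blackHeight(Node n) {
        if (n == null) return 1;
        boolean redChild = (n.left != null && n.left.red) || (n.right != null && n.right.red);
        if (n.red && redChild) return -1;
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l < 0 || r < 0 || l != r) return -1;
        return l + (n.red ? 0 : 1);
    }

    // Checks: black root, no red node with a red child, equal black heights.
    static boolean isValid(Node root) {
        return (root == null || !root.red) && blackHeight(root) > 0;
    }

    public static void main(String[] args) {
        Node ok = new Node(2, false,
                new Node(1, true, null, null),
                new Node(3, true, null, null));
        Node redRoot = new Node(2, true, null, null);
        Node unevenBlack = new Node(2, false, new Node(1, false, null, null), null);

        System.out.println(isValid(ok));          // true
        System.out.println(isValid(redRoot));     // false: root must be black
        System.out.println(isValid(unevenBlack)); // false: black heights differ
    }
}
```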

11. What changes were made to HashMap in JDK 8?

In Java 1.8, if the length of a linked list exceeds 8, the linked list is converted to a red-black tree. (The number of buckets must be at least 64; when it is less than 64, the table is only expanded.)

When a hash collision occurs, Java 1.7 inserts the new node at the head of the linked list, while Java 1.8 inserts it at the tail.

In Java 1.8, Entry was replaced by Node (essentially the same thing under a new name).

12. What is the difference between HashMap, LinkedHashMap and TreeMap?

HashMap makes no guarantee about the order in which records are traversed;

LinkedHashMap preserves the insertion order of records. When traversing with an Iterator, the record inserted first is returned first. Its traversal time depends on the number of entries rather than on the table capacity, at the cost of the extra links it maintains for every entry;

TreeMap implements the SortedMap interface and keeps its records sorted by key (ascending by default; a custom Comparator can also be specified).

13. What are the usage scenarios of HashMap, TreeMap and LinkedHashMap?

In general, the most used is HashMap.

HashMap: for inserting, deleting and locating elements in a Map when order does not matter;

TreeMap: when the keys need to be traversed in natural order or a custom order;

LinkedHashMap: when the order of output must be the same as the order of input.
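The ordering differences are easy to observe. In this sketch the helper keysAfterInserts is a made-up name; it fills any Map with the same three keys and returns them in iteration order. HashMap is left out of the printed results on purpose, since its iteration order is unspecified.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrderDemo {
    // Inserts the same keys into the given map and returns them in iteration order.
    static List<String> keysAfterInserts(Map<String, Integer> map) {
        map.put("banana", 2);
        map.put("apple", 1);
        map.put("cherry", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Insertion order is preserved.
        System.out.println(keysAfterInserts(new LinkedHashMap<String, Integer>()));
        // [banana, apple, cherry]

        // Natural (ascending) key order.
        System.out.println(keysAfterInserts(new TreeMap<String, Integer>()));
        // [apple, banana, cherry]

        // Custom comparator: descending key order.
        System.out.println(keysAfterInserts(
                new TreeMap<String, Integer>(Comparator.<String>reverseOrder())));
        // [cherry, banana, apple]
    }
}
```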

14. What is the difference between HashMap and Hashtable?

① HashMap is not thread-safe; Hashtable is thread-safe;

② Because of the cost of thread safety, Hashtable is less efficient than HashMap;

③ HashMap allows one record with a null key and any number of records with null values; Hashtable allows neither null keys nor null values;

④ The default initial array size of HashMap is 16, and of Hashtable 11. When expanding, the former doubles its capacity (2n), the latter grows to 2n + 1;

⑤ HashMap recalculates the hash value (spreading the high bits), while Hashtable uses the object's hashCode directly.
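Point ③ can be verified directly. The helper names acceptsNullKey and acceptsNullValue are invented for this sketch; they simply report whether a put succeeds or throws NullPointerException.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "v");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    static boolean acceptsNullValue(Map<String, String> map) {
        try {
            map.put("k", null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<String, String>()));     // true
        System.out.println(acceptsNullValue(new HashMap<String, String>()));   // true
        System.out.println(acceptsNullKey(new Hashtable<String, String>()));   // false
        System.out.println(acceptsNullValue(new Hashtable<String, String>())); // false
    }
}
```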

15. Which other thread-safe class in Java is very similar to HashMap? How does it differ from Hashtable in terms of thread synchronization?

The ConcurrentHashMap class (a thread-safe and efficient HashMap implementation provided in the Java concurrency package java.util.concurrent).

Hashtable locks with the synchronized keyword (that is, it locks the whole object);

ConcurrentHashMap uses segmented locking in JDK 1.7, and CAS (a lock-free algorithm) + synchronized in JDK 1.8.

16. What is the difference between HashMap and ConcurrentHashMap?

Apart from locking, there is not much difference in principle. In addition, HashMap allows null keys and values, but ConcurrentHashMap does not.

17. Why is ConcurrentHashMap more efficient than Hashtable?

Hashtable uses a single lock (locking the entire table) to handle concurrency. Multiple threads compete for that one lock, so they block each other easily;

ConcurrentHashMap:

In JDK 1.7 it uses segmented locks (ReentrantLock + Segment + HashEntry), which is equivalent to dividing one HashMap into multiple segments and assigning a lock to each segment, so several threads can access different segments at the same time. Lock granularity: a Segment, which contains multiple HashEntry objects.

In JDK 1.8 it uses CAS + synchronized + Node + red-black tree. Lock granularity: the first Node of a bucket (Node implements Map.Entry). The lock granularity is reduced.
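From the caller's side, the benefit shows up as atomic per-key updates without any external locking. A minimal sketch, assuming a made-up helper countWith that lets several threads increment the same counter through ConcurrentHashMap.merge():

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCountDemo {
    // Starts `threads` workers that each increment the same key `perThread` times.
    static int countWith(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(countWith(8, 10_000)); // 80000
    }
}
```

The same loop written against a plain HashMap with get() followed by put() could lose updates, because those two calls are not atomic together.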

18. Can you analyze the ConcurrentHashMap lock mechanism in detail (JDK 1.7 vs JDK 1.8)?

In JDK 1.7, a segmented lock mechanism is adopted to implement concurrent update operations. The bottom layer uses an array + linked list storage structure, including two core static internal classes, Segment and HashEntry.

① Segment inherits ReentrantLock (reentrant lock) to act as a lock; each Segment object guards several buckets of the hash map;

② HashEntry is used to encapsulate the key-value pairs of the mapping table;

③ Each bucket is a linked list made of several HashEntry objects.

In JDK 1.8, Node + CAS + synchronized is used to ensure concurrency safety. The Segment class is dropped and the table array stores key-value pairs directly; when the length of the linked list composed of Node objects exceeds TREEIFY_THRESHOLD, the linked list is converted to a red-black tree to improve performance. The bottom layer is changed to array + linked list + red-black tree.

19. In JDK 1.8, why does ConcurrentHashMap use the built-in lock synchronized instead of the reentrant lock ReentrantLock?

① The lock granularity is reduced;

② The JVM development team has never given up on synchronized, and JVM-level optimization of synchronized has more room and comes more naturally;

③ Under heavy data operations, the API-based ReentrantLock costs more memory, which adds to the memory pressure on the JVM.

20. Can you give a brief introduction to ConcurrentHashMap?

① Important field:

private transient volatile int sizeCtl;

When it is negative, -1 means the table is being initialized, and -N means N-1 threads are expanding it;

When it is 0, the table has not been initialized;

When it is any other positive number, it indicates the size for initialization or for the next expansion.

② Data structure:

Node is the basic unit of the storage structure. It implements Map.Entry and is used to store data;

TreeNode inherits Node, but its data structure is a binary tree; it is the storage structure of the red-black tree, used to store the data held in the tree;

TreeBin is a container that wraps TreeNode and provides the conditions and lock control for red-black tree conversion.

③ When storing objects (put() method):

If the table is not initialized, call the initTable() method to initialize it;

If there is no hash conflict (the target bucket is empty), insert directly with a lock-free CAS;

If an expansion is in progress at that bucket, help with the expansion first;

If there is a hash conflict, the head node of the bucket is locked to ensure thread safety. There are two cases: for a linked list, traverse to the end and insert; for a red-black tree, insert according to the red-black tree structure;

If the number of nodes in the linked list reaches the threshold 8, the list is first converted into a red-black tree structure, and break then exits the loop;

If the addition is successful, call the addCount() method to update the size and check whether expansion is needed.
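The core of this flow, CAS for an empty bucket and a lock on the head node otherwise, can be sketched as follows. This is a heavily simplified illustration and not the JDK implementation: the class CasBucketMap is hypothetical, the table is a fixed 16-slot AtomicReferenceArray, null keys are not handled, and initialization, expansion, treeification and size counting are all omitted.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Sketch only: fixed table, no resize, no trees, no removal.
class CasBucketMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        volatile V value;
        volatile Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    private final AtomicReferenceArray<Node<K, V>> table = new AtomicReferenceArray<>(16);

    private static int spread(int h) {
        return (h ^ (h >>> 16)) & 0x7fffffff;
    }

    public V put(K key, V value) {
        int h = spread(key.hashCode());
        int i = (table.length() - 1) & h;
        for (;;) {
            Node<K, V> head = table.get(i);
            if (head == null) {
                // Empty bucket: lock-free CAS. If it fails, another thread won; retry.
                if (table.compareAndSet(i, null, new Node<>(h, key, value))) {
                    return null;
                }
            } else {
                synchronized (head) {            // lock only this bucket's first node
                    if (table.get(i) != head) {
                        continue;                // head changed meanwhile; retry
                    }
                    for (Node<K, V> n = head; ; n = n.next) {
                        if (n.hash == h && n.key.equals(key)) {
                            V old = n.value;     // existing key: replace value
                            n.value = value;
                            return old;
                        }
                        if (n.next == null) {    // end of list: append
                            n.next = new Node<>(h, key, value);
                            return null;
                        }
                    }
                }
            }
        }
    }

    // Reads take no lock; they rely on the volatile fields.
    public V get(K key) {
        int h = spread(key.hashCode());
        for (Node<K, V> n = table.get((table.length() - 1) & h); n != null; n = n.next) {
            if (n.hash == h && n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        CasBucketMap<Integer, Integer> map = new CasBucketMap<>();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            final int base = t * 1000;
            workers[t] = new Thread(() -> {
                for (int k = base; k < base + 1000; k++) map.put(k, k * 2);
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();

        int found = 0;
        for (int k = 0; k < 4000; k++) {
            if (Integer.valueOf(k * 2).equals(map.get(k))) found++;
        }
        System.out.println(found); // 4000
    }
}
```

Threads writing to different buckets never contend for the same lock, which is the granularity improvement the article describes.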

④ Expansion method transfer(): the default capacity is 16, and each expansion doubles the capacity.

helpTransfer(): lets other threads that run into an ongoing expansion help move the data, which makes the expansion more efficient.

⑤ When obtaining an object (get() method):

Calculate the hash value, locate the table index position, and return the value if the first node matches;

If an expansion is in progress at that position, call the find() method of the ForwardingNode that marks the bucket as being moved, look up the node through it, and return the value if it matches;

If neither of the above applies, traverse the following nodes and return the value when a match is found; otherwise return null at the end.

21. What is the concurrency level of ConcurrentHashMap?

It is the maximum number of threads that can update a ConcurrentHashMap at the same time without lock contention while the program is running (in JDK 1.7, the number of segments). The default is 16, and it can be set in the constructor. In JDK 1.8 the constructor parameter is kept, but it is only used as a sizing hint.

When the user sets the concurrency level, ConcurrentHashMap uses the smallest power of 2 greater than or equal to that value as the actual concurrency level (if the user sets it to 17, the actual concurrency level is 32).
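The rounding rule is a simple shift loop. The sketch below only illustrates that rule; the method name segmentCount is made up, and the upper limit that a real implementation applies to the number of segments is ignored here.

```java
class ConcurrencyLevelDemo {
    // Smallest power of two that is >= the requested concurrency level.
    static int segmentCount(int concurrencyLevel) {
        int size = 1;
        while (size < concurrencyLevel) {
            size <<= 1;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(segmentCount(16)); // 16
        System.out.println(segmentCount(17)); // 32
        System.out.println(segmentCount(1));  // 1
    }
}
```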


That is all for this article. I hope it is helpful to everyone, and I wish you an early promotion and a raise!
