Java Concurrent Programming Fundamentals

1. Basic concepts:
○ Synchronization and asynchrony:
Synchronization and asynchrony usually describe a method call. Once a synchronous call starts, the caller must wait until the call returns before continuing any further action. An asynchronous call returns immediately and is more like message passing; the actual work completes later.

○ Concurrency and parallelism:
Strictly speaking, parallel tasks truly run at the same time, while concurrent tasks run alternately. To an external observer, even multiple tasks that execute serially but interleaved create the illusion of parallel execution.
○ Blocking and non-blocking:
Used to describe how multiple threads interact: a blocking call forces the current thread to wait (for example, for a lock held by another thread) before it can continue; a non-blocking call returns without such waiting.

○ Java Memory Model (JMM): Built around the atomicity, visibility, and ordering of multiple threads.

○ The difference between a process and a thread:

○ The life cycle of a thread: NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED.

○Basic operations of threads:

○ The difference between start() and run():
start() creates a new thread and has that new thread execute run();
if run() is called directly, it is just an ordinary method call, executed serially in the current thread.
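A minimal sketch of the difference (the class and field names here are illustrative, not from the original notes):

```java
// StartVsRun: run() executes in the calling thread; start() spawns a new thread.
public class StartVsRun {
    static String runCaller;
    static String startCaller;

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> runCaller = Thread.currentThread().getName());
        t1.run();   // plain method call: executes in the current (main) thread

        Thread t2 = new Thread(() -> startCaller = Thread.currentThread().getName(), "worker");
        t2.start(); // spawns a new thread named "worker" which invokes run()
        t2.join();

        System.out.println(runCaller + " / " + startCaller); // main / worker
    }
}
```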

○ The difference between aborting a thread with stop() and interrupting it:
Generally a thread ends on its own after its task completes and need not be closed manually, but for a server's resident background threads, an external way to stop them must be provided.
stop() terminates the thread immediately and is not recommended, since it can leave shared data in an inconsistent state.
interrupt() merely informs the thread that it should stop; the thread does not exit immediately but decides how and when to respond.
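A minimal cooperative-interruption sketch (class and field names are illustrative). The worker keeps running until it observes the interrupt flag; interrupt() is a request, not a forced kill:

```java
// InterruptDemo: the interrupted thread decides when to exit.
public class InterruptDemo {
    static volatile boolean exited = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10); // a blocking call; interrupt() wakes it with an exception
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore the flag so the loop sees it
                }
            }
            exited = true; // the thread chose to exit; nothing forced it
        });
        worker.start();
        Thread.sleep(50);
        worker.interrupt(); // request, not termination
        worker.join();
        System.out.println("worker exited cleanly: " + exited);
    }
}
```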

○ wait() and notify() must be called inside a corresponding synchronized block: the calling thread must hold the object's monitor.
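A minimal sketch of the monitor rule (class and field names are illustrative). Both wait() and notify() are called while holding the same lock object's monitor; calling them outside synchronized would throw IllegalMonitorStateException:

```java
// WaitNotifyDemo: wait()/notify() under the same monitor.
public class WaitNotifyDemo {
    static final Object lock = new Object();
    static boolean ready = false;
    static int result = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread consumer = new Thread(() -> {
            synchronized (lock) {              // wait() requires holding the monitor
                while (!ready) {
                    try { lock.wait(); } catch (InterruptedException e) { return; }
                }
                result = 42;
            }
        });
        consumer.start();
        synchronized (lock) {                  // notify() likewise requires the monitor
            ready = true;
            lock.notify();
        }
        consumer.join();
        System.out.println(result); // 42
    }
}
```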

○ Waiting for a thread to end (join) and yielding (yield):
For example, a worker thread counts up to 1000. If the main thread prints the count without calling join(), the printed number may be less than 1000; if the main thread calls join() on the worker, it waits until the count reaches 1000 before executing the printing code.
yield() is a static method. Once executed, it makes the current thread give up the CPU and re-compete for CPU resources.
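The counting example above can be sketched like this (names are illustrative):

```java
// JoinDemo: join() makes the main thread wait for the worker to finish counting.
public class JoinDemo {
    static int count = 0; // written only by the worker; read after join()

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            for (int i = 0; i < 1000; i++) count++;
        });
        worker.start();
        worker.join(); // without this line, the print below might see a partial count
        System.out.println(count); // reliably 1000 thanks to join()
    }
}
```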

○ Three ways of using synchronized:
Specify a lock object: synchronized(obj) { ... } locks the given object.
Acting directly on an instance method: equivalent to locking the current instance; before entering the synchronized code, the lock of the current instance must be obtained.
Acting directly on a static method: equivalent to locking the current class object; before entering the synchronized code, the lock of the current class must be obtained.
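The three forms side by side (the class and method names are illustrative):

```java
// SyncForms: the three lock targets of synchronized.
public class SyncForms {
    private final Object mutex = new Object();
    private int n = 0;
    private static int s = 0;

    void addWithBlock() {                  // 1. explicit lock object
        synchronized (mutex) { n++; }
    }
    synchronized void addInstance() {      // 2. instance method: locks `this`
        n++;
    }
    static synchronized void addStatic() { // 3. static method: locks SyncForms.class
        s++;
    }
    int instanceCount() { return n; }
    static int staticCount() { return s; }

    public static void main(String[] args) {
        SyncForms f = new SyncForms();
        f.addWithBlock();
        f.addInstance();
        SyncForms.addStatic();
        System.out.println(f.instanceCount() + " " + SyncForms.staticCount());
    }
}
```

Note that forms 2 and 3 use different locks: two threads may run an instance-synchronized method and a static-synchronized method at the same time without blocking each other.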

○ Reentrant lock: ReentrantLock:
Reentrancy is defined per thread: once a thread has acquired the object's lock, that same thread may acquire the lock again, while other threads may not.
ReentrantLock can be constructed as a fair lock; its Condition objects provide the await(), signal(), and signalAll() methods.
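A small sketch combining the three points above: fair-lock construction, reentrancy, and a Condition (class and field names are illustrative):

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// FairLockDemo: fair ReentrantLock, reentrant acquisition, and Condition signaling.
public class FairLockDemo {
    static final ReentrantLock lock = new ReentrantLock(true); // true = fair lock
    static final Condition notEmpty = lock.newCondition();
    static int value = 0;

    static void reenter() {
        lock.lock();        // first acquisition
        try {
            lock.lock();    // same thread may acquire again: reentrancy
            try { value++; } finally { lock.unlock(); }
        } finally { lock.unlock(); }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                while (value == 0) notEmpty.awaitUninterruptibly(); // releases the lock while waiting
                value *= 10;
            } finally { lock.unlock(); }
        });
        waiter.start();
        reenter();          // value = 1
        lock.lock();
        try { notEmpty.signal(); } finally { lock.unlock(); }
        waiter.join();
        System.out.println(value); // 10
    }
}
```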

ReadWriteLock read-write lock: read/read is not mutually exclusive; read/write is mutually exclusive; write/write is mutually exclusive;

○ Countdown latch: CountDownLatch is implemented with a counter whose initial value is the number of threads. Each time a thread completes its task, the counter is decremented by 1; when the counter reaches 0, all threads have completed their tasks, and the thread waiting on the latch can resume.
The countDown() method is the decrement operation, called from the worker threads.
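A minimal sketch of the pattern just described (names are illustrative):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// LatchDemo: main blocks on await() until 3 workers each call countDown().
public class LatchDemo {
    static final AtomicInteger done = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        int workers = 3;
        CountDownLatch latch = new CountDownLatch(workers); // counter starts at 3

        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                done.incrementAndGet(); // the worker's "task"
                latch.countDown();      // decrement the counter
            }).start();
        }
        latch.await(); // blocks until the counter reaches 0
        System.out.println("all " + done.get() + " workers finished");
    }
}
```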

✦ Thread pool problems:
1. Thread pool: a pool maintains a queue of worker threads, keeping threads in a waiting state on hand. This avoids the extra overhead of repeatedly creating and destroying threads and improves response speed.

2. The architecture of the thread pool:
java.util.concurrent.Executor : the root interface, responsible for executing and scheduling threads
|- ExecutorService sub-interface: the main thread pool interface
|  |- AbstractExecutorService abstract class
|     |- ThreadPoolExecutor: the thread pool implementation class
|- ScheduledExecutorService sub-interface: responsible for scheduled execution
|  |- ScheduledThreadPoolExecutor: extends ThreadPoolExecutor and implements ScheduledExecutorService

3. Tool class: Executors
ExecutorService newFixedThreadPool() : creates a fixed-size thread pool
ExecutorService newCachedThreadPool() : creates a cached thread pool; the number of threads is not fixed and changes automatically with demand
ExecutorService newSingleThreadExecutor() : creates a single-thread pool containing only one thread

ScheduledExecutorService newScheduledThreadPool() : creates a fixed-size thread pool that can execute delayed or periodic tasks

The Executors factory methods actually use ThreadPoolExecutor internally, so it is worth understanding ThreadPoolExecutor:
public ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit,
BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory, RejectedExecutionHandler handler);

corePoolSize: The size of the core pool
maximumPoolSize: The maximum number of threads in the thread pool, which indicates how many threads can be created in the thread pool at most
keepAliveTime: how long an idle thread survives before being terminated. By default, keepAliveTime only takes effect while the pool holds more than corePoolSize threads: any thread whose idle time reaches keepAliveTime is terminated, until the pool is back down to corePoolSize threads.
unit: the time unit of keepAliveTime; TimeUnit defines 7 values, from NANOSECONDS up to DAYS.

Steps after a task is submitted to the thread pool:
a. If the number of threads in the pool is less than corePoolSize, create a new thread to execute the task;
b. If the number of threads is >= corePoolSize, try to add the task to the task buffer queue. If this succeeds, the task waits for an idle thread to take it out and execute it. If it fails (the queue is full), try to create a new non-core thread to execute the task;
c. If the number of threads has reached maximumPoolSize, apply the rejection policy:
1. AbortPolicy: discards the task and throws RejectedExecutionException
2. DiscardPolicy: silently discards the task, throwing no exception
3. DiscardOldestPolicy: discards the oldest task in the buffer queue and tries to resubmit the new task
4. CallerRunsPolicy: runs the overflow task in the submitting thread itself, a feedback mechanism that slows task submission down
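The constructor parameters and a rejection policy in one sketch (the class name, pool sizes, and queue capacity are arbitrary choices for illustration):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// PoolDemo: a hand-built ThreadPoolExecutor with a bounded queue and CallerRunsPolicy.
public class PoolDemo {
    static final AtomicInteger completed = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                              // corePoolSize
                4,                              // maximumPoolSize
                60L, TimeUnit.SECONDS,          // keepAliveTime for threads above core size
                new ArrayBlockingQueue<>(10),   // bounded task buffer queue
                Executors.defaultThreadFactory(),
                new ThreadPoolExecutor.CallerRunsPolicy()); // overflow runs in the caller

        for (int i = 0; i < 20; i++) {
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(completed.get()); // 20: CallerRunsPolicy drops nothing
    }
}
```

Because CallerRunsPolicy pushes overflow work back onto the submitting thread, no task is lost here; with AbortPolicy or DiscardPolicy the final count could be lower.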

✦ LinkedList is thread-unsafe. ConcurrentLinkedQueue is a thread-safe, efficient read-write queue, and the best-performing queue in a high-concurrency environment;

✦CopyOnWriteArrayList class: No lock is required for reading, and writing does not block read operations. Synchronous waits are only required between writes and writes. The so-called CopyOnWrite is to perform a self-copy during the write operation, that is, when the List needs to be modified, I do not modify the original content, but copy the original data once, and write the modified content into the copy. After writing, replace the original data with the modified copy.
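The copy-on-write behavior is easy to observe through an iterator, which reads the snapshot taken when it was created (names are illustrative):

```java
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

// CowDemo: a write copies the backing array; an existing iterator keeps its snapshot.
public class CowDemo {
    static int snapshotSize;
    static int currentSize;

    public static void main(String[] args) {
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        Iterator<String> it = list.iterator(); // iterates over the array as it is right now
        list.add("b");                         // the write replaces the array with a modified copy
        int seen = 0;
        while (it.hasNext()) { it.next(); seen++; }
        snapshotSize = seen;       // 1: the iterator never sees "b"
        currentSize = list.size(); // 2: the list itself has both elements
        System.out.println(snapshotSize + " vs " + currentSize); // 1 vs 2
    }
}
```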

✦ Data sharing channel: BlockingQueue, an interface; its implementations include ArrayBlockingQueue, an array-based bounded queue, and LinkedBlockingQueue, suitable as an unbounded queue. Taking ArrayBlockingQueue as an example, note put() and take(): put() pushes an element onto the tail of the queue, but if the queue is full it waits until a slot frees up. take() removes an element from the head of the queue, but if the queue is empty it waits until one arrives. The underlying implementation mainly uses the await() and signal() methods of ReentrantLock's Condition.
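A minimal put()/take() sketch (class and field names are illustrative): the consumer's take() blocks until the producer's put() supplies elements:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// BlockingQueueDemo: take() blocks on an empty queue; put() blocks on a full one.
public class BlockingQueueDemo {
    static int received = -1;

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2); // bounded: capacity 2

        Thread consumer = new Thread(() -> {
            try {
                int a = queue.take(); // blocks while the queue is empty
                int b = queue.take();
                received = a + b;
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        consumer.start();

        queue.put(1); // would block if the queue were full
        queue.put(2);
        consumer.join();
        System.out.println(received); // 3
    }
}
```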

✦ Lock optimization:
Reduce lock holding time;
Reduce lock granularity: e.g. ConcurrentHashMap.
Lock separation: e.g. LinkedBlockingQueue uses separate locks for take() and put(), since they act on the head and tail of the queue and do not conflict.
Lock coarsening: when a lock is repeatedly requested inside a loop, the lock can be moved outside the loop.
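Lock coarsening can be sketched as follows (names are illustrative; both methods compute the same result, but the coarsened version acquires the lock once instead of once per iteration):

```java
// LockCoarsening: hoisting a per-iteration lock out of the loop.
public class LockCoarsening {
    private final Object lock = new Object();
    private int total = 0;

    // Fine-grained: acquires and releases the lock on every iteration.
    void sumFine(int[] xs) {
        for (int x : xs) {
            synchronized (lock) { total += x; }
        }
    }

    // Coarsened: one acquisition around the whole loop.
    void sumCoarse(int[] xs) {
        synchronized (lock) {
            for (int x : xs) total += x;
        }
    }

    int total() { return total; }
}
```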
The implementation principle of HashMap:
HashMap allows null values and a null key. The class does not guarantee the order of the map; in particular, it does not guarantee that the order stays constant over time. The bottom layer of HashMap is an array, and each slot of the array is a linked list. When a new HashMap is created, the array is initialized.

Entry is a static class containing key and value, i.e. a key-value pair, plus a next pointer to another Entry. So an Entry is an element of the array, and each Entry is a key-value pair that holds a reference to the next element, forming a linked list. (Entry is effectively a singly linked list node.)

On put, if the key already exists, the new value replaces the old value and the method returns the old value; if the key does not exist, the method returns null.

When we put an element into the HashMap, the hash is first recomputed from the key's hashCode, and the element's position (i.e. subscript) in the array is derived from that hash. If other elements are already stored at that array position, the elements at that position are chained as a linked list: the newly added element is placed at the head of the chain, with earlier elements toward the tail. If there is no element at that position, the element is placed directly into the array slot.

Why is the size of the hashmap 2 to the Nth power?
Reference: https://www.cnblogs.com/chenssy/p/3521565.html
Because HashMap computes the insertion position as h & (length - 1): for the computer, the bitwise AND is faster than the % operation. As for length - 1: if length were 15, length - 1 would be 14, binary 1110; under AND, the lowest result bit is always 0 regardless of the hash, so some slots can never be used and hash collisions increase. If length is 16, length - 1 is 15, binary 1111; the AND then preserves the low bits of the hash itself and does not increase the collision probability.
When length = 2^n, the probability of different hash values ​​colliding is relatively small, which will make the data evenly distributed in the table array and the query speed will be faster.
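The masking argument above, worked through in code (the class and method names are illustrative):

```java
// HashIndexDemo: index = h & (length - 1), valid when length is a power of 2.
public class HashIndexDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        // length 16 -> mask 1111: equivalent to h % 16, but cheaper
        System.out.println(indexFor(21, 16) + " == " + (21 % 16)); // 5 == 5
        // length 15 -> mask 1110: the lowest bit is always zeroed, so odd
        // slots (1, 3, 5, ...) can never be hit and collisions increase
        System.out.println(indexFor(5, 15) + " and " + indexFor(4, 15)); // both 4
    }
}
```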

Expansion of HashMap: when is a HashMap resized? When the number of elements exceeds arraySize * loadFactor, the array is expanded. The default loadFactor is 0.75, a compromise value. That is, with the default array size of 16, once the number of elements exceeds 16 * 0.75 = 12, the array is expanded to 2 * 16 = 32, i.e. doubled, and then the position of every element in the array is recomputed, which is a very expensive operation.

Fail-fast mechanism:
We know that java.util.HashMap is not thread-safe, so if another thread modifies the map while an iterator is in use, a ConcurrentModificationException is thrown; this is the so-called fail-fast strategy.
When thread A traverses a collection through an iterator and the collection's contents are changed by another thread, thread A's next access through the iterator throws ConcurrentModificationException, producing a fail-fast event. In the source, this strategy is implemented through the modCount field: as the name implies, modCount is the modification count, and every structural modification to a HashMap (not only HashMap; the same applies to ArrayList and others) increments it; the source contains modCount++ in many operations. When an iterator is created, modCount is copied into the iterator's expectedModCount; during iteration, modCount and expectedModCount are compared, and inequality means another thread has modified the map.
Note that in some older JDK versions modCount was declared volatile to make modifications visible across threads. Even so, the fail-fast mechanism is only an error detection mechanism: it can be used to detect bugs, not to guarantee correctness.
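The mechanism can even be triggered from a single thread, since any structural modification bumps modCount past the iterator's expectedModCount (names are illustrative):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// FailFastDemo: a structural modification invalidates a live iterator.
public class FailFastDemo {
    static boolean caught = false;

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        Iterator<String> it = map.keySet().iterator();
        map.put("c", 3); // structural change: modCount no longer matches expectedModCount
        try {
            it.next();   // the iterator detects the mismatch
        } catch (ConcurrentModificationException e) {
            caught = true;
        }
        System.out.println("fail-fast triggered: " + caught); // true
    }
}
```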
The underlying implementation of HashSet: HashSet's add() method calls put() on an internal HashMap. put() first checks whether the key exists: if it does, the value is updated; if not, the key-value pair is inserted. In a set the value is unused, so there is no notion of updating it. Therefore, when adding an element to a HashSet, the element (i.e. the key) is checked for existence first: if absent it is inserted, if present it is not, which is why a HashSet contains no duplicate values.
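The deduplication behavior, observed through add()'s return value (names are illustrative):

```java
import java.util.HashSet;

// SetDedupDemo: HashSet.add() delegates to an internal HashMap's put().
public class SetDedupDemo {
    static boolean firstAdd;
    static boolean secondAdd;
    static int size;

    public static void main(String[] args) {
        HashSet<String> set = new HashSet<>();
        // The element becomes the map key; the map value is a shared dummy object,
        // so an existing key is never re-inserted.
        firstAdd = set.add("a");   // true: the key was absent
        secondAdd = set.add("a");  // false: the key already exists
        size = set.size();         // still 1
        System.out.println(firstAdd + " " + secondAdd + " size=" + size);
    }
}
```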

The basic differences between Hashtable and HashMap:
1. HashMap allows both key and value to be null, while Hashtable allows neither. When HashMap encounters a null key, it calls the putForNullKey method to handle it (the value is not treated specially); Hashtable throws NullPointerException when it encounters null.
2. Hashtable's methods are synchronized, while HashMap's are not. Looking at the source, almost all public methods of Hashtable are synchronized, and some are implemented internally with synchronized blocks. Hence the common advice: use Hashtable when multi-thread synchronization is involved and HashMap when it is not. There is also the static method Collections.synchronizedMap(), which wraps a Map and returns it as a thread-safe, encapsulated Map object.
Use the put() method to place elements into a Map; use the add() method to place elements into a Set.
LinkedHashMap maintains insertion order or access order by extending HashMap and adding before and after pointers to each entry, together with a header entry, which combine to form a doubly linked list.



Overview of ConcurrentHashMap
Although Hashtable and the wrapper produced by Collections.synchronizedMap(Map<K,V> m) can replace HashMap in concurrent scenarios, both use a single global lock to synchronize concurrent access between threads, so they bring performance problems that cannot be ignored.

Generally speaking, the data structure of ConcurrentHashMap is an array of Segments; the data structure of a Segment is an array of HashEntry; and a HashEntry stores one of our key-value pairs and can link into a list.

In ConcurrentHashMap, both read and write operations achieve high performance: read operations (almost) never need a lock, while a write operation, thanks to the lock segmentation technique, locks only the segment being operated on and does not affect access to the other segments. In particular, in the ideal case, ConcurrentHashMap supports 16 threads writing concurrently (if the concurrency level is set to 16) and any number of threads reading.

Lookup is in fact a double hash: the first hash locates the corresponding Segment. Each Segment contains a table array whose elements are HashEntry objects; a HashEntry is a four-tuple including key, value, and more.
The table is a typical linked-list array and is also volatile, which makes any update to the table immediately visible to other threads.

HashEntry encapsulates a concrete key-value pair and is a typical four-tuple. Like Entry in HashMap, HashEntry includes the same four fields: key, hash, value, and next. The difference is that in HashEntry the key, hash, and next fields are declared final and the value field is volatile, so a HashEntry object is nearly immutable, which is one important reason why ConcurrentHashMap read operations need no lock. Declaring next final itself means that nodes cannot be added or removed from the middle or tail of a hash chain, because that would require modifying a next reference; all node modifications can therefore only start from the head. A put can always add to the head of the hash chain. A remove, however, may need to delete a node from the middle: this requires copying (re-creating) all nodes in front of the node being deleted, with the last copied node pointing to the deleted node's successor.
Since the value field is volatile, a reading thread is guaranteed to see the latest value, which is another important reason why ConcurrentHashMap read operations need no lock. A HashEntry represents one node in a hash chain.
ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel)
This constructor builds an empty ConcurrentHashMap with the specified capacity, the specified load factor, and the specified number of segments / concurrency level (if not a power of 2, it is rounded up to one).
ConcurrentHashMap differs from HashMap in that it allows neither the key nor the value to be null. In addition, we can see that a put on ConcurrentHashMap is actually delegated to a specific Segment: when we put a key/value pair into ConcurrentHashMap, the key's hash is obtained and re-hashed, and the final hash value determines which segment the record is inserted into.
The put operation on a Segment is done under a lock. As noted earlier, Segment is a subclass of ReentrantLock, so a Segment is itself a reentrant lock, and its inherited lock() and unlock() methods can be called directly to lock and unlock around the update code.
As mentioned above, before put inserts a key/value pair, it first checks whether the insertion would push the segment's node count over the threshold; if so, the segment is expanded and re-hashed first. Note that ConcurrentHashMap's re-hash operates on a single segment of the map, so the bucket positions contained in different segments naturally differ.

In general, the secret of lock-free read operations in ConcurrentHashMap lies in the following three points:
• the immutability of HashEntry objects reduces the need for locking during reads;
• volatile variables coordinate memory visibility between reading and writing threads;
• if instruction reordering is detected during a read, lock and re-read.
Some operations in ConcurrentHashMap need to involve multiple segments, for example the size() method. Its main idea is to first sum the sizes of all segments without locking, retrying this lock-free summation at most RETRIES_BEFORE_LOCK times (2 by default), because other threads may structurally update already-visited segments during the traversal. If the lock-free attempts are exhausted without a consistent result, it sums all segment sizes while holding every segment's lock. In practice, the probability that the accumulated counts change mid-count is very small, so ConcurrentHashMap first tries RETRIES_BEFORE_LOCK lock-free passes over the per-segment counts, and only falls back to locking all segments when the container's count keeps changing.
So how does ConcurrentHashMap determine that a segment has been structurally updated during counting? As we saw earlier, Segment contains a modCount member variable, and every operation that structurally changes the segment (put, remove, and clean operations) increments modCount by 1. The JDK therefore only needs to compare modCount before and after counting the sizes to know whether the size of the container has changed.
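As a usage-level illustration of the concurrency guarantees described above (names are illustrative; note that the segment-based design discussed here is the pre-Java-8 implementation, while the external behavior shown below holds in current JDKs too):

```java
import java.util.concurrent.ConcurrentHashMap;

// ConcurrentMapDemo: several writers update one key with an atomic read-modify-write.
public class ConcurrentMapDemo {
    static final ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

    public static void main(String[] args) throws InterruptedException {
        // Unlike HashMap, null keys and null values are both rejected here.
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    map.merge("counter", 1, Integer::sum); // atomic increment of the mapping
                }
            });
            writers[i].start();
        }
        for (Thread t : writers) t.join();
        System.out.println(map.get("counter")); // 4000: no updates are lost
    }
}
```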
