"Art Java concurrent programming" study notes (b)

Chapter 2: The Underlying Implementation Principles of Java Concurrency Mechanisms

        After Java code is compiled into bytecode, the bytecode is loaded into the JVM by the class loader. The JVM executes the bytecode, which is ultimately translated into assembly instructions that run on the CPU. The concurrency mechanisms used in Java therefore depend on both the JVM implementation and CPU instructions.

2.1 Applications of volatile

        Both synchronized and volatile play important roles in concurrent programming. volatile is a lightweight form of synchronized: it guarantees the visibility of shared variables.

  Visibility means that when one thread modifies a shared variable, another thread can read the modified value.

  If a volatile variable is used appropriately, it costs less to execute than synchronized, because volatile does not cause thread context switching or scheduling.
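The visibility guarantee can be seen in a small sketch (the class and method names here are my own, for illustration): without volatile on the flag, the reader thread could spin forever on a stale cached copy.

```java
public class VolatileVisibility {
    // volatile guarantees that the write below becomes visible to the
    // reader thread; without it, the reader could keep using a stale
    // cached copy of the flag and never exit its loop.
    private static volatile boolean stop = false;

    // Returns true if the reader thread observed the write and exited.
    static boolean runDemo() {
        Thread reader = new Thread(() -> {
            while (!stop) {
                // spin until the volatile write becomes visible
            }
        });
        reader.start();
        try {
            Thread.sleep(50);   // let the reader enter its loop
            stop = true;        // volatile write: forced out to main memory
            reader.join(2000);  // the volatile read in the loop sees it
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !reader.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints true
    }
}
```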

    1. Definition and implementation principles of volatile

        The Java Language Specification defines volatile as follows: the Java programming language allows threads to access shared variables; to ensure that shared variables are updated accurately and consistently, a thread should normally ensure exclusive access to such a variable by acquiring a lock. Java provides volatile, which in some cases is more convenient than a lock.

        The implementation of volatile depends on several CPU-level terms: memory barriers, cache lines, atomic operations, cache line fill, cache hit, write hit, and write miss.

        How does volatile guarantee visibility? A write to a volatile shared variable produces assembly code with an extra lock-prefixed instruction. On a multi-core processor, the lock instruction does two things:

        1) It writes the data in the current processor's cache line back to system memory.

        2) The write-back invalidates the data cached at that memory address in every other CPU's cache.

        On a multiprocessor, a cache coherence protocol keeps each processor's cache consistent: by sniffing the data traveling over the bus, each processor checks whether its cached values have become stale. When a processor discovers that the cache line for a memory address it holds has been modified, it marks its copy of that cache line invalid; the next time it operates on that data, it re-reads it from system memory into its cache.

        The two implementation principles of volatile:

        1) A lock-prefixed instruction causes the processor's cache line to be written back to memory.

        2) Writing one processor's cache back to memory invalidates the corresponding cache line in other processors' caches.

    2. Optimizing the use of volatile

        JDK 7's java.util.concurrent package added a new queue class, LinkedTransferQueue, which optimizes enqueue and dequeue performance by appending bytes to its volatile variables. Its inner class PaddedAtomicReference does only one thing: it pads the shared volatile variable out to 64 bytes (an object reference occupies 4 bytes; 15 extra reference fields add 60 bytes, and together with the value field inherited from the parent class this totals 64 bytes). For processors whose cache line is 64 bytes wide, appending to 64 bytes improves concurrency performance, because such processors do not support partially filled cache lines. This means that if the queue's head node and tail node together occupy less than 64 bytes, the processor reads both into the same cache line, and with multiple processors each one caches the same head and tail nodes. When one processor locks the cache line to modify the head node, the cache coherence mechanism prevents the other processors from accessing the tail node in their own caches, which severely degrades enqueue and dequeue efficiency. Appending bytes ensures that the head node and tail node never share a cache line, so modifying one does not lock out the other.

     There are two scenarios in which padding to 64 bytes is unnecessary: when the processor's cache line is not 64 bytes wide; and when the shared variable is not written frequently (the padding approach forces the processor to read more bytes into the cache, which itself costs some performance; if the shared variable is rarely written, the chance of contending for a locked cache line is very small, so there is no need to avoid mutual locking by appending bytes).

  However, this byte-appending technique may stop working in Java 7 and later, because Java 7 became smarter: it eliminates or rearranges unused fields, so a different padding approach is needed.
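The padding idea described above can be sketched roughly like this (the field names are illustrative, not the JDK's actual source; note that since Java 8 the @sun.misc.Contended annotation achieves the same effect without manual padding):

```java
import java.util.concurrent.atomic.AtomicReference;

// A sketch of the padding idea behind LinkedTransferQueue's internal
// PaddedAtomicReference: the extra fields push the reference onto its
// own 64-byte cache line, so head and tail never share one.
class PaddedAtomicReference<T> extends AtomicReference<T> {
    // 15 unused object references ≈ 60 bytes of padding (at 4 bytes per
    // reference), plus the value field inherited from AtomicReference,
    // for roughly 64 bytes in total. A smart JVM may eliminate these
    // unused fields, which is exactly the caveat noted above.
    Object p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, pa, pb, pc, pd, pe;

    PaddedAtomicReference(T value) {
        super(value);
    }
}
```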

2.2 Implementation Principles and Applications of synchronized

        synchronized has long been known as a heavyweight lock, but Java SE 1.6 made various optimizations to it, so it is no longer so heavy. To reduce the performance cost of acquiring and releasing locks, Java SE 1.6 introduced biased locks, lightweight locks, and other optimizations.

        The basis on which synchronized achieves synchronization: every object in Java can serve as a lock. The concrete forms are: for an ordinary synchronized method, the lock is the current instance; for a static synchronized method, the lock is the Class object of the current class; for a synchronized block, the lock is the object named in the parentheses of the synchronized statement. When a thread tries to access a synchronized block or method, it must first acquire the lock, and it must release the lock when it exits or throws an exception.
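The three forms can be shown side by side (class and method names are my own, for illustration):

```java
public class SyncForms {
    private static int staticCount = 0;
    private final Object blockLock = new Object();
    private int count = 0;

    // Ordinary synchronized method: the lock is the current instance (this).
    public synchronized void incrementInstance() {
        count++;
    }

    // Static synchronized method: the lock is the Class object, SyncForms.class.
    public static synchronized void incrementStatic() {
        staticCount++;
    }

    // Synchronized block: the lock is the object named in the parentheses.
    public void incrementWithBlock() {
        synchronized (blockLock) {
            count++;
        }
    }

    public int getCount() { return count; }
    public static int getStaticCount() { return staticCount; }
}
```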

    The JVM implements synchronized as follows: synchronization of both methods and code blocks is based on entering and exiting a Monitor object, but the implementation details differ. A synchronized block is implemented with the monitorenter and monitorexit instructions, while a synchronized method is implemented another way, whose details the JVM specification does not spell out. However, method synchronization can also be implemented with these two instructions.

    2.2.1 The Java object header

        The lock that synchronized uses is stored in the Java object header. If the object is an array type, the virtual machine stores the object header in 3 word widths; for non-array types it uses 2 word widths. On a 32-bit virtual machine, 1 word width equals 4 bytes.

    The Java object header consists of: Mark Word (stores the object's hashCode, generational age, lock information, etc.), Class Metadata Address (a pointer to the object's class metadata), and Array Length (the length of the array, present only if the object is an array).

    2.2.2 Lock Upgrading and Comparison

        In Java SE 1.6 a lock has four states: no-lock state, biased-lock state, lightweight-lock state, and heavyweight-lock state. These states escalate as contention increases. A lock can be upgraded but never downgraded; the purpose is to improve the efficiency of acquiring and releasing locks.

    1. Biased locking

       In most cases a lock is not only free of multi-threaded contention but is also acquired repeatedly by the same thread. To make lock acquisition cheaper for that thread, biased locking was introduced. When a thread acquires the lock and enters the synchronized block, the thread ID of the lock's owner is stored in the object header and in the lock record in the stack frame; afterwards, that thread needs no CAS operations to lock and unlock when entering and exiting the block. It only needs to test whether the Mark Word in the object header holds a biased lock pointing to the current thread. If the test succeeds, the thread already holds the lock. If it fails, the thread checks whether the biased-lock flag in the Mark Word is set to 1 (indicating the object is currently biased): if the flag is not set, it uses CAS to compete for the lock; if it is set, it tries to use CAS to point the object header's biased lock at the current thread.

     (1) Revoking a biased lock

        Biased locking uses a mechanism that waits for contention before releasing the lock: only when another thread tries to compete for the biased lock does the thread holding it release it.

        Revoking a biased lock must wait for a global safepoint (a point in time at which no bytecode is executing). The JVM first suspends the thread holding the biased lock, then checks whether that thread is still alive. If it is not active, the object header is set to the no-lock state. If the thread is still alive, the stack holding the biased lock is walked to traverse the object's biased lock records, and the Mark Word of the lock records in the stack and of the object header is either re-biased toward another thread, or restored to the no-lock state, or marked as unsuitable for biased locking; finally the suspended thread is woken up.

     (2) Disabling biased locking

    Biased locking is enabled by default in Java 6 and Java 7, but it is only activated a few seconds after the application starts. The JVM parameter -XX:BiasedLockingStartupDelay=0 makes it start immediately, removing the delay. You can disable biased locking with -XX:-UseBiasedLocking=false; the program then goes straight to the lightweight-lock state.

    2. Lightweight locking

    (1) Lightweight lock acquisition

        Before a thread executes a synchronized block, the JVM creates space in the current thread's stack frame to store the lock record, and copies the Mark Word of the object header into the lock record; this copy is officially called the "Displaced Mark Word". The thread then attempts a CAS to replace the Mark Word in the object header with a pointer to the lock record. If it succeeds, the current thread acquires the lock; if it fails, another thread is contending for the lock, and the current thread tries to acquire it by spinning.

    (2) Lightweight lock release

        On lightweight unlock, an atomic CAS operation is used to replace the Displaced Mark Word back into the object header. If it succeeds, no contention occurred; if it fails, there is contention for the lock, and the lock inflates into a heavyweight lock.

        Because spinning consumes CPU, once a lock has escalated to a heavyweight lock it never returns to the lightweight state, in order to avoid useless spinning (for example, spinning while the thread that wants the lock ought to be blocked). While the lock is in this state, any other thread trying to acquire it is blocked; when the thread holding the lock releases it, those threads are woken up, and the awakened threads begin a new round of competition for the lock.

    3. Comparing the advantages and disadvantages of the locks

        Biased lock: locking and unlocking require no extra CAS cost, leaving only a nanosecond-scale gap compared with running an unsynchronized method; but if there is lock contention, revoking the lock adds extra cost. Suitable when only one thread ever accesses the synchronized block.

        Lightweight lock: competing threads spin instead of blocking, which improves response time; but a thread that never wins the lock burns CPU while spinning. Suitable when response time matters and the synchronized block executes quickly.

        Heavyweight lock: threads do not spin, so no CPU is wasted; but threads block, and response time is slow. Suitable when throughput matters and the synchronized block takes a long time to execute.

2.3 Implementation Principles of Atomic Operations

    An atomic operation is an operation that cannot be subdivided: once it begins it must run to completion, and it cannot be interrupted partway through so that another operation runs before the rest of it executes.
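A classic illustration of why this matters: counter++ looks like one operation but is actually read-modify-write, so two threads can lose updates. (The class below is my own demonstration; the final count is usually below 200,000, though a lucky run can reach it.)

```java
public class NonAtomicIncrement {
    static int counter = 0;

    // counter++ is not atomic: it is a load, an add, and a store. Two
    // threads can load the same value and both write back value + 1,
    // losing one increment.
    static int raceOnce(int perThread) {
        counter = 0;
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                counter++; // the non-atomic read-modify-write
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter; // often less than 2 * perThread
    }

    public static void main(String[] args) {
        System.out.println(NonAtomicIncrement.raceOnce(100000));
    }
}
```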

    1. Related terms

    Cache line, compare and swap (CAS), CPU pipeline, memory order violation.

    2. How processors implement atomic operations

        For a processor, an atomic operation means that at any given moment only one processor operates on the data, and the operation is indivisible. There are two common approaches: one guarantees atomicity with a bus lock, the other with a cache lock.

        Bus lock: a bus lock uses the LOCK# signal provided by the processor. When one processor asserts this signal on the bus, requests from the other processors are blocked, so the processor that asserted the signal has exclusive use of shared memory, which guarantees atomicity.

        Cache lock: cache locking was designed to optimize away the bus lock. While the bus is locked, other processors cannot work with any other data; they can only wait for the lock to be released. Locking at cache granularity reduces this impact. "Cache locking" means that if the memory area is held in the processor's cache line and is locked for the duration of the Lock operation, then when the processor writes the locked data back to memory, it does not assert the LOCK# signal on the bus. Instead it modifies the memory address internally and relies on the cache coherence mechanism to guarantee atomicity, because cache coherence prevents two or more processors from simultaneously modifying data for the same memory area: when another processor writes back data in a cache line that has been locked, that cache line is invalidated.

  However, there are two situations in which the processor does not use cache locking:

    1) The first: when the data being operated on cannot be cached inside the processor, or the operation spans multiple cache lines, the processor invokes a bus lock.

    2) The second: some processors do not support cache locking. On these processors, a bus lock is invoked even if the locked memory area fits in a cache line.

  Both of the above mechanisms can be implemented through Lock-prefixed instructions.

    3. How Java implements atomic operations

            In Java, atomic operations are guaranteed by means of locks and looping CAS.

           (1) Looping CAS: the JVM's CAS operations are implemented with the CMPXCHG instruction provided by the processor. The basic idea is to perform the CAS operation in a loop until it succeeds.
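A typical CAS loop, here built on AtomicInteger (the class name and method are my own sketch of the pattern):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasLoop {
    private final AtomicInteger value = new AtomicInteger(0);

    // Classic CAS loop: read the current value, compute the new one,
    // and retry until compareAndSet succeeds, i.e. until no other
    // thread changed the value between our read and our write.
    public int increment() {
        for (;;) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next; // CAS succeeded: our update is in place
            }
            // CAS failed: another thread got there first; loop and retry
        }
    }

    public int get() {
        return value.get();
    }
}
```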

            (2) The three problems of implementing atomic operations with CAS

                  1) ABA problem

          The solution is a version number: attach a version number to the variable and increment it on every update, so A→B→A becomes 1A→2B→3A. Since JDK 1.5, AtomicStampedReference addresses the ABA problem: its compareAndSet method checks whether the current reference equals the expected reference and whether the current stamp equals the expected stamp, and if both are equal it atomically sets the reference and the stamp to the given updated values.
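The AtomicStampedReference approach can be sketched as follows (the demo class is my own; note that compareAndSet compares references, and the string literals here are interned, so reference equality holds):

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    // Returns whether a CAS using a stale stamp succeeds after an
    // A -> B -> A sequence. With a stamped reference it should not.
    static boolean staleCasSucceeds() {
        // The stamp is a version number carried alongside the reference.
        AtomicStampedReference<String> ref =
                new AtomicStampedReference<>("A", 0);

        int stamp = ref.getStamp();                        // version 0
        ref.compareAndSet("A", "B", stamp, stamp + 1);     // A -> B, version 1
        ref.compareAndSet("B", "A", stamp + 1, stamp + 2); // B -> A, version 2

        // The reference is "A" again, but the stamp moved on to 2, so a
        // thread still holding version 0 fails its CAS.
        return ref.compareAndSet("A", "C", stamp, stamp + 1);
    }

    public static void main(String[] args) {
        System.out.println(staleCasSucceeds()); // prints false
    }
}
```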

        2) Long spin times carry a high overhead. If a CAS spin fails to succeed for a long time, it imposes a very large CPU execution cost.

          If the JVM could support the pause instruction provided by the processor, efficiency would improve somewhat.

          The pause instruction has two effects. First, it delays the pipelined execution of instructions (de-pipeline), so that the CPU does not consume too many resources; the delay depends on the implementation, and on some processors it is zero.

                    Second, it prevents the CPU pipeline from being flushed due to a memory order violation when exiting the loop, improving CPU efficiency.

        3) Atomic operation is guaranteed for only one shared variable.

          The solution is to combine multiple shared variables into a single shared variable to operate on, that is, to wrap the variables in one object and perform CAS on the object reference (for example, with AtomicReference).
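A sketch of combining two variables into one object so both change in a single CAS (class and field names are my own):

```java
import java.util.concurrent.atomic.AtomicReference;

public class CombinedCas {
    // Two logically related values wrapped in one immutable object, so
    // both can be updated together with a single CAS on the reference.
    static final class Pair {
        final int lo;
        final int hi;
        Pair(int lo, int hi) { this.lo = lo; this.hi = hi; }
    }

    private final AtomicReference<Pair> pair =
            new AtomicReference<>(new Pair(0, 100));

    // Atomically shift both bounds by delta using a CAS retry loop:
    // build a new Pair and swap it in only if no one else got there first.
    public void shift(int delta) {
        for (;;) {
            Pair current = pair.get();
            Pair next = new Pair(current.lo + delta, current.hi + delta);
            if (pair.compareAndSet(current, next)) {
                return;
            }
        }
    }

    public int lo() { return pair.get().lo; }
    public int hi() { return pair.get().hi; }
}
```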

             (3) The lock mechanism: locks guarantee that only the thread that acquired the lock can operate on the locked memory area. The JVM implements many kinds of locks, and apart from biased locking, the JVM implements lock acquisition and release with looping CAS: a thread uses a CAS loop to acquire the lock when it wants to enter a synchronized block, and uses CAS to release the lock when it exits the block.
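The "loop CAS to lock, CAS to unlock" idea can be illustrated with a minimal spin lock. This is my own sketch of the pattern, not the JVM's actual monitor code; Thread.onSpinWait (Java 9+) hints the CPU much like the pause instruction discussed above.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// A minimal spin-lock sketch: acquiring the lock is a CAS loop on a
// shared flag; releasing it is a single write that lets the next CAS win.
public class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // Spin until we atomically flip the flag from false to true.
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // Java 9+: spin-loop hint (pause on x86)
        }
    }

    public void unlock() {
        locked.set(false); // release: the next compareAndSet can succeed
    }

    public boolean isLocked() {
        return locked.get();
    }
}
```

A real implementation would add back-off or fall back to blocking, which is exactly the lightweight-to-heavyweight escalation described in section 2.2.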

2.4 Summary


Origin www.cnblogs.com/mYunYu/p/12450324.html