Java JUC concurrent package (5) - thread safety

Key points from "Java Concurrency in Practice" will be recorded here.

At the heart of writing thread-safe code is managing access to state, especially shared, mutable state. An object's state is the data stored in its state variables, such as instance or static fields. An object's state may also include fields of other objects it depends on; for example, the state of a HashMap is stored not only in the HashMap object itself but also in many Map.Entry objects.

"Shared": The variable can be accessed by multiple threads at the same time;
"Mutable": The value of the variable can change during its lifetime.

When multiple threads access a state variable and at least one of them writes to it, a synchronization mechanism must be used to coordinate the threads' access to that variable. The main synchronization mechanism in Java is the synchronized keyword, which provides exclusive locking, but "synchronization" also includes volatile variables, explicit locks, and atomic variables.
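
The code below is a brief, hedged sketch (the class, field, and method names are mine, not from the book) of what each of these mechanisms looks like in practice:

import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only: one field per synchronization mechanism mentioned above.
public class SyncMechanisms {
    private long intrinsicCount = 0;                            // guarded by the built-in lock
    private volatile boolean shutdownRequested = false;         // volatile: visibility, not atomicity
    private final AtomicLong atomicCount = new AtomicLong(0);   // atomic variable
    private final ReentrantLock lock = new ReentrantLock();     // explicit lock
    private long explicitCount = 0;                             // guarded by lock

    public synchronized void incrementWithIntrinsicLock() {
        ++intrinsicCount;                   // exclusive access while holding the built-in lock
    }

    public void requestShutdown() {
        shutdownRequested = true;           // the write becomes visible to other threads
    }

    public void incrementAtomically() {
        atomicCount.incrementAndGet();      // atomic read-modify-write
    }

    public void incrementWithExplicitLock() {
        lock.lock();
        try {
            ++explicitCount;
        } finally {
            lock.unlock();                  // always release the explicit lock in finally
        }
    }
}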

The program will fail if proper synchronization is not used when multiple threads access the same mutable state variable. There are three ways to fix this problem:

  • Do not share this state variable between threads
  • Make the state variable immutable
  • Use synchronization when accessing state variables

Note: is a thread-safe program necessarily one composed entirely of thread-safe classes? No. A program composed entirely of thread-safe classes is not necessarily thread-safe, and a thread-safe program may contain classes that are not thread-safe.

1. What is thread safety

Often, the need for thread safety is not derived from the direct use of threads, but rather from frameworks like servlets.

Example: a stateless servlet (factorization)

public class StatelessFactorizer implements Servlet {
    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = factor(i);
        encodeIntoResponse(resp, factors);
    }
}

The above code is: thread safe

To understand thread safety well, we must first know what stateful and stateless objects are.
- Stateful: the object can store data. A stateful object has instance variables that hold data, i.e. it has data members, and such objects are not inherently thread-safe.
- Stateless: the object only performs operations and cannot store data. A stateless object has no instance variables (or only data members that are read-only, immutable objects), cannot hold data between calls, and is thread-safe; in other words, it has only methods and no mutable data members.

Now back to the example above. Like most servlets, StatelessFactorizer is stateless: it has no fields and references no fields of other classes. The temporary state used during the computation exists only in local variables on the executing thread's stack and is accessible only to that thread. One thread accessing a StatelessFactorizer cannot affect the result computed by another thread accessing the same StatelessFactorizer, because the two threads share no state; it is as if they were accessing different instances. Since the actions of a thread accessing a stateless object cannot affect the correctness of operations in other threads, stateless objects are always thread-safe.

Most servlets are stateless, and thread safety becomes an issue only when the servlet needs to save some information while processing a request.

Summary:

  • Stateless objects must be thread-safe
  • Constants are always thread-safe because only read operations exist
  • Creating a new instance before each method call is thread-safe because shared memory is not accessed
  • Local variables are thread-safe because each execution of a method creates its local variables in that thread's own stack frame, so they are not shared resources; local variables include method parameters and variables declared inside the method (see the sketch after this list).
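
The last point can be illustrated with a tiny, hedged sketch (the class and method names are made up for this example): every intermediate value lives in a local variable on the calling thread's stack, so concurrent calls cannot interfere with each other.

import java.math.BigInteger;

// Illustrative sketch: all temporary state is held in local variables,
// so each thread works on its own copies and no synchronization is needed.
public class LocalStateExample {
    public BigInteger sumOfSquares(BigInteger a, BigInteger b) {
        BigInteger aSquared = a.multiply(a); // local variable, confined to this thread's stack
        BigInteger bSquared = b.multiply(b); // local variable, confined to this thread's stack
        return aSquared.add(bSquared);
    }
}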

2. Atomicity

Example: Servlet that counts processed requests without synchronization

public class UnsafeCountingFactorizer implements Servlet {
    private long count = 0;

    public long getCount() { return count; }

    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = factor(i);
        ++count;
        encodeIntoResponse(resp, factors);
    }
}

The above code is: thread unsafe

Although the increment operation ++count has compact syntax, it is not atomic and therefore does not execute as a single, indivisible operation. It actually consists of three separate operations: read the value of count, add 1 to it, and write the result back to count.
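
The lost-update problem this causes can be made concrete with a hedged sketch (the class and the deliberately expanded increment are illustrative, not from the book): if two threads both read the same value, both add 1, and both write the result back, one increment is lost.

// Illustrative sketch of a lost update with an unsynchronized counter.
public class LostUpdateDemo {
    private static long count = 9;

    public static void main(String[] args) throws InterruptedException {
        Runnable increment = () -> {
            long read = count;     // 1. read the current value (both threads may read 9)
            long added = read + 1; // 2. add one
            count = added;         // 3. write the result back (both threads may write 10)
        };
        Thread a = new Thread(increment);
        Thread b = new Thread(increment);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(count); // may print 10 instead of the expected 11
    }
}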

In a web service, if this counter were used to generate a sequence of values or unique object identifiers, returning the same value from multiple calls could cause serious data integrity problems. This situation leads to a very important concept: the race condition.

Race condition: an incorrect result that arises because the correctness of a computation depends on the relative timing of multiple threads.

Race condition

The most common type of race condition is "check-then-act", in which a potentially stale observation is used to decide what to do next.
For example: you first observe that a condition is true (e.g. file X does not exist), then take the corresponding action based on that observation (create file X). But between observing the result and starting to create the file, the observation can become invalid (another thread created file X in the meantime), causing various problems (an unexpected exception, overwritten data, a corrupted file, etc.).
The example above is an instance of a check-then-act race condition.

The essence of most race conditions: making a judgment or performing a calculation based on a potentially invalid observation.
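
The file example above can be sketched in code (the file name and method are hypothetical, shown only to make the check and the act visible as two separate steps):

import java.io.File;
import java.io.IOException;

// Illustrative sketch of a check-then-act race on file creation.
public class CheckThenActFile {
    public void createIfAbsent() throws IOException {
        File file = new File("X.txt");              // hypothetical file name
        if (!file.exists()) {                       // check: this observation may already be stale
            // another thread may create X.txt right here
            boolean created = file.createNewFile(); // act: based on a possibly invalid observation
            if (!created) {
                // another thread won the race between our check and our act
            }
        }
    }
}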

Race conditions in lazy initialization

A common use case for check-then-act is lazy initialization. The purpose of lazy initialization is to defer initializing an object until it is actually needed, while ensuring it is initialized only once.

Example: Race conditions in lazy initialization (actually a type of lazy loading in the singleton pattern)

public class LazyInitRace {
    private ExpensiveObject instance = null;

    public ExpensiveObject getInstance() {
        if(instance == null)
            instance = new ExpensiveObject();
        return instance;
    }
}

The above code is: thread unsafe

LazyInitRace contains a race condition that can break the correctness of the class. Suppose threads A and B execute getInstance at the same time. A sees that instance is null, so it creates a new ExpensiveObject. B also checks whether instance is null; whether it is null at that moment depends on unpredictable timing, including how the threads are scheduled and how long it takes A to construct the ExpensiveObject and assign it to instance. If instance is still null when B checks, the two calls to getInstance may return different objects, even though getInstance is supposed to always return the same instance.

Race conditions do not always produce errors; some unlucky execution timing is also required. However, race conditions can cause serious problems. Suppose LazyInitRace were used to initialize an application-wide registry: if different instances are returned from multiple calls, either some registration information would be lost, or different parts of the application would see inconsistent views of the same set of registered objects.
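
One common fix, sketched here under the assumption that full mutual exclusion is acceptable (this is not a listing from the book), is to make the check-then-act sequence atomic by declaring getInstance synchronized, so only one thread at a time can perform the check and the initialization.

// Illustrative sketch: synchronizing getInstance makes the check-then-act atomic,
// so at most one ExpensiveObject is ever created.
public class SafeLazyInit {
    private ExpensiveObject instance = null;

    public synchronized ExpensiveObject getInstance() {
        if (instance == null)
            instance = new ExpensiveObject();
        return instance;
    }
}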

Compound operation

Atomic: operations A and B are atomic with respect to each other if, from the perspective of the thread executing A, when another thread executes B, either all of B has executed or none of it has.
Atomic operation: an operation that is atomic with respect to all operations (including itself) that access the same state.
Compound operation: a sequence of operations that must be executed atomically in order to remain thread-safe.

To make the counter above thread-safe, replace the long field with an existing thread-safe class from the atomic package.

Example: use a variable of type AtomicLong to count the number of processed requests (AtomicLong is a thread-safe class that wraps a long value)

public class CountingFactorizer implements Servlet {
    private final AtomicLong count = new AtomicLong(0);

    public long getCount() { return count.get(); }

    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = factor(i);
        count.incrementAndGet();
        encodeIntoResponse(resp, factors);
    }
}

The above code is: thread safe

A number of atomic variable classes are included in the java.util.concurrent.atomic package for performing atomic state transitions on numbers and object references.

Locking mechanism

Imagine the following situation: we want to factor the value in each incoming request and cache the most recent result, so that when two consecutive requests ask for the same value, the previous result can be reused without recomputing it. To implement this caching strategy, two pieces of state must be saved: the most recently factored number and its factors.

Example: Cache the result of its most recent computation without sufficient atomicity guarantees

public class UnsafeCachingFactorizer implements Servlet {
    private final AtomicReference<BigInteger> lastNumber = new AtomicReference<BigInteger>();
    private final AtomicReference<BigInteger[]> lastFactors = new AtomicReference<BigInteger[]>();

    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        if (i.equals(lastNumber.get()))
            encodeIntoResponse(resp, lastFactors.get());
        else {
            BigInteger[] factors = factor(i);
            lastNumber.set(i);
            lastFactors.set(factors);
            encodeIntoResponse(resp, factors);
        }
    }
}

The above code is: thread unsafe

Under some execution timings, UnsafeCachingFactorizer can break this invariant. Even though each call to set on an atomic reference is atomic, there is no way to update lastNumber and lastFactors at the same time: if only one of them has been modified, another thread can observe the window between the two updates and see the invariant broken. Likewise, there is no guarantee of reading both values at the same time: while thread A is reading the two values, thread B may modify them, so A can also see the invariant broken.

Built-in lock

Java provides a built-in locking mechanism to support atomicity: the synchronized block, introduced by the synchronized keyword.
A synchronized block consists of two parts: a reference to the object that serves as the lock, and the block of code protected by that lock.
A method declared with the synchronized modifier is a synchronized block spanning the entire method body, whose lock is the object on which the method is invoked; a static synchronized method uses the Class object as the lock.

synchronized (this) {
    // synchronized code
}
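
The three common forms and the lock each one uses can be sketched as follows (the class and method names are illustrative):

// Illustrative sketch of which object serves as the lock in each synchronized form.
public class LockForms {
    private final Object lock = new Object();

    public void blockForm() {
        synchronized (lock) {   // lock: the explicitly named object
            // guarded code
        }
    }

    public synchronized void instanceForm() {
        // lock: this (the instance on which the method is invoked)
    }

    public static synchronized void staticForm() {
        // lock: LockForms.class (the Class object)
    }
}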

Every Java object can act as a lock for synchronization; such a lock is called a built-in lock (intrinsic lock) or a monitor lock. A thread automatically acquires the lock before entering a synchronized block and automatically releases it when leaving the block. The only way to acquire a built-in lock is to enter a synchronized block or method protected by that lock.

Java's built-in lock is a mutual exclusion lock (mutex), which means that at most one thread can hold the lock at a time. When thread A tries to acquire a lock held by thread B, A must wait or block until B releases it. If B never releases the lock, A will wait forever.
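
With a built-in lock, the broken invariant in UnsafeCachingFactorizer can be restored by guarding both fields with the same lock. Below is a minimal, hedged sketch in the same abbreviated Servlet style as the listings above (the helper methods are assumed as before); note that synchronizing the entire service method is safe but serializes all requests, so it trades away concurrency.

// Illustrative sketch: one lock guards both fields, so the check and the
// update of lastNumber/lastFactors happen atomically.
public class SynchronizedCachingFactorizer implements Servlet {
    private BigInteger lastNumber;
    private BigInteger[] lastFactors;

    public synchronized void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        if (i.equals(lastNumber)) {
            encodeIntoResponse(resp, lastFactors);
        } else {
            BigInteger[] factors = factor(i);
            lastNumber = i;
            lastFactors = factors;
            encodeIntoResponse(resp, factors);
        }
    }
}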

Reentrancy

My take is that reentrancy mainly matters for the parent-class/subclass case, to avoid problems when a subclass overrides a synchronized method and then calls the parent class's version.

When a thread requests a lock held by another thread, the requesting thread blocks. However, since built-in locks are reentrant, if a thread attempts to acquire a lock that is already held by itself, the request will succeed. One way to implement reentrancy is to associate an acquisition count with an owner thread for each lock. When the count value is 0, the lock is considered not to be held by any thread. When a thread requests a lock that is not held, the JVM will note the holder of the lock and set the acquisition count to 1. If the same thread acquires the lock again, the counter is incremented, and when the thread exits the synchronized block, the counter is decremented accordingly. When the count value is 0, the lock will be released.

Example: This code will deadlock if the built-in lock is not reentrant

public class Widget {
    public synchronized void doSomething(){
        // code ...
    }
}

public class LoggingWidget extends Widget{
    public synchronized void doSomething(){
        super.doSomething();
    }
}

The above code is: thread safe

In the code above, the subclass overrides the synchronized method of the parent class and then calls the parent class's version. If built-in locks were not reentrant, this code would deadlock: the doSomething methods in Widget and LoggingWidget are both synchronized, so each acquires the lock on the Widget instance before executing. Without reentrancy, the call to super.doSomething would be unable to acquire the lock because it is already held, and the thread would block forever, waiting for a lock it can never acquire.
