Concurrent Programming 1: An Overview of Thread Safety

Table of contents

1. What is thread safety?

2. Atomicity of operations: avoiding race conditions

3. Lock mechanism: built-in locks and reentrancy

4. How to use locks to protect state?

5. Liveness and performance issues in the synchronization mechanism


        The core of writing thread-safe code is managing access to state, especially state that is both shared (Shared) and mutable (Mutable). //Core: manage shared, mutable state

        The state of an object is the data stored in its state variables, such as instance or static fields.

        Whether an object needs to be thread-safe depends on whether it is accessed by multiple threads. To make an object thread-safe, a synchronization mechanism must coordinate access to its mutable state. Without that coordination, data corruption and other undesired consequences can follow. //Objects are made thread-safe through synchronization

        The main synchronization mechanism in Java is the synchronized keyword, which provides exclusive locking. The term "synchronization" also covers volatile variables, explicit locks (Explicit Lock), and atomic variables. //Synchronization addresses visibility and atomicity (ordering)
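
        As a rough illustration of the four mechanisms just listed, the sketch below keeps one counter per mechanism and a volatile flag. The class name SyncMechanismsSketch and all of its methods are hypothetical, introduced only for this example.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class for illustration: one counter per synchronization mechanism.
final class SyncMechanismsSketch {
    private long syncCount;                                   // guarded by the built-in lock on "this"
    private long lockCount;                                   // guarded by "lock"
    private final ReentrantLock lock = new ReentrantLock();   // explicit lock
    private final AtomicLong atomicCount = new AtomicLong();  // atomic variable
    private volatile boolean stopped;                         // volatile: visibility only, no atomicity

    synchronized void incrementSync() { syncCount++; }
    synchronized long getSyncCount() { return syncCount; }

    void incrementLock() {
        lock.lock();
        try { lockCount++; } finally { lock.unlock(); }
    }

    long getLockCount() {
        lock.lock();
        try { return lockCount; } finally { lock.unlock(); }
    }

    void incrementAtomic() { atomicCount.incrementAndGet(); }
    long getAtomicCount() { return atomicCount.get(); }

    void stop() { stopped = true; }
    boolean isStopped() { return stopped; }

    // Starts the given number of threads; each increments all three counters perThread times.
    static SyncMechanismsSketch run(int threads, int perThread) {
        SyncMechanismsSketch s = new SyncMechanismsSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    s.incrementSync();
                    s.incrementLock();
                    s.incrementAtomic();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        s.stop();
        return s;
    }

    public static void main(String[] args) {
        SyncMechanismsSketch s = run(4, 10_000);
        System.out.println(s.getSyncCount() + " " + s.getLockCount() + " " + s.getAtomicCount());
    }
}
```

        All three counters end up at threads * perThread, because each increment is atomic under its own mechanism. The volatile flag only guarantees that a write to stopped becomes visible to other threads; it would not make an increment atomic.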

1. What is thread safety?

        A class is thread-safe if it always behaves correctly when accessed by multiple threads, however their execution is interleaved. //What you see is what you get

public class StatelessFactorizer extends GenericServlet implements Servlet {
    public void service(ServletRequest req, ServletResponse resp) {
        //1 - extract the value from the request
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = factor(i);
        //2 - encode and send the response
        encodeIntoResponse(resp, factors);
    }
}

        The StatelessFactorizer above is stateless: it has no fields of its own and references no fields of other classes. The transient state of a computation lives only in local variables on the thread's stack, which only the executing thread can access. Because one thread's use of a stateless object cannot affect the correctness of operations in other threads, stateless objects are always thread-safe. //Without sharing, there is no thread safety problem

2. Atomicity of operations: avoiding race conditions

        Suppose we want to add a hit counter (Hit Counter) that counts the number of requests processed. The simplest approach is to add a field of type long to the Servlet and increment it each time a request is processed:

//Not thread-safe
public class UnsafeCountingFactorizer extends GenericServlet implements Servlet {
    //counter
    private long count = 0;
    public long getCount() {
        return count;
    }

    public void service(ServletRequest req, ServletResponse resp) {
        //1 - extract the value from the request
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = factor(i);
        ++count;
        //2 - encode and send the response
        encodeIntoResponse(resp, factors);
    }
}

        Although the increment ++count looks like a single operation, it is not atomic. It actually consists of three separate steps: read the value of count, add 1 to it, and write the result back to count. This is a "read-modify-write" sequence, in which the resulting state depends on the previous state. //The operation is not atomic; if the steps of two threads interleave, updates are lost

        The UnsafeCountingFactorizer above contains multiple race conditions, which make its results unreliable. The most common type of race condition is "check-then-act" (Check-Then-Act), in which a potentially stale observation is used to decide what to do next. //If the premise no longer holds, the conclusion drawn from it is generally wrong too

        For example, a thread first observes that some condition is true (file X does not exist) and then acts on that observation (creates file X). Between the observation and the action, however, the observation may have become invalid (another thread created file X in the meantime), which leads to problems such as unexpected exceptions, overwritten data, or a corrupted file. //e.g. the lazily initialized singleton problem
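
        The check-then-act race described above can be sketched with lazy initialization. The classes ExpensiveObject, UnsafeLazyInit, and SafeLazyInit are hypothetical names used only for this illustration, and making the method synchronized is just one possible fix.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for any object that is costly to create.
final class ExpensiveObject { }

final class UnsafeLazyInit {
    private ExpensiveObject instance;

    // Check-then-act race: two threads can both observe null and each create an object.
    ExpensiveObject getInstance() {
        if (instance == null) {
            instance = new ExpensiveObject();
        }
        return instance;
    }
}

final class SafeLazyInit {
    private ExpensiveObject instance;

    // The check and the act now run as one atomic unit under the lock on "this".
    synchronized ExpensiveObject getInstance() {
        if (instance == null) {
            instance = new ExpensiveObject();
        }
        return instance;
    }

    // Calls getInstance() from several threads and returns how many distinct objects were seen.
    static int distinctInstances(int threads) {
        SafeLazyInit holder = new SafeLazyInit();
        Set<ExpensiveObject> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> seen.add(holder.getInstance()));
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(8)); // prints 1
    }
}
```

        With UnsafeLazyInit, different callers may occasionally receive different instances; with SafeLazyInit, every caller receives the same one.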

        To ensure thread safety and avoid race conditions, compound operations such as "check-then-act" and "read-modify-write" must be atomic.

public class CountingFactorizer extends GenericServlet implements Servlet {
    //use an atomic class
    private final AtomicLong count = new AtomicLong(0);
    public long getCount() { return count.get(); }

    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = factor(i);
        count.incrementAndGet();
        encodeIntoResponse(resp, factors);
    }
}

        In practice, use existing thread-safe objects (such as AtomicLong) to manage the state of a class wherever possible. The possible states and state transitions of a thread-safe object are easier to reason about than those of an arbitrary object, which makes thread safety easier to maintain and verify. //Sufficient only when the state is a single variable
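
        To make the difference concrete, the sketch below increments a plain field and an AtomicLong from several threads. CounterComparison is a hypothetical class written for this example; how many updates the plain counter loses varies from run to run, so no specific value is claimed for it.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class: compares a plain counter with an AtomicLong under contention.
final class CounterComparison {
    // int rather than long so that each individual read and write is itself atomic;
    // only the increment as a whole (read-modify-write) is not.
    private int plainCount;
    private final AtomicLong atomicCount = new AtomicLong();

    // Returns { plainCount, atomicCount } after all threads have finished.
    static long[] run(int threads, int perThread) {
        CounterComparison c = new CounterComparison();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    c.plainCount++;                  // not atomic, updates can be lost
                    c.atomicCount.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return new long[] { c.plainCount, c.atomicCount.get() };
    }

    public static void main(String[] args) {
        long[] result = run(4, 100_000);
        System.out.println("plain=" + result[0] + " atomic=" + result[1]);
    }
}
```

        The atomic counter always reaches threads * perThread, while the plain counter can only be equal to or lower than that value.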

3. Lock mechanism: built-in locks and reentrancy

        When more than one variable is involved, making each of them an atomic variable is not enough to make the class thread-safe:

//Not thread-safe
public class UnsafeCachingFactorizer extends GenericServlet implements Servlet {
    //atomic variable 1
    private final AtomicReference<BigInteger> lastNumber = new AtomicReference<BigInteger>();
    //atomic variable 2
    private final AtomicReference<BigInteger[]> lastFactors = new AtomicReference<BigInteger[]>();

    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        //the two variables cannot be read together or set together atomically
        if (i.equals(lastNumber.get())) //read variable 1
            encodeIntoResponse(resp, lastFactors.get()); //read variable 2
        else {
            BigInteger[] factors = factor(i);
            lastNumber.set(i); //set variable 1
            lastFactors.set(factors); //set variable 2
            encodeIntoResponse(resp, factors);
        }
    }
}

        At this point, it is necessary to introduce a lock mechanism to ensure thread synchronization.

        Java provides a built-in locking mechanism for enforcing atomicity: the synchronized block (Synchronized Block). Every Java object can act as a lock for synchronization purposes. These locks are called built-in locks (Intrinsic Lock) or monitor locks (Monitor Lock). //This is the so-called monitor; synchronized is so common that it needs little introduction

        The problem with synchronized: it is easy to protect more code than necessary with a synchronized block. Safety is achieved, but at a cost in performance. //Coarse-grained versus fine-grained locking

        Built-in locks are reentrant, so if a thread tries to acquire a lock it already holds, the request succeeds. Reentrancy means that locks are acquired on a per-thread basis rather than a per-invocation basis. //A non-reentrant lock would make a thread block on itself
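
        A small sketch of why reentrancy matters: a synchronized method in a subclass calls the synchronized method it overrides. The class names ReentrancyWidget and LoggingReentrancyWidget are hypothetical and exist only for this example.

```java
// Hypothetical classes for illustration.
class ReentrancyWidget {
    private int calls;

    public synchronized void doSomething() {
        calls++;
    }

    public synchronized int getCalls() {
        return calls;
    }
}

class LoggingReentrancyWidget extends ReentrancyWidget {
    private boolean heldBeforeSuperCall;

    @Override
    public synchronized void doSomething() {
        // The calling thread already holds the lock on "this" at this point.
        heldBeforeSuperCall = Thread.holdsLock(this);
        // The superclass method is synchronized on the same object, so the same thread
        // acquires the lock a second time. Without reentrancy this call would block forever.
        super.doSomething();
    }

    public synchronized boolean wasHeldBeforeSuperCall() {
        return heldBeforeSuperCall;
    }

    public static void main(String[] args) {
        LoggingReentrancyWidget widget = new LoggingReentrancyWidget();
        widget.doSomething();
        System.out.println(widget.getCalls()); // prints 1
    }
}
```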

        One way to implement reentrancy is to associate each lock with an acquisition count and an owner thread. When the count is 0, the lock is considered not held by any thread. When a thread acquires a lock that is not held, the JVM records the owner and sets the acquisition count to 1. If the same thread acquires the lock again, the count is incremented; each time the thread exits a synchronized block, the count is decremented. When the count reaches 0, the lock is released. //How a reentrant lock works in principle
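
        The owner-plus-count bookkeeping described above can be sketched as a tiny lock class. This is a teaching sketch under stated assumptions, not how the JVM actually implements built-in locks: CountingLock is a hypothetical name, and it relies on synchronized with wait/notifyAll for its own internal bookkeeping.

```java
// Hypothetical teaching sketch of a reentrant lock built from an owner and a hold count.
final class CountingLock {
    private Thread owner;   // null when no thread holds the lock
    private int holdCount;  // 0 when no thread holds the lock

    synchronized void lock() {
        Thread current = Thread.currentThread();
        boolean interrupted = false;
        // Wait only while a different thread owns the lock; the owner passes straight through.
        while (owner != null && owner != current) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting, restore the flag afterwards
            }
        }
        owner = current;
        holdCount++;
        if (interrupted) {
            current.interrupt();
        }
    }

    synchronized void unlock() {
        if (owner != Thread.currentThread()) {
            throw new IllegalMonitorStateException("caller does not hold the lock");
        }
        if (--holdCount == 0) {
            owner = null;   // fully released
            notifyAll();    // wake up threads waiting in lock()
        }
    }

    synchronized int getHoldCount() {
        return holdCount;
    }

    synchronized boolean isHeldByCurrentThread() {
        return owner == Thread.currentThread();
    }

    public static void main(String[] args) {
        CountingLock lock = new CountingLock();
        lock.lock();
        lock.lock();                              // reentrant acquisition by the same thread
        System.out.println(lock.getHoldCount());  // prints 2
        lock.unlock();
        lock.unlock();
        System.out.println(lock.getHoldCount());  // prints 0
    }
}
```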

4. How to use locks to protect state?

        Locks allow the code they protect to be accessed serially, so exclusive access to shared state can be achieved through locks.

        Here are some suggestions for using locks correctly:

        (1) If synchronization is used to coordinate access to a variable, it must be used everywhere that variable is accessed, for reads as well as writes. Moreover, the same lock must be used everywhere the variable is accessed. //Both reads and writes of shared variables must hold the lock

        The reason why every object has a built-in lock is just to avoid explicitly creating lock objects. You can construct your own locking protocols or synchronization strategies to secure access to shared state, and use them throughout your program.

        (2) Every shared, mutable variable should be protected by exactly one lock, and it should be clear to maintainers which lock that is.

        (3) For every invariant that involves more than one variable, all of the variables involved must be protected by the same lock.
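
        The three suggestions above can be sketched with a small class whose invariant spans two fields. GuardedRange is a hypothetical example: both fields are guarded by the same built-in lock, and every read and write goes through a synchronized method.

```java
// Hypothetical class for illustration. Invariant: lower <= upper.
// Both fields are guarded by the built-in lock on "this".
final class GuardedRange {
    private int lower;
    private int upper;

    GuardedRange(int lower, int upper) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower must not exceed upper");
        }
        this.lower = lower;
        this.upper = upper;
    }

    // The check against upper and the write to lower happen under one lock,
    // so no other thread can change upper in between.
    synchronized void setLower(int value) {
        if (value > upper) {
            throw new IllegalArgumentException("lower must not exceed upper");
        }
        lower = value;
    }

    synchronized void setUpper(int value) {
        if (value < lower) {
            throw new IllegalArgumentException("upper must not be below lower");
        }
        upper = value;
    }

    // Reads take the same lock, so they always see a consistent pair of bounds.
    synchronized boolean contains(int value) {
        return value >= lower && value <= upper;
    }

    public static void main(String[] args) {
        GuardedRange range = new GuardedRange(0, 10);
        range.setLower(5);
        System.out.println(range.contains(7)); // prints true
        System.out.println(range.contains(3)); // prints false
    }
}
```

        If lower and upper were two separate atomic variables instead, one thread could call setLower(5) while another calls setUpper(4), and both checks could pass, leaving the range in an invalid state.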

5. Liveness and performance issues in the synchronization mechanism

        Just think, if synchronization can avoid race condition problems, why not use the keyword synchronized in every method declaration?

        In fact, applying synchronized indiscriminately can lead to excessive synchronization. Moreover, merely making every method synchronized, as Vector does, is not enough to make compound operations on a Vector atomic:

//Not atomic
if (!vector.contains(element))
    vector.add(element);
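
        One possible way to make the compound operation above atomic is client-side locking: Vector's own methods synchronize on the Vector instance, so holding that same lock around both calls closes the gap between them. The helper class VectorHelpers and its methods are hypothetical, and the approach only works if every thread that performs this operation goes through the same helper.

```java
import java.util.Vector;

// Hypothetical helper class for illustration.
final class VectorHelpers {
    // Holds the Vector's own lock across contains() and add(), making the pair atomic.
    static <E> boolean addIfAbsent(Vector<E> vector, E element) {
        synchronized (vector) {
            if (vector.contains(element)) {
                return false;
            }
            vector.add(element);
            return true;
        }
    }

    // Several threads all try to add the same values; returns the final size.
    static int demo(int threads, int values) {
        Vector<Integer> vector = new Vector<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int v = 0; v < values; v++) {
                    addIfAbsent(vector, Integer.valueOf(v));
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return vector.size();
    }

    public static void main(String[] args) {
        System.out.println(demo(8, 100)); // prints 100
    }
}
```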

        In addition, making every method synchronized can lead to liveness (Liveness) or performance (Performance) problems.

        In the code below, SynchronizedFactorizer synchronizes the entire service method, so only one thread can execute it at a time and performance is very poor. //Locking the whole method achieves thread safety, but the performance price is too high

//Thread-safe
public class SynchronizedFactorizer extends GenericServlet implements Servlet {
    //fields
    private BigInteger lastNumber;
    private BigInteger[] lastFactors;

    //locking the whole method causes a performance problem
    public synchronized void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);

        if (i.equals(lastNumber))
            encodeIntoResponse(resp, lastFactors);
        else {
            BigInteger[] factors = factor(i);
            lastNumber = i;
            lastFactors = factors;
            encodeIntoResponse(resp, factors);
        }
    }
}

        Lock optimization idea: narrow the scope of the synchronized block, which preserves the Servlet's concurrency while maintaining thread safety. Take care not to make synchronized blocks too small, and do not split an operation that should be atomic across multiple synchronized blocks. Long-running operations that do not affect shared state should be moved out of synchronized blocks so that other threads can access the shared state while they run. //Make locks as fine-grained as is safe, and move long-running code out of them

        The restructured CachedFactorizer strikes a balance between simplicity and concurrency. The code is as follows:

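//Thread-safe
//Imports assume the javax.servlet API (newer containers use jakarta.servlet instead)
import java.math.BigInteger;
import javax.servlet.GenericServlet;
import javax.servlet.Servlet;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class CachedFactorizer extends GenericServlet implements Servlet {
    //shared variables
    private BigInteger   lastNumber;
    private BigInteger[] lastFactors;
    //hit counter
    private long         hits;
    //cache hit counter
    private long         cacheHits;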

    public synchronized long getHits() {
        return hits;
    }

    public synchronized double getCacheHitRatio() {
        return (double) cacheHits / (double) hits;
    }

    public void service(ServletRequest req, ServletResponse resp) {
        //1 - extract the value from the request
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = null;
        synchronized (this) { //synchronized block 1: read and update shared state
            ++hits;
            if (i.equals(lastNumber)) {
                ++cacheHits;
                factors = lastFactors.clone();
            }
        }
        if (factors == null) {
            factors = factor(i);   //local variable, no synchronization needed
            synchronized (this) {  //synchronized block 2: update shared state
                lastNumber  = i;
                lastFactors = factors.clone();
            }
        }
        //2 - respond: long-running code is kept outside the synchronized blocks
        encodeIntoResponse(resp, factors);
    }

    void encodeIntoResponse(ServletResponse resp, BigInteger[] factors) {
    }

    BigInteger extractFromRequest(ServletRequest req) {
        return new BigInteger("7");
    }

    BigInteger[] factor(BigInteger i) {
        // Doesn't really factor
        return new BigInteger[]{i};
    }
}

        CachedFactorizer no longer uses an AtomicLong for the hit counter; it uses a plain long field instead. AtomicLong could be used, and it is very useful for atomic operations on a single variable. Here, however, synchronized blocks already make the operations atomic. Using two different synchronization mechanisms would cause confusion without any benefit in performance or safety, so atomic variables are not used. //Within one class, use a single synchronization mechanism to keep the code simple and easy to understand

        Deciding how large a synchronized block should be requires trade-offs among several design requirements: safety (which must be met), simplicity, and performance. Simplicity and performance sometimes conflict, but a reasonable balance between them can usually be found. When implementing a synchronization strategy, do not blindly sacrifice simplicity for performance (doing so may compromise safety). //Strive for a balance between safety and performance

        Whether the operation is computationally intensive or may block, holding a lock for too long causes liveness or performance problems. Therefore, do not hold a lock while performing a long computation or an operation that may not complete quickly (for example, network I/O or console I/O). //Long-running code should not hold locks
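
        A common way to follow this advice is to copy the shared state while holding the lock and then perform the slow work on the copy after releasing it. The sketch below assumes a hypothetical MessageBuffer class whose slowSink parameter stands in for any slow operation such as network or console I/O.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical class for illustration: buffered messages are flushed to a slow sink.
final class MessageBuffer {
    private final List<String> pending = new ArrayList<>(); // guarded by "this"

    synchronized void add(String message) {
        pending.add(message);
    }

    // Returns the number of messages handed to the sink.
    int flush(Consumer<String> slowSink) {
        List<String> batch;
        synchronized (this) {
            // Short critical section: only copy and clear the shared list.
            batch = new ArrayList<>(pending);
            pending.clear();
        }
        // The slow part runs without the lock, so other threads can keep calling add().
        for (String message : batch) {
            slowSink.accept(message);
        }
        return batch.size();
    }

    public static void main(String[] args) {
        MessageBuffer buffer = new MessageBuffer();
        buffer.add("first");
        buffer.add("second");
        int written = buffer.flush(System.out::println); // console I/O happens outside the lock
        System.out.println(written); // prints 2
    }
}
```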

        That concludes this article.

Reposted from: blog.csdn.net/swadian2008/article/details/125164826