"Insights" caused by locks in JUC programming

"Insights" caused by locks in JUC programming

Early in the morning, I found myself wondering: what are locks for? When should a lock be used? I thought about it a lot, and everything below is my thinking process. Everyone is welcome to discuss and offer corrections.

1. What is JUC programming?

First of all, you have to know what JUC programming is. JUC programming refers to Java concurrent programming; "JUC" is short for the java.util.concurrent package.

Now that JUC programming has been described as concurrent programming, a question naturally comes up: what is the difference between parallelism and concurrency? Why isn't it called parallel programming?

I think parallelism refers to multiple tasks truly executing at the same instant, on different CPU cores. Concurrency refers to multiple threads contending for the same resources within the same period of time. What we call an application's concurrency is the number of tasks or requests it can handle at the same time. JUC programming is about handling concurrency within an application.

2. When is it necessary to use locks in JUC programming?

In Java concurrent programming, you need to use locks (such as Java's synchronized keyword or the lock classes provided in the java.util.concurrent package) to manage multiple threads' access to shared resources and ensure thread safety. Here are some common situations where locks are needed (a sketch of the first situation follows below):

  1. Multiple threads modify shared variables: when multiple threads access and modify shared variables at the same time, locks are needed to protect those variables and prevent data races and inconsistent state.

  2. Critical section protection: certain sections of code, called critical sections, must be accessible by only one thread at a time. Locks implement mutually exclusive access to critical sections, ensuring that only one thread executes the code inside at any moment.

  3. Coordinated multi-threaded operations: in a multi-threaded environment, it is sometimes necessary to coordinate multiple threads so that they execute in the desired order. Locks can implement this coordination, for example together with the wait() and notify() or await() and signal() methods.

  4. Producer-consumer problem: the producer thread generates data and puts it into a shared queue, while the consumer thread takes data out of the queue. Locks control concurrent access to the queue to prevent data races and queue overflows.

  5. Avoiding deadlock: lock usage can also be managed to avoid deadlock. By carefully ordering how locks are acquired and released, the risk of deadlock can be reduced.

  6. Updating references to immutable objects: immutable objects themselves are generally thread-safe. But when multiple threads need to swap in a new immutable object, that is, update a shared reference, the update still needs a lock or another atomicity guarantee to stay thread-safe.

  7. Inter-thread communication: threads in a multi-threaded application may need to communicate or synchronize with each other. Locks and condition variables enable efficient communication and synchronization between threads.

In short, locks exist to protect shared resources. Without shared resources, there would be no need for a lock at all.
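
To make situation 1 above concrete, here is a minimal sketch of my own (the class name and loop counts are made up for illustration): two threads increment a shared counter, and because count++ is really three steps (read, add, write), the synchronized keyword is what keeps the final value correct.

public class CounterExample {

    private int count = 0;

    // synchronized makes the read-modify-write one critical section
    public synchronized void increment() {
        count++;
    }

    public synchronized int getCount() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        CounterExample counter = new CounterExample();
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // with synchronized this always prints 20000; without it,
        // lost updates make the result unpredictable
        System.out.println(counter.getCount());
    }
}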

3. What is JMM?

Now that shared resources have been mentioned, I also want to talk about the JMM I learned. JMM stands for the Java Memory Model. Remember, it is the Java memory model, not the JVM (Java Virtual Machine). They are like Lei Feng and Leifeng Pagoda: do they sound related? They have nothing to do with each other.

3.1 JMM memory model

Oh, so you all know JMM is the Java memory model; then you also need to know what it looks like. The rough model I have in mind is a hierarchy: CPU registers, then CPU caches, then main memory.

CPU Register

CPU registers are integrated inside the CPU itself. Performing operations on registers is orders of magnitude faster than performing them on main memory.

CPU Cache Memory

CPU Cache Memory is the CPU cache, often referred to as the L2 (level 2) cache. Reading from memory is very fast compared with reading from disk, but it is still orders of magnitude slower than the CPU. Therefore, a multi-level cache is introduced between the CPU and main memory to act as a buffer.

Main Memory

Main Memory is the main memory, which is much larger than the L1 and L2 caches. Note: some high-end machines also have an L3 (level 3) cache.

3.2 Cache consistency issue

The three things in the middle of the diagram above are not dog bowls, they are caches! Since the CPU executes far faster than main memory can serve it, there has to be a solution, and the solution is to use caches as a buffer zone. The idea is similar to other caching products, I think; peak shaving with message queues follows pretty much the same idea (dog's head to save my life). Now the efficiency problem is solved, but one problem remains: cache consistency. Speed is useless if the data isn't correct! So I went to look into cache consistency solutions.

See below:

Did you notice anything? There is just one extra thing: a protocol. Since the caches cannot agree with each other on their own but still need to communicate, they sign a protocol (a cache coherence protocol, such as MESI) to resolve the problem.

3.3 Processor optimization and instruction reordering

Adding caches solves the speed mismatch between the CPU and main memory, but the processor can be optimized further: to make the fullest use of its internal execution units, the processor may execute the input code out of order. This is processor optimization.

Processor reordering generally involves the following three concepts:

  1. Write reordering: write operations execute in an order different from the program order. This can make one write become visible before a write the same thread issued earlier, breaking data consistency.
  2. Read reordering: read operations execute in an order different from the program order. This can result in a thread reading stale data because its read was moved ahead of another thread's write.
  3. Memory barriers: the processor can insert memory barriers to constrain the order of instruction execution and ensure that certain operations are not reordered.

Here is a simple Java code example that demonstrates the problems that processor reordering can cause:

public class ReorderingExample {

    private static int x = 0;
    private static int y = 0;
    private static int a = 0;
    private static int b = 0;

    public static void main(String[] args) {
        Thread thread1 = new Thread(() -> {
            a = 1;
            x = b;
        });

        Thread thread2 = new Thread(() -> {
            b = 1;
            y = a;
        });

        thread1.start();
        thread2.start();

        try {
            // join() waits for the thread to finish before the current
            // (main) thread continues
            thread1.join();
            thread2.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        System.out.println("x = " + x);
        System.out.println("y = " + y);
    }
}

In this example, the two threads modify a and b, and set x to b and y to a respectively. In theory, at least one of x and y should end up as 1 (the possible results are (0, 1), (1, 0), and (1, 1)). However, due to processor reordering, the actual output can be x = 0 and y = 0, which violates our expectations.

To solve this problem, you can use mechanisms such as the volatile keyword or memory barriers to tell the processor not to reorder. This ensures the code executes in the order the programmer intended.
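
A sketch of that fix for the ReorderingExample above: declaring a and b volatile is enough, because under the JMM a volatile write cannot be reordered with a subsequent volatile read, so the x = 0, y = 0 outcome is ruled out.

public class ReorderingExample {
    // volatile forbids the store-load reordering between the write to
    // a (or b) and the subsequent read of the other variable
    private static volatile int a = 0;
    private static volatile int b = 0;
    // x, y, and main(...) stay exactly as in the version above
}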

3.4 Shared memory issues

Okay, okay, let's take this a little further. What we have really been circling around is the shared memory problem. So how is memory shared? Let me modify the diagram I drew.

as follows:

The concrete logic is rather fiddly, and I can't explain it all here. Suppose there are four variables A, B, C, and D. To share them, they are all stored in main memory. But being shared means multiple threads are allowed to operate on and modify them. A modification usually works like this: the thread first copies the variable into its own local (working) memory, modifies the copy there, and then writes the copy back to main memory. This generally involves the three properties of operations in the JMM:

  • Atomicity: a group of operations cannot be divided; it must execute as an indivisible whole so that other threads cannot sneak in halfway (a lock-free sketch follows this list).
  • Visibility: when one thread modifies the value of a shared variable, other threads can immediately see the modification.
  • Orderliness: the processor should not get clever and reorder; the code must execute in the order we intend.
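
For atomicity specifically, java.util.concurrent also offers lock-free tools. A minimal sketch of my own using AtomicInteger (class name and counts are illustrative): incrementAndGet performs the whole read-modify-write atomically via CAS, so no explicit lock is needed.

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicityExample {
    // CAS-based counter: the increment cannot be split by another thread
    private static final AtomicInteger count = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                count.incrementAndGet(); // atomic ++
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(count.get()); // reliably prints 20000
    }
}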

4. Use of distributed locks

First, think about what kinds of locks there are in JUC, and how adding locks ensures safe concurrency.

  • Biased Locking: biased locking is an optimization of lock operations. It assumes that in most cases a lock is repeatedly acquired by the same thread, so after the first acquisition the lock is marked as "biased" toward that thread. When the same thread requests the lock again, it does not need to compete for it, which improves performance. Only when another thread contends for the lock is the biased lock upgraded to a lightweight or heavyweight lock.
  • Lightweight Lock: a lightweight lock reduces the cost of lock operations under mild contention. It is based on the CAS (Compare-And-Swap) operation and attempts to atomically upgrade the lock mark from the biased state to the lightweight state. If contention becomes fierce, the lightweight lock is upgraded to a heavyweight lock.
  • Heavyweight Lock: the traditional lock implementation, which uses the operating system's mutex to control multi-threaded access to shared resources. When multiple threads compete for the lock, one thread obtains it and the others block.
  • Fair Lock: a fair lock grants the lock in the order in which threads requested it, first come, first served. Fairness adds contention overhead but guarantees that no thread starves.
  • Pessimistic Locking: pessimistic locking assumes conflicts will occur under concurrent access, so it always acquires the lock before touching the shared resource to guarantee exclusivity. synchronized and ReentrantLock are pessimistic lock implementations.
  • Optimistic Locking: optimistic locking assumes conflicts will not occur, so it does not block threads; instead it checks at update time whether an identifier such as a version number has changed. Optimistic locking is usually implemented with version numbers or timestamps; the CAS operation is one implementation.
  • ReentrantLock: a reentrant lock, similar to synchronized but with more features, such as interruptible and timed lock acquisition (see the sketch after this list).
  • ReentrantReadWriteLock: a reentrant read-write lock that lets multiple threads read shared data at the same time but allows only one thread to write.
  • StampedLock: a more flexible read-write lock that supports optimistic reads, pessimistic reads, and write operations.
  • Semaphore: a semaphore, used to limit how many threads access a resource at the same time.
  • CountDownLatch: a countdown counter, used to wait for multiple threads to finish a task.
  • CyclicBarrier: a cyclic barrier, used to make multiple threads wait until all reach a certain point before continuing together.
  • Phaser: a phased barrier, a more advanced barrier that can be divided into multiple phases, each with a different number of participants.
  • Exchanger: a synchronization point for exchanging data between two threads.
  • LockSupport: a thread-blocking utility for parking and unparking threads.
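
As a small taste of the list above, here is a minimal ReentrantLock sketch of my own (the class and method names are made up): the lock/unlock pair goes in try/finally so the lock is released even if the critical section throws.

import java.util.concurrent.locks.ReentrantLock;

public class Account {
    // non-fair by default; new ReentrantLock(true) builds a fair lock
    private final ReentrantLock lock = new ReentrantLock();
    private int balance = 0;

    public void deposit(int amount) {
        lock.lock(); // blocks until acquired; lockInterruptibly()/tryLock() are the interruptible/timed variants
        try {
            balance += amount; // critical section
        } finally {
            lock.unlock(); // always release in finally
        }
    }
}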

Is that a lot? Hard to digest? Then go learn JUC programming, brothers. I am also a rookie, and I also want to become a master of concurrent programming.

Now I want to talk about a problem I studied:

4.1 Why do concurrent programming generally use distributed locks instead of local locks?

Locks are often used in concurrent programming to protect shared resources or critical sections to ensure that operations between multiple threads or processes are orderly. In a stand-alone environment, local locks (such as the synchronized keyword in Java or ReentrantLock) are usually sufficient and can be used to synchronize access between threads. However, in distributed systems, using local locks may encounter some problems, so distributed locks are more commonly used. Here are some reasons:

  1. Multi-node coordination: in a distributed system, threads on different nodes may need to coordinate and synchronize. A local lock is only valid within a single JVM process and cannot coordinate across JVMs, while a distributed lock lets concurrent operations be coordinated across nodes.

  2. Multiple processes or services: distributed systems often consist of multiple processes or services, which may run on different servers. A local lock is only valid within a single process and cannot synchronize across processes.

  3. Resource competition: distributed systems usually have multiple nodes accessing shared resources (databases, files, caches, and so on) at the same time. Local locks cannot arbitrate resource competition between nodes.

  4. Fault tolerance: distributed locks are usually designed to be fault tolerant, so that the failure of a node or of the lock service does not cause deadlock or data inconsistency.

  5. Horizontal scaling: distributed locks scale horizontally to handle high concurrency, while the performance of a local lock is confined to a single process.

Although local locks have the advantages of high performance and low overhead in a stand-alone environment, in distributed systems, concurrent access to multiple nodes and resource competition usually need to be considered, and in this case it is more appropriate to use distributed locks. Distributed locks are usually implemented based on distributed storage or coordination services, such as ZooKeeper, Redis, etc. They provide a reliable way to manage locks and ensure thread safety and synchronization in a distributed environment. Therefore, choosing local locks or distributed locks depends on specific application scenarios and requirements.

So is there no need to learn local locks? Don't think that way, brothers: without understanding local locks, you won't understand how to play with distributed locks either (again, dog's head to save my life). Let me talk about what I learned about Redis locks.

4.2 Redis distributed locks

Let me first talk about a practical problem I encountered while doing the project.

Each person develops on a cloud desktop, and everyone's Redis shares one database. The framework the company uses also has problems and does not handle concurrency, so when scheduled tasks run, the same data gets processed multiple times.

If multiple machines share one database, and the applications on these machines run in different processes or on different servers, then local locks usually cannot work effectively, because a local lock is only effective within a single process. So, consider a Redis distributed lock.

  1. Introduce a Redis client dependency: add a Redis client, such as Spring Data Redis or Jedis, to the project's dependency management.

  2. Acquire the lock: before the scheduled task executes, try to acquire the lock. This can be done with the Redis SETNX command (SET if Not eXists). If the return value is 1, the lock was obtained and the scheduled task can run; if the return value is 0, the lock is held by another thread or process and this run must wait or be skipped.

  3. Execute the scheduled task's logic after acquiring the lock.

  4. Release the lock: after the task finishes, the lock must be released manually so that other threads or processes can acquire it. The lock can be deleted with the Redis DEL command.

    The following is sample code that demonstrates how to use a Redis distributed lock to keep a scheduled task from running multiple times:

    @Autowired
    private RedisTemplate<String, String> redisTemplate;

    public void scheduledTask() {
        // try to acquire the lock, setting a 60-second expiration atomically
        Boolean lockAcquired = redisTemplate.opsForValue()
                .setIfAbsent("task_lock", "locked", 60, TimeUnit.SECONDS);
        if (Boolean.TRUE.equals(lockAcquired)) {
            try {
                // the scheduled task's logic
                // ...
            } finally {
                // release the lock
                redisTemplate.delete("task_lock");
            }
        } else {
            // lock not acquired: another node is running the task
            return;
        }
    }
    

    In the example above, we use RedisTemplate to operate on Redis. First we try to acquire the lock through the setIfAbsent method, with a lock expiration time of 60 seconds. If the return value is true, the lock was acquired and the scheduled task's logic runs; if the return value is false, the lock is already held by another thread or process, and the method returns immediately.

    After the scheduled task finishes, the lock is released manually through the delete method.

    Please note that when using Redis distributed locks, you need to consider the lock expiration time and lock release under abnormal circumstances to ensure the stability and correctness of the system.

    Key points: 1. A Redis lock like the one above can actually deadlock the program: if the process dies (a power outage, say) after setting the lock but before setting its expiration time, the lock is never released. That is why acquiring the lock must set the value and the expiration atomically, as setIfAbsent with a timeout does in a single command. Releasing the lock has a matching pitfall: to avoid deleting someone else's lock, you store a UUID as the lock value when acquiring it and check that the value is your own before deleting. But if the check and the delete are two separate steps, the network can stall right after the check, the lock can expire and be grabbed by another client, and your delayed delete then releases their lock. So check-and-delete must also be atomic, which is generally done with a Lua script.
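
    A hedged sketch of that atomic release using Spring Data Redis (the key name, the uuid parameter, and the wiring are illustrative, reusing the redisTemplate field from above): the Lua script compares the stored UUID and deletes the key only on a match, and Redis runs the whole script atomically, closing the gap between the ownership check and the DEL.

    // imports assumed: org.springframework.data.redis.core.script.DefaultRedisScript,
    //                  java.util.Collections
    private static final String UNLOCK_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "  return redis.call('del', KEYS[1]) " +
            "else return 0 end";

    public void unlock(String lockKey, String myUuid) {
        DefaultRedisScript<Long> script = new DefaultRedisScript<>(UNLOCK_SCRIPT, Long.class);
        // the script executes atomically inside Redis
        Long deleted = redisTemplate.execute(script, Collections.singletonList(lockKey), myUuid);
        // deleted == 1: we released our own lock; 0: it expired or belongs to someone else
    }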

    2. Is the locking code above perfect even then? Imagine this scenario: the expiration time is 30 s, thread A runs for more than 30 s without finishing, and the lock expires automatically. Thread B then acquires the lock, and two threads hold it at the same time. This is the "renewal" problem: while A has not finished, its lease should be renewed before it expires, and the lock should only be released once execution completes. How? The thread that acquires the lock can start a daemon thread that keeps renewing the lock that is about to expire. (Deleting a lock that is no longer yours at unlock time is really the same renewal issue.) It is recommended to use the distributed lock provided by Redisson, whose watchdog mechanism automatically extends the lock's expiration time.
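
    A minimal Redisson sketch (the client setup, address, and key name are illustrative): calling lock() with no lease time enables the watchdog, which by default keeps extending a 30-second lease while the owning thread is still running.

    // imports assumed: org.redisson.Redisson, org.redisson.api.RLock,
    //                  org.redisson.api.RedissonClient, org.redisson.config.Config
    Config config = new Config();
    config.useSingleServer().setAddress("redis://127.0.0.1:6379"); // illustrative address
    RedissonClient redisson = Redisson.create(config);

    RLock lock = redisson.getLock("task_lock");
    lock.lock(); // no lease time: the watchdog renews the lock until unlock()
    try {
        // scheduled task logic
    } finally {
        lock.unlock(); // Redisson refuses to unlock a lock the thread does not hold
    }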

    ps: synchronized is a local lock; it only works inside a single JVM process. Across distributed services, a distributed lock is required.

Okay, that's everything I've thought of, brothers, and this is where I'll stop writing. The more I learn, the more I get into it, and the more I want to keep learning; it's a cycle. No more words, back to studying.

Origin: blog.csdn.net/qq_45925197/article/details/132776270