Notes on a Spring Data JPA pitfall: caching and snapshots

While fixing a bug in our group's project recently, I found that it was caused by improper use of Spring Data JPA. I did not know much about Spring Data JPA, so after fixing the bug I decided to write up what I had looked into along the way. This post is aimed mainly at beginners, and the content is simple.
First, let's reproduce the bug. The logic of the following code may differ a little from the code we normally write, but the point is to understand the cause of the bug through it:

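// UserService.java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class UserService {

    @Autowired
    private UserRepository userRepository;

    @Transactional
    public void updateUser(String name) {
        // First lookup: loads the record with id 1.
        User user = userRepository.findById(1).get();

        System.out.println("user name: " + user.getName());
        System.out.println("user age: " + user.getAge());
        System.out.println("name will be updated to " + name);

        // Custom update statement: changes the name column of record 1.
        userRepository.updateUserName(name, 1);

        // Second lookup of the same id.
        User user1 = userRepository.findById(1).get();

        System.out.println("user1 age: " + user1.getAge());
        System.out.println("age will be updated to " + 18);
        user1.setAge(18);
        userRepository.save(user1);
    }
}

// UserRepository.java
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Modifying;
import org.springframework.data.jpa.repository.Query;
import org.springframework.stereotype.Repository;

@Repository
public interface UserRepository extends JpaRepository<User, Integer> {

    // Custom update statement for the name column.
    @Query(value = "update User set name = ?1 where id = ?2")
    @Modifying
    void updateUserName(String name, Integer id);
}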

The intended behavior of UserService's updateUser(String name) method is: first call UserRepository.findById() to load the record with id 1 from the User table; then call UserRepository.updateUserName(String name, Integer id) to set the name column of that record to the passed-in name; then call UserRepository.findById() again to load the record after the name change and set its age column to 18. But the result after the method runs is unexpected.

Although the age of the record with id 1 in the User table is successfully changed to 18, the name is still jack. Why? Before explaining this, you need to understand a few concepts in Spring Data JPA:

First-level cache

When a record is looked up by id through a custom Repository's findById() method, the first lookup queries the database and the result is kept in memory as a cache (the persistence context of the underlying JPA provider, Hibernate in this case). Later lookups of the same record inside the same transaction return the cached object directly and do not query the database again. Other findXxx() query methods still send their SQL, but a row whose entity is already cached is returned as that cached object rather than as a fresh copy.
In the execution log of the updateUser() method above, only one select statement is executed, so the statement User user1 = userRepository.findById(1).get(); did not query the database, and user1 is actually the same object as user. userRepository.updateUserName(name, 1) changes the name value of the record with id 1 in the User table in the database, but the name attribute of the User object in the cache is still "jack".
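
The behavior described above can be imitated with a toy model in plain Java. This is only a sketch of the idea, not how Hibernate is implemented: ToySession, its selectCount field and the starting age of 20 are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class FirstLevelCacheSketch {

    // Hypothetical stand-in for the User entity.
    static class User {
        final int id;
        String name;
        int age;

        User(int id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }
    }

    // Toy persistence context: at most one cached instance per id.
    static class ToySession {
        final Map<Integer, User> database; // stands in for the User table
        final Map<Integer, User> cache = new HashMap<>();
        int selectCount = 0;

        ToySession(Map<Integer, User> database) {
            this.database = database;
        }

        User findById(int id) {
            User cached = cache.get(id);
            if (cached != null) {
                return cached; // cache hit: no select is issued
            }
            selectCount++; // cache miss: "select" from the table
            User row = database.get(id);
            User entity = new User(row.id, row.name, row.age);
            cache.put(id, entity);
            return entity;
        }

        // Models the custom update statement: it writes to the table
        // directly and never touches the cached instance.
        void updateUserName(String name, int id) {
            database.get(id).name = name;
        }
    }

    public static void main(String[] args) {
        Map<Integer, User> database = new HashMap<>();
        database.put(1, new User(1, "jack", 20)); // age 20 is an assumed starting value

        ToySession session = new ToySession(database);
        User user = session.findById(1);
        session.updateUserName("tom", 1);
        User user1 = session.findById(1);

        System.out.println("same instance: " + (user == user1));
        System.out.println("selects issued: " + session.selectCount);
        System.out.println("cached name: " + user1.name);
        System.out.println("database name: " + database.get(1).name);
    }
}
```

In this model the second findById() returns the same instance without issuing a select, so the cached name stays "jack" while the table already holds "tom".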

flush()

The flush() method synchronizes the state of all modified entities in the cache to the database. When save() is used inside a transaction to update an entity that was loaded from the database, no update statement is executed at that point; instead, the snapshot mechanism synchronizes the modified entity state from the cache to the database when the transaction is committed. If you want the modified state to reach the database before the transaction is committed, you must either call flush() manually after calling save(), or save the entity with saveAndFlush() (which in fact just calls save() and then flush()).
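
As a rough illustration of the difference between save() and saveAndFlush() on an entity that is already loaded, here is a toy model in plain Java. It is a sketch under assumptions, not Spring Data JPA's code: ToyContext, sqlLog, the SQL text and the starting values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class FlushSketch {

    // Hypothetical stand-in for the User entity.
    static class User {
        String name;
        int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Toy context that manages one entity and records the SQL it would send.
    static class ToyContext {
        final List<String> sqlLog = new ArrayList<>();
        private final User managed;
        private User snapshot; // copy of the state last known to match the database

        ToyContext(User loaded) {
            this.managed = loaded;
            this.snapshot = new User(loaded.name, loaded.age);
        }

        // For an entity that is already managed, save() sends nothing.
        User save(User user) {
            return user;
        }

        // flush() sends an update only if the entity differs from its snapshot.
        void flush() {
            boolean dirty = !managed.name.equals(snapshot.name) || managed.age != snapshot.age;
            if (dirty) {
                sqlLog.add("update user set name=?, age=? where id=?");
                snapshot = new User(managed.name, managed.age);
            }
        }

        // Same as save() followed by flush().
        User saveAndFlush(User user) {
            User saved = save(user);
            flush();
            return saved;
        }
    }

    public static void main(String[] args) {
        User user = new User("jack", 20); // assumed starting values
        ToyContext context = new ToyContext(user);

        user.age = 18;
        context.save(user);
        System.out.println("statements after save(): " + context.sqlLog.size());

        context.saveAndFlush(user);
        System.out.println("statements after saveAndFlush(): " + context.sqlLog.size());
    }
}
```

In the model, save() leaves the statement log empty and saveAndFlush() adds exactly one update; a further flush() adds nothing because the snapshot has caught up.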

Snapshot

In addition to the first-level cache, Spring Data JPA also has a snapshot area. When a query result is placed in the first-level cache, a copy of the data is stored in the snapshot area at the same time. Spring Data JPA compares the data in the cache with the snapshot to determine whether an entity has been modified since it was loaded from the database.
In the example above, when User user = userRepository.findById(1).get(); is executed, a User instance is stored in both the cache and the snapshot area, with identical state (name "jack" and the original age).
After user1.setAge(18); is executed, the User instance in the cache has age 18, while the copy in the snapshot area still holds the original age.

When the transaction is committed, Hibernate flushes the first-level cache to keep the database in sync: it pairs each object in the first-level cache with its snapshot by primary key and compares their properties. If any property differs, it executes an update statement to synchronize the cached content to the database and updates the snapshot; if they are identical, no update statement is executed. By default the generated update statement sets every mapped column, not only the changed one. So in the log of the code above, when updateUser() finishes (the method is annotated with @Transactional, so the transaction is committed when it returns), an update statement is printed that writes the attribute values of the cached user (name "jack", age 18) to the database record, which overwrites the effect of userRepository.updateUserName(name, 1).
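
The overwrite can be reproduced with a small plain-Java model of the cache and the snapshot area. This is a sketch under assumptions, not Hibernate's implementation: ToyContext, commit(), runBuggyScenario() and the starting age of 20 are hypothetical, and the model writes the whole row on update to mirror the default all-columns update.

```java
import java.util.HashMap;
import java.util.Map;

class SnapshotSketch {

    // Hypothetical stand-in for the User entity.
    static class User {
        final int id;
        String name;
        int age;

        User(int id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        User copy() {
            return new User(id, name, age);
        }

        boolean sameState(User other) {
            return name.equals(other.name) && age == other.age;
        }
    }

    static class ToyContext {
        final Map<Integer, User> database; // stands in for the User table
        final Map<Integer, User> cache = new HashMap<>();
        final Map<Integer, User> snapshots = new HashMap<>();
        int updateCount = 0;

        ToyContext(Map<Integer, User> database) {
            this.database = database;
        }

        User findById(int id) {
            User cached = cache.get(id);
            if (cached != null) {
                return cached;
            }
            User entity = database.get(id).copy();
            cache.put(id, entity);
            snapshots.put(id, entity.copy()); // snapshot taken at load time
            return entity;
        }

        // Custom update statement: bypasses cache and snapshot.
        void updateUserName(String name, int id) {
            database.get(id).name = name;
        }

        // At commit, every cached entity is compared with its snapshot.
        void commit() {
            for (User entity : cache.values()) {
                if (!entity.sameState(snapshots.get(entity.id))) {
                    database.put(entity.id, entity.copy()); // whole row is written
                    snapshots.put(entity.id, entity.copy());
                    updateCount++;
                }
            }
        }
    }

    static User runBuggyScenario() {
        Map<Integer, User> database = new HashMap<>();
        database.put(1, new User(1, "jack", 20)); // age 20 is an assumed starting value
        ToyContext context = new ToyContext(database);

        context.findById(1);
        context.updateUserName("tom", 1);
        User user1 = context.findById(1); // served from the cache, name is still "jack"
        user1.age = 18;
        context.commit();
        return database.get(1);
    }

    public static void main(String[] args) {
        User row = runBuggyScenario();
        System.out.println(row.name + ", " + row.age);
    }
}
```

Running the scenario leaves the row as jack, 18: the commit-time comparison sees only the age change, but the write carries the stale cached name along with it.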

If you change the updateUser() method of UserService to:

    @Transactional
    public void updateUser(String name) {
        User user = userRepository.findById(1).get();

        System.out.println("user name: " + user.getName());
        System.out.println("user age: " + user.getAge());
        System.out.println("name will be updated to " + name);

        // Change the name on the cached entity instead of using the custom update statement.
        user.setName(name);
        userRepository.save(user);

        User user1 = userRepository.findById(1).get();

        System.out.println("user1 age: " + user1.getAge());
        System.out.println("age will be updated to " + 18);
        user1.setAge(18);
        userRepository.save(user1);
    }

Both name and age are now successfully changed, to "tom" and 18. This is because we set the name of the user object loaded by the first query to "tom", and user1 and user are the same object, so after user1.setAge(18) is executed the name and age of user1 are "tom" and 18 respectively. The execution log also shows that only one update statement is printed, when the method finishes. This shows that save() does not itself issue the update here; instead, JPA uses the snapshot mechanism to update the database when the transaction is committed. (You can remove both save() calls and the result is still the same.)

If we change userRepository.save(user); in the code above to userRepository.saveAndFlush(user);, the log shows that an update statement is printed right after userRepository.saveAndFlush(user); is executed.
Note, however, that even if saveAndFlush() is used to synchronize the modified entity to the database, if an exception occurs before the transaction is committed and the transaction is rolled back, the data in the database is rolled back as well.

Now that we know where the problem is, how do we solve it?

Solution: set the clearAutomatically attribute of the @Modifying annotation on the custom update statement to true.

Setting the clearAutomatically attribute to true means that the first-level cache is cleared after the custom update statement is executed. When User user1 = userRepository.findById(1).get(); is then executed, it has to query the database again, which ensures that the name value in the object user1 is the updated value "tom". The code is as follows:

@Service
public class UserService {

    @Autowired
    private UserRepository userRepository;

    @Transactional
    public void updateUser(String name) {
        User user = userRepository.findById(1).get();

        System.out.println("user name: " + user.getName());
        System.out.println("user age: " + user.getAge());
        System.out.println("name will be updated to " + name);
        userRepository.updateUserName(name, 1);
        // The cache was cleared after the update, so this lookup queries the database again.
        User user1 = userRepository.findById(1).get();

        System.out.println("user1 age: " + user1.getAge());
        System.out.println("age will be updated to " + 18);
        user1.setAge(18);
    }
}

@Repository
public interface UserRepository extends JpaRepository<User, Integer> {

    // clearAutomatically = true clears the first-level cache after the statement runs.
    @Query(value = "update User set name = ?1 where id = ?2")
    @Modifying(clearAutomatically = true)
    void updateUserName(String name, Integer id);
}
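
To see why clearing the cache is enough, the fix can be imitated with a toy model in plain Java. It is only a sketch, not the real mechanism behind @Modifying: ToyContext, the boolean parameter on updateUserName(), run() and the starting age of 20 are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class ClearAutomaticallySketch {

    // Hypothetical stand-in for the User entity.
    static class User {
        final int id;
        String name;
        int age;

        User(int id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        User copy() {
            return new User(id, name, age);
        }

        boolean sameState(User other) {
            return name.equals(other.name) && age == other.age;
        }
    }

    static class ToyContext {
        final Map<Integer, User> database; // stands in for the User table
        final Map<Integer, User> cache = new HashMap<>();
        final Map<Integer, User> snapshots = new HashMap<>();

        ToyContext(Map<Integer, User> database) {
            this.database = database;
        }

        User findById(int id) {
            User cached = cache.get(id);
            if (cached != null) {
                return cached;
            }
            User entity = database.get(id).copy();
            cache.put(id, entity);
            snapshots.put(id, entity.copy());
            return entity;
        }

        // Custom update; optionally empties cache and snapshots afterwards.
        void updateUserName(String name, int id, boolean clearAutomatically) {
            database.get(id).name = name;
            if (clearAutomatically) {
                cache.clear();
                snapshots.clear();
            }
        }

        // Commit-time comparison of cached entities against their snapshots.
        void commit() {
            for (User entity : cache.values()) {
                if (!entity.sameState(snapshots.get(entity.id))) {
                    database.put(entity.id, entity.copy());
                    snapshots.put(entity.id, entity.copy());
                }
            }
        }
    }

    static User run(boolean clearAutomatically) {
        Map<Integer, User> database = new HashMap<>();
        database.put(1, new User(1, "jack", 20)); // age 20 is an assumed starting value
        ToyContext context = new ToyContext(database);

        context.findById(1);
        context.updateUserName("tom", 1, clearAutomatically);
        User user1 = context.findById(1); // reloaded only if the cache was cleared
        user1.age = 18;
        context.commit();
        return database.get(1);
    }

    public static void main(String[] args) {
        User without = run(false);
        User cleared = run(true);
        System.out.println("without clearing: " + without.name + ", " + without.age);
        System.out.println("with clearing: " + cleared.name + ", " + cleared.age);
    }
}
```

Without clearing, the model ends with jack, 18; with clearing, the second lookup reloads "tom" and the commit writes tom, 18.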


Summary
When using Spring Data JPA for update operations, pay attention to how transactions are used and to data synchronization between the cache and the database.
When a custom Repository interface extends JpaRepository<T, ID>, calling its save() method actually runs the save() method of SimpleJpaRepository<T, ID>, the implementation class of JpaRepository<T, ID>. That method is annotated with @Transactional, whose propagation behavior is the default value Propagation.REQUIRED. So when a service-layer method calls save() to store the state of a modified entity object in the database, there are two cases:

If the service method does not open a transaction with @Transactional, save() starts its own transaction. When save() finishes, that transaction is committed and the modified object state is synchronized to the database through an SQL statement.
If the service method opens a transaction with @Transactional, save() joins it and does not write the data to the database when it is called (no SQL statement is executed); the data is synchronized through the snapshot mechanism when the transaction is committed.

The flush() method synchronizes the data in the cache to the database. In JPA, when a transaction is committed, JPA automatically calls flush() to synchronize the data to the database and then clears the cache. Note, however, that before a custom statement is executed, if the data in the cache differs from the data in the snapshot, JPA automatically calls flush() first: it does not know what the statement will touch, so it synchronizes the cache to the database to keep the two consistent. Take the following code as an example:

@Service
public class UserService {

    @Autowired
    private UserRepository userRepository;

    @Transactional
    public void updateUser(String name) {
        User user = userRepository.findById(1).get();
        System.out.println("user age is " + user.getAge());
        // The cached entity now differs from its snapshot.
        user.setAge(18);
        System.out.println("user age change to 18");
        // The pending age change is flushed before this statement runs.
        userRepository.updateUserName(name, 1);
    }
}

@Repository
public interface UserRepository extends JpaRepository<User, Integer> {

    @Query(value = "update User set name = ?1 where id = ?2")
    @Modifying
    void updateUserName(String name, Integer id);
}

The execution log shows that before userRepository.updateUserName(name, 1); is executed, Spring Data JPA calls flush() to write the cached object to the database, because the age value of the user object in the cache has been modified.
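
The ordering of the two statements can be imitated with a toy model in plain Java. It is a sketch under assumptions, not the real flush logic: ToyContext, the sqlLog labels, runScenario() and the starting age of 20 are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AutoFlushSketch {

    // Hypothetical stand-in for the User entity.
    static class User {
        final int id;
        String name;
        int age;

        User(int id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        User copy() {
            return new User(id, name, age);
        }

        boolean sameState(User other) {
            return name.equals(other.name) && age == other.age;
        }
    }

    static class ToyContext {
        final Map<Integer, User> database; // stands in for the User table
        final Map<Integer, User> cache = new HashMap<>();
        final Map<Integer, User> snapshots = new HashMap<>();
        final List<String> sqlLog = new ArrayList<>();

        ToyContext(Map<Integer, User> database) {
            this.database = database;
        }

        User findById(int id) {
            User cached = cache.get(id);
            if (cached != null) {
                return cached;
            }
            sqlLog.add("select");
            User entity = database.get(id).copy();
            cache.put(id, entity);
            snapshots.put(id, entity.copy());
            return entity;
        }

        // Writes every cached entity that differs from its snapshot.
        void flush() {
            for (User entity : cache.values()) {
                if (!entity.sameState(snapshots.get(entity.id))) {
                    database.put(entity.id, entity.copy());
                    snapshots.put(entity.id, entity.copy());
                    sqlLog.add("entity update");
                }
            }
        }

        // Pending entity changes are flushed before the custom statement runs.
        void updateUserName(String name, int id) {
            flush();
            database.get(id).name = name;
            sqlLog.add("custom update");
        }

        void commit() {
            flush();
        }
    }

    static ToyContext runScenario() {
        Map<Integer, User> database = new HashMap<>();
        database.put(1, new User(1, "jack", 20)); // age 20 is an assumed starting value
        ToyContext context = new ToyContext(database);

        User user = context.findById(1);
        user.age = 18;
        context.updateUserName("tom", 1);
        context.commit();
        return context;
    }

    public static void main(String[] args) {
        ToyContext context = runScenario();
        System.out.println(context.sqlLog);
        User row = context.database.get(1);
        System.out.println(row.name + ", " + row.age);
    }
}
```

In the model the log reads select, entity update, custom update, and the row ends as tom, 18: the flush refreshed the snapshot, so the commit finds nothing left to write and the new name survives.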

Spring Data JPA wraps up many practical methods for programmers, which makes data access layer code easy to write. But sometimes the framework doing so much for us becomes a disadvantage: when we do not understand its mechanisms, we use the methods it provides incorrectly, which leads to bugs that are sometimes hard to find by debugging. So when using a framework, you should have a proper understanding of how it works.

PS: I haven't written a blog for a long time. It took a day to write this blog. Sure enough, writing this kind of thing takes time to practice. I blame myself for being too lazy.
