MyBatis08 - "General Source Code Guide: Detailed Explanation of MyBatis Source Code" notes - JDBC and cache packages

This series of articles is my notes and summary of the book "General Source Code Guide: Detailed Explanation of MyBatis Source Code". The book is based on MyBatis 3.5.2, and its author is Brother Yi. The link points to Brother Yi's blog on CSDN. Among all the articles I have read, only one briefly introduced this book, and it did not reveal much of the book's appeal, so I will record my own learning summary here. If the author believes that I have infringed the copyright, please contact me and I will delete it. Thanks again to Brother Yi for providing the learning materials. This statement accompanies every article in the series. Respect originality. I have purchased the revised edition of the book on WeChat Reading.
Copyright statement: This article is an original article by CSDN blogger "Architect Yi Ge" and follows the CC 4.0 BY-SA copyright agreement. Please attach the original source link and this statement when reprinting.
Original link: https://blog.csdn.net/onlinedct/article/details/107306041

1. jdbc package

The jdbc package is a highly independent package in MyBatis. It provides the ability to execute database statements and to run SQL scripts.
The jdbc package looks very simple: apart from the SelectBuilder and SqlBuilder classes, which are due to be abandoned, only six classes remain. Even so, the source code of the package contains many points worth pondering.
We first pose the following two questions and then carry them through the source code analysis.

  • Many method names in the AbstractSQL class are written in uppercase. For example, by Java naming convention the method UPDATE would be named update. Why is this?
  • None of the classes in the jdbc package is referenced from outside the package, so what is the purpose of this package?

1.1 AbstractSQL class and SQL class

The AbstractSQL class is an abstract class, which contains an abstract method getSelf. The SQL class implements this abstract method as a subclass of the AbstractSQL class.
The AbstractSQL class contains two static inner classes: SafeAppendable class and SQLStatement class.

  1. SafeAppendable is an appender whose append method concatenates strings. Its implementation is relatively simple.
  // Appendable interface: StringBuilder, StringBuffer and others implement it; it marks something that characters can be appended to.
  // A safe appender: safe because outside code cannot reach it, while AbstractSQL uses it internally by instantiating it.
  private static class SafeAppendable {
    // The main string
    private final Appendable a;
    // Whether the main string is empty
    private boolean empty = true;

    public SafeAppendable(Appendable a) {
      super();
      this.a = a;
    }

    /**
     * Appends a string to the main string
     * @param s the string to append
     * @return this SafeAppendable instance
     */
    public SafeAppendable append(CharSequence s) {
      try {
        // If the appended string is not empty, the main string is no longer empty afterwards
        if (empty && s.length() > 0) {
          empty = false;
        }
        // Append
        a.append(s);
      } catch (IOException e) {
        throw new RuntimeException(e);
      }
      return this;
    }

    /**
     * Tells whether the main string is currently empty
     * @return whether the main string is currently empty
     */
    public boolean isEmpty() {
      return empty;
    }
  }
  2. The SQLStatement inner class can fully describe a SQL statement. Its fields hold all the fragments a SQL statement needs.
    // Statement type of the current statement
    StatementType statementType;

    // Statement fragments
    List<String> sets = new ArrayList<>();
    List<String> select = new ArrayList<>();
    List<String> tables = new ArrayList<>();
    List<String> join = new ArrayList<>();
    List<String> innerJoin = new ArrayList<>();
    List<String> outerJoin = new ArrayList<>();
    List<String> leftOuterJoin = new ArrayList<>();
    List<String> rightOuterJoin = new ArrayList<>();
    List<String> where = new ArrayList<>();
    List<String> having = new ArrayList<>();
    List<String> groupBy = new ArrayList<>();
    List<String> orderBy = new ArrayList<>();
    List<String> lastList = new ArrayList<>();
    List<String> columns = new ArrayList<>();
    List<List<String>> valuesList = new ArrayList<>();
    // Whether to deduplicate; only meaningful for SELECT, where it chooses between SELECT and SELECT DISTINCT
    boolean distinct;
    // Result offset
    String offset;
    // Limit on the number of results
    String limit;
    // Strategy for limiting rows
    LimitingRowsStrategy limitingRowsStrategy = LimitingRowsStrategy.NOP;

SQLStatement also has a sql method, which calls the sub-method matching the statement type to splice the fragments into a complete SQL statement.

    /**
     * Calls the splicing method that matches the statement type to build the SQL statement
     * @param a the Appendable to write to
     * @return the spliced result
     */
    public String sql(Appendable a) {
      SafeAppendable builder = new SafeAppendable(a);
      if (statementType == null) {
        return null;
      }
      String answer;
      switch (statementType) {
        case DELETE:
          answer = deleteSQL(builder);
          break;
        case INSERT:
          answer = insertSQL(builder);
          break;
        case SELECT:
          answer = selectSQL(builder);
          break;
        case UPDATE:
          answer = updateSQL(builder);
          break;
        default:
          answer = null;
      }
      return answer;
    }
    // Splicing of a SELECT statement
    // The layout of a SELECT statement (and of the other statements) is fixed, so given the list for each keyword the splicing can be done in order

    /**
     * Splices the SQL fragments into a complete SELECT statement
     * @param builder the statement appender
     * @return the spliced SQL statement string
     */
    private String selectSQL(SafeAppendable builder) {
      if (distinct) {
        sqlClause(builder, "SELECT DISTINCT", select, "", "", ", ");
      } else {
        sqlClause(builder, "SELECT", select, "", "", ", ");
      }

      sqlClause(builder, "FROM", tables, "", "", ", ");
      joins(builder); // JOIN is more involved, so it is handled by the separate joins method
      sqlClause(builder, "WHERE", where, "(", ")", " AND ");
      sqlClause(builder, "GROUP BY", groupBy, "", "", ", ");
      sqlClause(builder, "HAVING", having, "(", ")", " AND ");
      sqlClause(builder, "ORDER BY", orderBy, "", "", ", ");
      limitingRowsStrategy.appendClause(builder, offset, limit);
      return builder.toString();
    }
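
The sqlClause helper called above is not shown in these notes. As a rough illustration of what such a helper has to do, here is a minimal sketch. It is an assumption-based simplification, not the MyBatis source: the class name ClauseJoiner is hypothetical, a plain StringBuilder replaces SafeAppendable, and special handling of AND/OR markers is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for the clause-joining helper; not the MyBatis source.
class ClauseJoiner {

    // Appends "keyword open part1 conjunction part2 ... close" when parts is non-empty.
    static void sqlClause(StringBuilder builder, String keyword, List<String> parts,
                          String open, String close, String conjunction) {
        if (parts.isEmpty()) {
            return; // an unused clause contributes nothing
        }
        if (builder.length() > 0) {
            builder.append("\n"); // every clause after the first starts on a new line
        }
        builder.append(keyword).append(" ").append(open);
        builder.append(String.join(conjunction, parts));
        builder.append(close);
    }

    // Builds a small SELECT statement; the empty ORDER BY list is skipped entirely.
    static String demo() {
        StringBuilder builder = new StringBuilder();
        sqlClause(builder, "SELECT", Arrays.asList("id", "name"), "", "", ", ");
        sqlClause(builder, "FROM", Arrays.asList("user"), "", "", ", ");
        sqlClause(builder, "WHERE", Arrays.asList("age > 18", "city = 'X'"), "(", ")", " AND ");
        sqlClause(builder, "ORDER BY", new ArrayList<String>(), "", "", ", ");
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The open, close and conjunction arguments are what let one helper serve both comma-separated clauses such as SELECT and parenthesized, AND-joined clauses such as WHERE.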
  3. AbstractSQL class
    With the two inner classes SQLStatement and SafeAppendable, the outer class AbstractSQL can splice SQL statements without relying on any other class.
public String queryUsersBySchoolName() {
    return new SQL()
            .SELECT("*")
            .FROM("user")
            .WHERE("schoolName = #{schoolName}")
            .toString();
}

Here the SQL class, a subclass of AbstractSQL, performs the work of splicing the string fragments into a SQL statement.

This answers the first question raised at the beginning of this chapter: the AbstractSQL class has many methods named in all capital letters, such as UPDATE and SET, to match users' habits, because we usually capitalize keywords such as UPDATE and SET when writing SQL statements.

With the structure of the AbstractSQL class in mind, we can analyze how it is used.

  • First, the user sets SQL fragments with calls such as SELECT("*").FROM("user").WHERE("schoolName=#{schoolName}"). These fragments are saved in the ArrayList fields of the SQLStatement inner class of AbstractSQL.
  • When the user calls toString(), the splicing of the fragments is triggered. Inside SQLStatement, they are spliced into a complete SQL statement according to fixed rules.

The template followed by each splicing method is fixed.

    /**
     * Splices the SQL fragments into a complete INSERT statement
     * @param builder the statement appender
     * @return the spliced SQL statement string
     */
    private String insertSQL(SafeAppendable builder) {
      sqlClause(builder, "INSERT INTO", tables, "", "", "");
      sqlClause(builder, "", columns, "(", ")", ", ");
      for (int i = 0; i < valuesList.size(); i++) {
        sqlClause(builder, i > 0 ? "," : "VALUES", valuesList.get(i), "(", ")", ", ");
      }
      return builder.toString();
    }

Does this mean that users can call the clause methods in any order when constructing a SQL statement? Indeed it does. Each fragment is stored in the list for its own clause and the clauses are emitted in a fixed order, so calling WHERE before FROM, for example, produces the same statement.

public String queryUsersBySchoolName() {
    return new SQL()
            .SELECT("*")
            .WHERE("schoolName = #{schoolName}")
            .FROM("user")
            .toString();
}
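
To see why the call order does not matter, here is a minimal sketch of a fragment-collecting builder. The class MiniSql is hypothetical and models only three clauses; it assumes, as described above, that each call merely stores its fragment and that toString emits the clauses in a fixed order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of a fragment-collecting builder; only three clauses are modelled.
class MiniSql {
    private final List<String> select = new ArrayList<>();
    private final List<String> tables = new ArrayList<>();
    private final List<String> where = new ArrayList<>();

    // Each clause method only records its fragment and returns the builder for chaining.
    MiniSql SELECT(String column) { select.add(column); return this; }
    MiniSql FROM(String table) { tables.add(table); return this; }
    MiniSql WHERE(String condition) { where.add(condition); return this; }

    // The clauses are always emitted as SELECT, FROM, WHERE, whatever the call order was.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("SELECT ").append(String.join(", ", select));
        sb.append(" FROM ").append(String.join(", ", tables));
        if (!where.isEmpty()) {
            sb.append(" WHERE (").append(String.join(" AND ", where)).append(")");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(new MiniSql().SELECT("*").WHERE("a = 1").FROM("user"));
    }
}
```

Note that the order of calls within the same clause still matters in this sketch: two WHERE calls are joined in the order in which they were made.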
  4. The SQL class is a subclass of AbstractSQL and overrides only the getSelf method.
  @Override
  public SQL getSelf() {
    return this;
  }

So why does AbstractSQL keep an abstract method and then rely on a SQL class to implement it? What is the point?
Making AbstractSQL an abstract class lets us inherit from it and implement other subclasses, which keeps AbstractSQL easy to extend.
For example, we could create an ExplainSQL class that inherits from AbstractSQL and adds a behavior of its own, such as putting an EXPLAIN prefix before every statement to analyze SQL performance.
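
The ExplainSQL idea can be sketched without MyBatis. In the sketch below every name (ExplainSqlDemo, MiniAbstractSql, ExplainSql) is hypothetical; it only illustrates how an abstract getSelf method lets chained calls keep the subclass type, and how a subclass can then add its own behavior.

```java
import java.util.ArrayList;
import java.util.List;

class ExplainSqlDemo {

    // Hypothetical miniature of a self-typed builder base class.
    public abstract static class MiniAbstractSql<T> {
        private final List<String> select = new ArrayList<>();
        private final List<String> tables = new ArrayList<>();

        // Subclasses return themselves, so chained calls keep the subclass type.
        public abstract T getSelf();

        public T SELECT(String column) { select.add(column); return getSelf(); }
        public T FROM(String table) { tables.add(table); return getSelf(); }

        @Override
        public String toString() {
            return "SELECT " + String.join(", ", select) + " FROM " + String.join(", ", tables);
        }
    }

    // Hypothetical subclass that prefixes every statement with EXPLAIN.
    public static class ExplainSql extends MiniAbstractSql<ExplainSql> {
        @Override
        public ExplainSql getSelf() { return this; }

        @Override
        public String toString() { return "EXPLAIN " + super.toString(); }
    }

    public static void main(String[] args) {
        System.out.println(new ExplainSql().SELECT("*").FROM("user"));
    }
}
```

Because SELECT and FROM return T rather than the base type, the chain stays an ExplainSql all the way through, which is the role getSelf plays in the design described above.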

1.2 SqlRunner

The SqlRunner class is a utility class provided by MyBatis that executes SQL statements directly.

// Using the SqlRunner class
String sql = "SELECT * FROM user WHERE age = ?;";
SqlRunner sqlRunner = new SqlRunner(connection);
List<Map<String, Object>> result = sqlRunner.selectAll(sql,15);
System.out.println(result);

// Using the SqlRunner class when the value of the email parameter is null
sql = "UPDATE user SET email = ?  WHERE id = 2;";
Integer out = sqlRunner.update(sql,Null.STRING);
System.out.println(out);

One thing to note when using SqlRunner: if a parameter is null, you must pass a value of the Null enumeration instead. This is because each Null value carries the parameter's type information and its type handler.

public enum Null {
  BOOLEAN(new BooleanTypeHandler(), JdbcType.BOOLEAN),
  BYTE(new ByteTypeHandler(), JdbcType.TINYINT),
  SHORT(new ShortTypeHandler(), JdbcType.SMALLINT),
  INTEGER(new IntegerTypeHandler(), JdbcType.INTEGER),
  LONG(new LongTypeHandler(), JdbcType.BIGINT),
  FLOAT(new FloatTypeHandler(), JdbcType.FLOAT),
  DOUBLE(new DoubleTypeHandler(), JdbcType.DOUBLE),
  BIGDECIMAL(new BigDecimalTypeHandler(), JdbcType.DECIMAL),
  STRING(new StringTypeHandler(), JdbcType.VARCHAR),
  CLOB(new ClobTypeHandler(), JdbcType.CLOB),
  LONGVARCHAR(new ClobTypeHandler(), JdbcType.LONGVARCHAR),
  BYTEARRAY(new ByteArrayTypeHandler(), JdbcType.LONGVARBINARY),
  BLOB(new BlobTypeHandler(), JdbcType.BLOB),
  LONGVARBINARY(new BlobTypeHandler(), JdbcType.LONGVARBINARY),
  OBJECT(new ObjectTypeHandler(), JdbcType.OTHER),
  OTHER(new ObjectTypeHandler(), JdbcType.OTHER),
  TIMESTAMP(new DateTypeHandler(), JdbcType.TIMESTAMP),
  DATE(new DateOnlyTypeHandler(), JdbcType.DATE),
  TIME(new TimeOnlyTypeHandler(), JdbcType.TIME),
  SQLTIMESTAMP(new SqlTimestampTypeHandler(), JdbcType.TIMESTAMP),
  SQLDATE(new SqlDateTypeHandler(), JdbcType.DATE),
  SQLTIME(new SqlTimeTypeHandler(), JdbcType.TIME);
  // Type handler for the parameter
  private TypeHandler<?> typeHandler;
  // JDBC type of the parameter
  private JdbcType jdbcType;

Using Null enumeration values as parameters ensures that the parameter's type is known even though its value is null. The setNull method of PreparedStatement requires that type. When SqlRunner assigns a null value to a parameter, it ultimately calls the following setNull method:

void setNull(int parameterIndex, int sqlType) throws SQLException;
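
The idea of a typed null can be sketched with plain JDBC types. Everything named here (TypedNullDemo, TypedNull, bind, recordCalls) is hypothetical and is not the SqlRunner implementation. A java.lang.reflect.Proxy stands in for a real PreparedStatement so the sketch runs without a database and simply records which setter was called.

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TypedNullDemo {

    // Hypothetical marker enum: each constant stands for "null of this JDBC type".
    enum TypedNull {
        STRING(Types.VARCHAR), INTEGER(Types.INTEGER);

        final int sqlType;

        TypedNull(int sqlType) { this.sqlType = sqlType; }
    }

    // Binds positional arguments; a marker becomes setNull with an explicit SQL type.
    static void bind(PreparedStatement ps, Object... args) throws SQLException {
        for (int i = 0; i < args.length; i++) {
            if (args[i] == null) {
                throw new SQLException("a bare null has no JDBC type; pass a TypedNull marker");
            } else if (args[i] instanceof TypedNull) {
                ps.setNull(i + 1, ((TypedNull) args[i]).sqlType);
            } else {
                ps.setObject(i + 1, args[i]);
            }
        }
    }

    // Runs bind against a recording stand-in for PreparedStatement and returns the calls made.
    static List<String> recordCalls(Object... args) {
        final List<String> calls = new ArrayList<>();
        PreparedStatement ps = (PreparedStatement) Proxy.newProxyInstance(
                TypedNullDemo.class.getClassLoader(),
                new Class<?>[] {PreparedStatement.class},
                (proxy, method, methodArgs) -> {
                    calls.add(method.getName() + Arrays.toString(methodArgs));
                    return null; // both setters used here return void
                });
        try {
            bind(ps, args);
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        // Types.VARCHAR is the constant 12, so this prints [setNull[1, 12], setObject[2, 2]]
        System.out.println(recordCalls(TypedNull.STRING, 2));
    }
}
```

The marker carries exactly the piece of information that a bare null lacks: the sqlType argument that setNull requires.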

The related methods in SqlRunner are relatively simple:

  /**
   * Executes a query that returns multiple rows, that is, a SELECT operation
   * @param sql the SQL statement to execute
   * @param args the parameters of the SQL statement
   * @return the query results
   * @throws SQLException
   */
  public List<Map<String, Object>> selectAll(String sql, Object... args) throws SQLException {
    PreparedStatement ps = connection.prepareStatement(sql);
    try {
      setParameters(ps, args);
      ResultSet rs = ps.executeQuery();
      return getResults(rs);
    } finally {
      try {
        ps.close();
      } catch (SQLException e) {
        //ignore
      }
    }
  }

After obtaining the query results, SqlRunner processes them further with the getResults method. This method extracts the rows returned by the database and returns them as a list.

  /**
   * Processes the result returned by a database operation
   * @param rs the returned result set
   * @return the processed list of rows
   * @throws SQLException
   */
  private List<Map<String, Object>> getResults(ResultSet rs) throws SQLException {
    try {
      List<Map<String, Object>> list = new ArrayList<>();
      // Column names of the result, in column order
      List<String> columns = new ArrayList<>();
      // Type handlers of the result, in column order
      List<TypeHandler<?>> typeHandlers = new ArrayList<>();
      // Get the metadata of the result, such as table and column information
      ResultSetMetaData rsmd = rs.getMetaData();
      for (int i = 0, n = rsmd.getColumnCount(); i < n; i++) {
        // Record the column name
        columns.add(rsmd.getColumnLabel(i + 1));
        // Record the type handler for the column
        try {
          Class<?> type = Resources.classForName(rsmd.getColumnClassName(i + 1));
          TypeHandler<?> typeHandler = typeHandlerRegistry.getTypeHandler(type);
          if (typeHandler == null) {
            typeHandler = typeHandlerRegistry.getTypeHandler(Object.class);
          }
          typeHandlers.add(typeHandler);
        } catch (Exception e) {
          // Fall back to the Object type handler
          typeHandlers.add(typeHandlerRegistry.getTypeHandler(Object.class));
        }
      }
      // Iterate over the rows
      while (rs.next()) {
        Map<String, Object> row = new HashMap<>();
        for (int i = 0, n = columns.size(); i < n; i++) {
          // Column name
          String name = columns.get(i);
          // Matching type handler
          TypeHandler<?> handler = typeHandlers.get(i);
          // Put into the row: the key is the upper-cased column name, the value is the value read
          row.put(name.toUpperCase(Locale.ENGLISH), handler.getResult(rs, name));
        }
        list.add(row);
      }
      return list;
    } finally {
      if (rs != null) {
        try {
          rs.close();
        } catch (Exception e) {
          // ignore
        }
      }
    }
  }

The SqlRunner class accepts SQL statements and parameters and then performs the database operation. However, it cannot handle more complex work such as mapping objects to SQL parameters or mapping SQL results to objects.

1.3 ScriptRunner class

ScriptRunner is a utility class provided by MyBatis for executing SQL scripts directly, which lets developers hand an entire script file to MyBatis for execution. For example:

// Using the ScriptRunner class
ScriptRunner scriptRunner = new ScriptRunner(connection);
scriptRunner.runScript(Resources.getResourceAsReader("demoScript.sql"));

This code executes all the SQL in demoScript.sql.

ScriptRunner handles SQL scripts and does not have to bind parameters, which makes it simpler than SqlRunner. It provides two modes, full-script execution and line-by-line execution:

  /**
   * Runs a script
   * @param reader the script
   */
  public void runScript(Reader reader) {
    // Apply the configured auto-commit setting
    setAutoCommit();
    try {
      if (sendFullScript) {
        // Full-script execution
        executeFullScript(reader);
      } else {
        // Line-by-line execution
        executeLineByLine(reader);
      }
    } finally {
      rollbackConnection();
    }
  }

  /**
   * Full-script execution
   * @param reader the script
   */
  private void executeFullScript(Reader reader) {
    // Full text of the script
    StringBuilder script = new StringBuilder();
    try {
      BufferedReader lineReader = new BufferedReader(reader);
      String line;
      while ((line = lineReader.readLine()) != null) {
        // Read the script in line by line
        script.append(line);
        script.append(LINE_SEPARATOR);
      }
      // Join everything into a single command
      String command = script.toString();
      println(command);
      // Execute the command
      executeStatement(command);
      // Commit if auto-commit is not enabled (the script may have changed the auto-commit setting)
      commitConnection();
    } catch (Exception e) {
      String message = "Error executing: " + script + ".  Cause: " + e;
      printlnError(message);
      throw new RuntimeSqlException(message, e);
    }
  }

  /**
   * Line-by-line execution
   * @param reader the script
   */
  private void executeLineByLine(Reader reader) {
    StringBuilder command = new StringBuilder();
    try {
      BufferedReader lineReader = new BufferedReader(reader);
      String line;
      // Handle the lines one after another
      while ((line = lineReader.readLine()) != null) {
        handleLine(command, line);
      }
      // Commit
      commitConnection();
      // Check for leftover lines that were never terminated
      checkForMissingLineTerminator(command);
    } catch (Exception e) {
      String message = "Error executing: " + command + ".  Cause: " + e;
      printlnError(message);
      throw new RuntimeSqlException(message, e);
    }
  }
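
The handleLine and checkForMissingLineTerminator methods are not shown in these notes. The sketch below illustrates the general idea of line-by-line mode under simplifying assumptions: the class ScriptSplitter is hypothetical, the delimiter is fixed to a semicolon at the end of a line, and only whole-line comments are recognized. It collects commands instead of executing them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified line-by-line splitter; it collects commands instead of running them.
class ScriptSplitter {

    static List<String> split(String script) {
        List<String> commands = new ArrayList<>();
        StringBuilder command = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(script))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty() || trimmed.startsWith("--") || trimmed.startsWith("//")) {
                    continue; // blank lines and comment lines are not part of a command
                }
                if (trimmed.endsWith(";")) {
                    // The delimiter closes the command accumulated so far.
                    command.append(trimmed, 0, trimmed.length() - 1);
                    commands.add(command.toString().trim());
                    command.setLength(0);
                } else {
                    // No delimiter yet: keep accumulating across lines.
                    command.append(trimmed).append(' ');
                }
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        if (command.toString().trim().length() > 0) {
            // Leftover text means the last command was never terminated.
            throw new IllegalStateException("unterminated command: " + command);
        }
        return commands;
    }

    public static void main(String[] args) {
        System.out.println(split("-- setup\nCREATE TABLE t (\n  id INT\n);\nINSERT INTO t VALUES (1);\n"));
    }
}
```

A command that spans several lines is accumulated until its delimiter arrives, which is why a single StringBuilder is threaded through every line, as the command variable is in executeLineByLine.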

1.4 Independence of the jdbc package

One question remains: none of the classes in the jdbc package is referenced from outside the package, so what is the purpose of this package?

The answer is that the jdbc package is a functionally independent toolkit that MyBatis provides for users, rather than something MyBatis calls itself.
For example, users can either splice SQL statements by hand or use the tools of the jdbc package to do so.

// Splicing the SQL statement by hand
public String queryUsersBySchoolName() {
    return "SELECT * FROM user WHERE schoolName = #{schoolName}";
}
// Splicing the SQL statement with the tools of the jdbc package
public String queryUsersBySchoolName() {
    return new SQL()
            .SELECT("*")
            .FROM("user")
            .WHERE("schoolName = #{schoolName}")
            .toString();
}

The SqlRunner and ScriptRunner classes give users the ability to execute SQL statements and scripts. Sometimes we need to perform setup operations on the database, such as running DDL statements. In those cases the ORM features of MyBatis are not needed, so SqlRunner and ScriptRunner are very good choices.

The package has another characteristic: it depends very little on the outside. Except for the SqlRunner class, no class in the jdbc package depends on classes outside the package. RuntimeSqlException is even the only exception class that does not inherit from the PersistenceException class in the exceptions package.
This design makes the jdbc package highly independent, so it can easily be separated out and used on its own.

When reading source code, the function of most classes can be deduced from the dependencies within the project. But we also meet classes like those in the jdbc package, which have minimal coupling with the rest of the project. Working out what such classes are for requires a clearer understanding of how the project is used.

2. cache package

MyBatis may handle tens of thousands of database query requests per second, and many of these requests may be repeated. Caching can significantly reduce the number of database queries and improve the overall performance of MyBatis.

With the MyBatis cache, every database query is first filtered by the cache system, and the physical database is queried only on a cache miss. The cache package is the provider of MyBatis's caching capabilities. Note, however, that the cache package only provides the caching capabilities and does not cover how the caches are actually used. Therefore, at the end of this chapter, we will read and summarize the caching-related source code in the other packages from the perspective of the caching function.

2.1 Background knowledge

2.1.1 Reference levels of Java objects

While a Java program runs, the JVM automatically performs garbage collection so that useless objects do not occupy memory.
This process has two main steps:

  1. Find all garbage objects;
  2. Clean up the garbage objects that were found.

We focus on the first step, how to find garbage objects. The key question is how to decide whether an object is garbage. The main methods are reference counting and reachability analysis, and the JVM uses reachability analysis: starting from the garbage collection roots (GC Roots), it traverses along the reference relationships between objects. Objects that can be reached are useful; objects that cannot be reached are garbage.

[Figure: reachability analysis starting from a GC Root]

In the figure above, object c no longer refers to object d, so objects d and f cannot be reached from the GC Root and become garbage. Note that the figure draws only one GC Root, while the JVM actually has many. An object is garbage only when it cannot be reached from any GC Root.

However, the reference relationship shown in the figure has limitations. Imagine a large object that is useful but not essential. We would like the system to keep it while memory is plentiful and release it to make room for more important objects when memory is tight. What can we do?

Java takes this situation into account. Java references do not come in just two states, "referenced" and "not referenced", but in four kinds:

  • Strong reference: what we usually call a reference. As long as an object is strongly reachable from a GC Root, it is not garbage. When memory is insufficient, the JVM throws an OutOfMemoryError rather than clearing strongly referenced objects.
  • Soft reference (SoftReference): if an object is reachable from the GC Roots only through soft references, it is considered non-essential. When memory runs short, the JVM reclaims it.
  • Weak reference (WeakReference): if an object is reachable from the GC Roots only through weak references, it is considered redundant. Once the garbage collector finds it, the object is reclaimed whether or not memory is sufficient. Weak references are weaker than soft references, and the objects they refer to tend to live for a shorter time.
  • Phantom reference (PhantomReference): if an object is reachable from the GC Roots only through phantom references, it is treated the same as an unreachable object. As far as garbage collection is concerned, a phantom reference might as well not exist, and it does not affect the object's life cycle. Phantom references are mainly used to track when an object is reclaimed by the garbage collector.
private static void simpleRef() {
    // A reference created directly by assignment is a strong reference
    User user = new User();

    // A reference created through SoftReference is a soft reference
    SoftReference<User> softRefUser = new SoftReference<>(new User());

    // A reference created through WeakReference is a weak reference
    WeakReference<User> weakRefUser = new WeakReference<>(new User());
}

2.1.2 ReferenceQueue class

If an object has only soft or weak references, the JVM may reclaim it at any time. It becomes Schrödinger's cat: until we read it, we cannot know whether it still exists.
Sometimes, however, we need to know when a softly or weakly referenced object has been reclaimed so that we can do some follow-up processing. The ReferenceQueue class provides this. A ReferenceQueue is a queue that we can pass in when creating a soft or weak reference wrapper object. Then, when the JVM reclaims the wrapped object, the wrapper is added to the ReferenceQueue.

An analogy, though perhaps not a perfect one, may help. Suppose the target object is an ice cream and the soft or weak reference wrapper is the ice cream stick. Although we hold the stick, the ice cream on it may melt and fall to the ground at any time (or we may eat it; either way it is gone, which corresponds to being reclaimed by the JVM). The ReferenceQueue is a small wooden bucket in which we collect sticks. When the ice cream on a stick disappears, the stick is put into the bucket. By looking into the bucket, we can tell which ice creams have disappeared.

public class User {
    private long id;

    public User() {
    }

    public User(long id) {
        this.id = id;
    }

    @Override
    public String toString() {
        return "User:" + id;
    }
}

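private static void refWithReferenceQueue() {
   // Create the ReferenceQueue
   // (our small wooden bucket)
   ReferenceQueue<Object> referenceQueue = new ReferenceQueue<>();

   // Holds the weak reference wrapper objects
   // (the hand that holds the sticks with ice cream on them)
   List<WeakReference<User>> weakRefUserList = new ArrayList<>();
   // Create a large number of weak references and keep them in weakRefUserList
   // (make many sticks with ice cream on them and hold them)
   // So many are created in order to put memory under pressure
   for (int i = 0; i < 1000000; i++) {
       // Create the weak reference and pass in the ReferenceQueue
       // (put ice cream on the stick and decide which bucket collects the stick)
       WeakReference<User> weakReference = new WeakReference<>(new User(Math.round(Math.random() * 1000)), referenceQueue);
       // Keep a reference to the weak reference object
       // (grab the stick with the ice cream on it)
       weakRefUserList.add(weakReference);
   }

   WeakReference<?> weakReference;
   int count = 0;

   // Handle the weak references whose targets were reclaimed
   // (check the bucket and handle the sticks that have lost their ice cream)
   while ((weakReference = (WeakReference<?>) referenceQueue.poll()) != null) {
       // The weak reference still exists, but its target is already null
       // (the stick is in the bucket, but there is no ice cream on it)
       System.out.println("JVM cleared: " + weakReference + ", value read from the WeakReference: " + weakReference.get());
       count++;
   }

   // Total number of reclaimed weak references
   // (the number of sticks in the bucket, which is the number of melted ice creams)
   System.out.println("Number of weak references in the queue: " + count);

   // While the target of a weak reference has not been cleared, it can still be read
   // (before the ice cream melts and falls, the stick still carries it)
   System.out.println("Before it is cleared, the value read from the WeakReference: " +
           new WeakReference<>(new User(Math.round(Math.random() * 1000)), referenceQueue).get());
}

/*
JVM cleared: java.lang.ref.WeakReference@4b1d273a, value read from the WeakReference: null
JVM cleared: java.lang.ref.WeakReference@6d26e756, value read from the WeakReference: null
JVM cleared: java.lang.ref.WeakReference@1015009f, value read from the WeakReference: null
Number of weak references in the queue: 230759
Before it is cleared, the value read from the WeakReference: User:561
*/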

The WeakReference wrappers (the ice cream sticks) of the cleared User objects (the ice cream) are written into the ReferenceQueue (the small bucket). Precisely because the User objects they wrapped have been cleared, reading the target from a wrapper taken out of the ReferenceQueue always yields null.
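
One typical use of the bucket is to clean up bookkeeping for entries whose targets are gone. The sketch below is an assumption-based illustration, not code from the book or from MyBatis: the class WeakValueMap and its nested Entry are hypothetical, and the demo simulates collection with clear() and enqueue() so that it does not depend on garbage collector timing.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical map whose values are weakly referenced; stale entries are purged via the queue.
class WeakValueMap {

    // A weak reference that remembers its key, so the map entry can be found after collection.
    static final class Entry extends WeakReference<Object> {
        final String key;

        Entry(String key, Object value, ReferenceQueue<Object> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    Entry put(String key, Object value) {
        Entry entry = new Entry(key, value, queue);
        entries.put(key, entry);
        return entry;
    }

    Object get(String key) {
        Entry entry = entries.get(key);
        return entry == null ? null : entry.get();
    }

    // Drains the queue and removes the map entries whose values are gone.
    int purge() {
        int removed = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            entries.remove(((Entry) ref).key);
            removed++;
        }
        return removed;
    }

    int size() { return entries.size(); }

    public static void main(String[] args) {
        WeakValueMap map = new WeakValueMap();
        Object kept = new Object();
        map.put("kept", kept);
        Entry lost = map.put("lost", new Object());
        // Simulate what the collector does, so the demo does not depend on GC timing.
        lost.clear();
        lost.enqueue();
        System.out.println(map.purge() + " purged, " + map.size() + " left, kept readable: " + (map.get("kept") == kept));
    }
}
```

Subclassing WeakReference to carry the key is the important design point: once the target is gone, the wrapper pulled from the queue is the only thing left that says which entry to remove.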

2.2 Cache package structure and Cache interface

The cache package is a typical application of the decorator pattern. The implementation class is stored in the impl sub-package, and the many decorator classes are stored in the decorators sub-package. The Cache interface is the common interface of the implementation class and the decorator classes.

Among the implementations of the Cache interface, there is only one implementation class but ten decorator classes. By wrapping the implementation class in different decorators, it can be given different capabilities.

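public interface Cache {
  /**
   * Gets the cache id
   * @return the cache id
   */
  String getId();
  /**
   * Writes an entry to the cache
   * @param key the key of the entry
   * @param value the value of the entry
   */
  void putObject(Object key, Object value);
  /**
   * Reads an entry from the cache
   * @param key the key of the entry
   * @return the value of the entry
   */
  Object getObject(Object key);
  /**
   * Removes an entry from the cache
   * @param key the key of the entry
   * @return the previous value of the entry
   */
  Object removeObject(Object key);
  /**
   * Clears the cache
   */
  void clear();
  /**
   * Reads the number of entries in the cache
   * @return the number of entries
   */
  int getSize();
  /**
   * Gets the read-write lock; this method is obsolete
   * @return the read-write lock
   */
  default ReadWriteLock getReadWriteLock() {
    return null;
  }
}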
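
The decorator arrangement described above can be sketched in a few lines. All names in this sketch (CacheDecoratorDemo, SimpleCache, MapCache, HitCountingCache, LockingCache) are hypothetical and the interface is trimmed to two methods; it only illustrates how one plain implementation gains features by being wrapped.

```java
import java.util.HashMap;
import java.util.Map;

class CacheDecoratorDemo {

    // Hypothetical, trimmed-down counterpart of a cache interface.
    interface SimpleCache {
        void putObject(Object key, Object value);
        Object getObject(Object key);
    }

    // The single "real" implementation: a plain map.
    static class MapCache implements SimpleCache {
        private final Map<Object, Object> map = new HashMap<>();
        public void putObject(Object key, Object value) { map.put(key, value); }
        public Object getObject(Object key) { return map.get(key); }
    }

    // Decorator 1: same interface, delegates storage, adds request and hit counting.
    static class HitCountingCache implements SimpleCache {
        private final SimpleCache delegate;
        int requests;
        int hits;

        HitCountingCache(SimpleCache delegate) { this.delegate = delegate; }

        public void putObject(Object key, Object value) { delegate.putObject(key, value); }

        public Object getObject(Object key) {
            requests++;
            Object value = delegate.getObject(key);
            if (value != null) {
                hits++;
            }
            return value;
        }
    }

    // Decorator 2: serializes access to whatever cache it wraps.
    static class LockingCache implements SimpleCache {
        private final SimpleCache delegate;

        LockingCache(SimpleCache delegate) { this.delegate = delegate; }

        public synchronized void putObject(Object key, Object value) { delegate.putObject(key, value); }
        public synchronized Object getObject(Object key) { return delegate.getObject(key); }
    }

    public static void main(String[] args) {
        HitCountingCache counting = new HitCountingCache(new MapCache());
        SimpleCache cache = new LockingCache(counting); // decorators stack freely
        cache.putObject("k", "v");
        cache.getObject("k");
        cache.getObject("missing");
        System.out.println(counting.hits + " hits out of " + counting.requests + " requests");
    }
}
```

Because every decorator both implements and wraps the same interface, features can be combined in any order without the implementation class knowing about them.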

2.3 Cache keys

2.3.1 Principle of cache keys

MyBatis filters a large number of database queries through the cache every second, which places high demands on the design of its cache keys. A MyBatis cache key must meet the following requirements.

  • No collisions: the keys generated by two different query requests must never be equal. This is the most important requirement; otherwise a query would hit the wrong cache entry and return wrong results.
  • Efficient comparison: each cache lookup may trigger multiple comparisons between keys, so comparison must be efficient.
  • Efficient generation: a cache key must be generated before every cache lookup and write, so generation must also be efficient.

In programming, we often use simple types such as numbers and strings as keys. However, such keys are prone to collisions. To prevent collisions, the key generation mechanism has to become very complex, which in turn reduces the efficiency of comparing and generating keys. Accuracy and efficiency therefore tend to constrain each other.

To solve this problem, MyBatis designed the CacheKey class to serve as the cache key. The design of CacheKey is not complicated, but it is very refined.

  // Multiplier used when computing the hashcode
  private final int multiplier;
  // Hash value of the whole CacheKey. If two CacheKeys differ in this value, they are certainly different
  private int hashcode;
  // Checksum (sum of hashes) of the whole CacheKey. If two CacheKeys differ in this value, they are certainly different
  private long checksum;
  // Number of updates applied to the CacheKey
  private int count;
  // Update history
  private List<Object> updateList;

  /**
   * Updates the CacheKey
   * @param object the argument of this update
   */
  public void update(Object object) {
    int baseHashCode = object == null ? 1 : ArrayUtil.hashCode(object);

    count++;
    checksum += baseHashCode;
    baseHashCode *= count;

    hashcode = multiplier * hashcode + baseHashCode;

    updateList.add(object);
  }

The update method shows the roles of these fields. Each update changes the count, checksum, and hashcode values and appends the updated object to updateList.
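
A small worked example makes the arithmetic concrete. The sketch below is a hypothetical miniature, not the CacheKey class itself: MiniKey is an invented name, Objects.hashCode stands in for ArrayUtil.hashCode, and the starting values 37 (multiplier) and 17 (hashcode) are assumptions chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical miniature of the update bookkeeping shown above.
class MiniKey {
    final int multiplier = 37;   // assumed starting values for this sketch
    int hashcode = 17;
    long checksum;
    int count;
    final List<Object> updateList = new ArrayList<>();

    MiniKey update(Object object) {
        int baseHashCode = object == null ? 1 : Objects.hashCode(object);
        count++;
        checksum += baseHashCode;
        baseHashCode *= count;                       // weight by position, so order matters
        hashcode = multiplier * hashcode + baseHashCode;
        updateList.add(object);
        return this;
    }

    public static void main(String[] args) {
        MiniKey ab = new MiniKey().update("a").update("b");
        MiniKey ba = new MiniKey().update("b").update("a");
        // "a".hashCode() is 97 and "b".hashCode() is 98, so both checksums are 195,
        // while the hashcodes are 27058 and 27093 respectively.
        System.out.println(ab.checksum + " " + ab.hashcode + " / " + ba.checksum + " " + ba.hashcode);
    }
}
```

The checksum is order-insensitive, while multiplying each hash by its position makes the hashcode order-sensitive, so the two fields catch different kinds of differences.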

When two CacheKey objects are compared for equality, the type is checked first, and then hashcode, checksum, and count are compared. If any one of them differs, the two objects are different. These checks are simple and fast. Only if all of them match are the update histories (updateList) of the two CacheKey objects compared element by element. This step is more expensive, but it guarantees that collisions can never occur.

  /**
   * Compares this object with the argument (usually also a CacheKey) for equality
   * @param object the object to compare with
   * @return whether the two are equal
   */
  @Override
  public boolean equals(Object object) {
    // Same address means the same object, so they are equal
    if (this == object) {
      return true;
    }
    // If the argument is not a CacheKey, they cannot be equal
    if (!(object instanceof CacheKey)) {
      return false;
    }
    final CacheKey cacheKey = (CacheKey) object;
    // Check hashcode, checksum, and count in turn; all must match
    if (hashcode != cacheKey.hashcode) {
      return false;
    }
    if (checksum != cacheKey.checksum) {
      return false;
    }
    if (count != cacheKey.count) {
      return false;
    }

    // Compare every update in the update history in detail
    for (int i = 0; i < updateList.size(); i++) {
      Object thisObject = updateList.get(i);
      Object thatObject = cacheKey.updateList.get(i);
      if (!ArrayUtil.equals(thisObject, thatObject)) {
        return false;
      }
    }
    return true;
  }

In this way, count, checksum, and hashcode provide a fast comparison, and updateList ensures that a collision cannot produce a false match. This design strikes a good balance between accuracy and efficiency.
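The two-stage comparison can be illustrated with a minimal sketch. SketchKey below is a hypothetical, simplified stand-in for CacheKey, not the MyBatis class itself: it uses Objects.hashCode in place of ArrayUtil.hashCode, and the starting values 17 and 37 are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical, simplified stand-in for CacheKey; not the MyBatis class itself.
class SketchKey {
  private static final int MULTIPLIER = 37; // assumed value, for illustration only
  private int hashcode = 17;                // assumed starting value
  private long checksum;
  private int count;
  private final List<Object> updateList = new ArrayList<>();

  SketchKey update(Object object) {
    int baseHashCode = Objects.hashCode(object); // 0 for null in this sketch
    count++;
    checksum += baseHashCode;
    baseHashCode *= count; // weight by position, so the order of updates matters
    hashcode = MULTIPLIER * hashcode + baseHashCode;
    updateList.add(object);
    return this;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof SketchKey)) {
      return false;
    }
    SketchKey that = (SketchKey) other;
    // Stage 1: three cheap numeric checks
    if (hashcode != that.hashcode || checksum != that.checksum || count != that.count) {
      return false;
    }
    // Stage 2: exact element-by-element check of the update history
    return updateList.equals(that.updateList);
  }

  @Override
  public int hashCode() {
    return hashcode;
  }

  public static void main(String[] args) {
    SketchKey a = new SketchKey().update("selectUser").update(1);
    SketchKey b = new SketchKey().update("selectUser").update(1);
    SketchKey c = new SketchKey().update(1).update("selectUser");
    System.out.println(a.equals(b)); // same updates in the same order
    System.out.println(a.equals(c)); // same updates in a different order
  }
}
```

Keys a and b compare as equal, while c does not, because the update history is compared in order even though count and checksum are identical.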

MyBatis also provides NullCacheKey, which is used as a null key. In a cache query, if the information for a CacheKey is found to be incomplete, a NullCacheKey object is returned, much like returning null. However, NullCacheKey is a subclass of CacheKey, so it does not cause a null pointer exception in later processing. This design is also worth learning from.

2.3.2 Cache implementation class

The implementation class of the Cache interface in the impl subpackage is PerpetualCache. The implementation of PerpetualCache is very simple and has only two properties:

  • id: used to uniquely identify a cache. Generally, the namespace value of the mapping file is used as the cache ID, so as to ensure that the caches of different mapping files are different.
  • cache: It is a HashMap that stores data in the form of key-value pairs.
  // The id of the Cache, usually the namespace it belongs to
  private final String id;
  // Stores the information to be cached
  private Map<Object, Object> cache = new HashMap<>();

So the cache implementation class is a HashMap with an id, and there is nothing special about it.

2.3.3 Cache decorator

The implementation of the cache implementation class PerpetualCache is very simple, but more functions can be added to it through decorators. There are many decorators in the decorators sub-package. They can be divided into the following categories according to their functions:

  • Synchronization decorator: Add synchronization functions to the cache, such as the SynchronizedCache class.
  • Log decorator: Add logging functionality to the cache, such as the LoggingCache class.
  • Cleaning decorator: Add cleaning functions to the data in the cache, such as FifoCache class, LruCache class, WeakCache class, and SoftCache class.
  • Blocking decorator: Add blocking functionality to the cache, such as the BlockingCache class.
  • Scheduled cleanup decorator: Add a scheduled refresh function to the cache, such as the ScheduledCache class.
  • Serialization decorator: Add serialization function to cache, such as SerializedCache class.
  • Transaction decorator: A decorator used to support transaction operations, such as the TransactionalCache class.

2.3.3.1 Synchronous decorators

While MyBatis is in use, multiple threads may access the same cache at the same time. For example, with the mapping file below, if multiple threads call the selectUser method at the same time, they all access the cache whose id is "com.github.yeecode.mybatisdemo.dao.UserDao" at the same time.

<mapper namespace="com.github.yeecode.mybatisdemo.dao.UserDao">
    <select id="selectUser" resultMap="userMapByConstructor">
      SELECT * FROM `user` WHERE `id` = #{id}
    </select>
</mapper>    

The cache implementation class PerpetualCache takes no measures to ensure thread safety, so concurrent access could cause thread-safety problems.
MyBatis hands the task of making the cache thread-safe to the SynchronizedCache decorator. The implementation of the SynchronizedCache decorator is very simple. It adds the synchronized keyword to the methods that delegate to the wrapped object, so that every operation on the wrapped object goes through a synchronized method.
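The idea can be sketched as follows. MiniCache, MapCache, and SyncCache are hypothetical names introduced only for this illustration; the real Cache interface has more methods, and this is not the MyBatis source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal cache interface for this sketch.
interface MiniCache {
  void putObject(Object key, Object value);
  Object getObject(Object key);
  int getSize();
}

// Plays the role of PerpetualCache: a plain HashMap with no thread safety.
class MapCache implements MiniCache {
  private final Map<Object, Object> cache = new HashMap<>();
  public void putObject(Object key, Object value) { cache.put(key, value); }
  public Object getObject(Object key) { return cache.get(key); }
  public int getSize() { return cache.size(); }
}

// Plays the role of SynchronizedCache: every method is synchronized and simply delegates.
class SyncCache implements MiniCache {
  private final MiniCache delegate;
  SyncCache(MiniCache delegate) { this.delegate = delegate; }
  public synchronized void putObject(Object key, Object value) { delegate.putObject(key, value); }
  public synchronized Object getObject(Object key) { return delegate.getObject(key); }
  public synchronized int getSize() { return delegate.getSize(); }
}

class SyncCacheDemo {
  // Writes threads * perThread distinct keys concurrently and returns the final size.
  static int fill(MiniCache cache, int threads, int perThread) {
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      final int base = t * perThread;
      workers[t] = new Thread(() -> {
        for (int i = 0; i < perThread; i++) {
          cache.putObject(base + i, i);
        }
      });
      workers[t].start();
    }
    try {
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return cache.getSize();
  }

  public static void main(String[] args) {
    // With the synchronized decorator, all 4000 distinct keys are stored.
    System.out.println(fill(new SyncCache(new MapCache()), 4, 1000));
  }
}
```

Passing a bare MapCache to fill instead may lose entries or corrupt the map, which is the problem the decorator addresses.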

2.3.3.2 Log decorator

The purpose of adding a cache to database operations is to reduce the number of database queries and thereby improve efficiency. The cache configuration also matters. If the cache is too large, memory is wasted; if it is too small, the cache cannot do its job well. Therefore, the cache size should be set based on runtime metrics.
The log decorator adds log statistics to the cache, and the main statistic is the cache hit rate. The cache hit rate is the proportion of cache accesses in which the requested data is found in the cache.

The implementation of the log decorator is very simple: on every cache query it records the total number of queries and the number of hits.

  /**
   * Read one entry from the cache
   * @param key the key of the entry
   * @return the value of the entry
   */
  @Override
  public Object getObject(Object key) {
    // Number of cache requests +1
    requests++;
    final Object value = delegate.getObject(key);
    if (value != null) {
      // Cache hit: number of hits +1
      hits++;
    }
    if (log.isDebugEnabled()) {
      log.debug("Cache Hit Ratio [" + getId() + "]: " + getHitRatio());
    }
    return value;
  }
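As a small worked example of the hit-rate bookkeeping above, the sketch below counts requests and hits around a plain map. HitRatioSketch is a hypothetical class written for this note, and the guard against zero requests is an assumption of the sketch rather than something taken from the source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of hit-ratio bookkeeping; not the LoggingCache class itself.
class HitRatioSketch {
  private final Map<Object, Object> store = new HashMap<>();
  private int requests;
  private int hits;

  void put(Object key, Object value) {
    store.put(key, value);
  }

  Object get(Object key) {
    requests++;
    Object value = store.get(key);
    if (value != null) {
      hits++;
    }
    return value;
  }

  double hitRatio() {
    // Assumption of this sketch: report 0.0 before any request has been made
    return requests == 0 ? 0.0 : (double) hits / requests;
  }

  public static void main(String[] args) {
    HitRatioSketch cache = new HitRatioSketch();
    cache.put("user:1", "Alice");
    cache.get("user:1"); // hit
    cache.get("user:2"); // miss
    cache.get("user:1"); // hit
    cache.get("user:3"); // miss
    System.out.println(cache.hitRatio()); // 2 hits out of 4 requests: prints 0.5
  }
}
```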

2.3.3.3 Cleaning decorators

Although caching can greatly improve the efficiency of data queries, it does so at the cost of memory. Cache space is always limited, so it is important to give the cache a suitable cleaning strategy in order to make the most of that space. Four of the cache decorators are cleaning decorators, and they correspond to the four cache cleaning strategies of MyBatis.

  1. FifoCache decorator
    The FifoCache decorator cleans the cache with a first-in-first-out strategy. Internally, it uses the keyList attribute to record the order in which cached data was written, and the size attribute to store the limit on the number of cached entries. When the number of entries exceeds the limit, the FifoCache decorator deletes the entry that was put into the cache first.
  // The decorated object
  private final Cache delegate;
  // Keys of the cached data, kept in write order
  private final Deque<Object> keyList;
  // Size of the cache space
  private int size;
  /**
   * Write one entry to the cache
   * @param key the key of the entry
   * @param value the value of the entry
   */
  @Override
  public void putObject(Object key, Object value) {
    cycleKeyList(key);
    delegate.putObject(key, value);
  }
  /**
   * Record the key of the entry being written, and evict entries that exceed the configured space
   * @param key the key of the entry being written
   */
  private void cycleKeyList(Object key) {
    keyList.addLast(key);
    if (keyList.size() > size) {
      Object oldestKey = keyList.removeFirst();
      delegate.removeObject(oldestKey);
    }
  }
  2. LruCache decorator
    LRU stands for Least Recently Used. When the number of cached entries reaches the configured upper limit, this algorithm deletes the entry that has gone unused the longest. The LruCache decorator adds this capability to the cache.
  // The decorated object
  private final Cache delegate;
  // Keys of the cached data, kept in a LinkedHashMap
  private Map<Object, Object> keyMap;
  // Key of the least recently used entry
  private Object eldestKey;
  /**
   * LruCache constructor
   * @param delegate the decorated object
   */
  public LruCache(Cache delegate) {
    this.delegate = delegate;
    setSize(1024);
  }
  /**
   * Set the size of the cache space
   * @param size the size of the cache space
   */
  public void setSize(final int size) {
    keyMap = new LinkedHashMap<Object, Object>(size, .75F, true) {
      private static final long serialVersionUID = 4267176411845948333L;

      /**
       * Triggered every time data is put into the LinkedHashMap
       * @param eldest the entry that has gone unaccessed the longest
       * @return whether the entry that has gone unaccessed the longest should be removed
       */
      @Override
      protected boolean removeEldestEntry(Map.Entry<Object, Object> eldest) {
        boolean tooBig = size() > size;
        if (tooBig) {
          eldestKey = eldest.getKey();
        }
        return tooBig;
      }
    };
  }

In the constructor of the LruCache class, the setSize method is called to set the cache space size. The setSize method creates a LinkedHashMap object to store the keys of the cached data and overrides the removeEldestEntry method of LinkedHashMap.

removeEldestEntry is a method of LinkedHashMap that is triggered automatically every time data is put into the LinkedHashMap (by the put and putAll methods). Its input parameter is the entry that has gone unaccessed the longest. When the cache space is exceeded, LruCache stores the key of that entry in the eldestKey attribute.

To delete the data that has gone unused the longest, the LruCache class also does the following two things.

  • On every cache query, it touches the queried key in keyMap, so that the key is re-ranked as the most recently accessed one.
  • On every cache write, it writes the new key to keyMap, and when the number of entries exceeds the configured limit, it deletes the entry that has gone unaccessed the longest.
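  /**
   * Write one entry to the cache
   * @param key the key of the entry
   * @param value the value of the entry
   */
  @Override
  public void putObject(Object key, Object value) {
    // The real write operation
    delegate.putObject(key, value);
    // Put the key into keyMap as well, and decide from the space available whether to evict the least recently accessed entry
    cycleKeyList(key);
  }
  /**
   * Read one entry from the cache
   * @param key the key of the entry
   * @return the value of the entry
   */
  @Override
  public Object getObject(Object key) {
    // Touch the key being accessed to mark it as accessed
    keyMap.get(key);
    // The real query operation
    return delegate.getObject(key);
  }
  /**
   * Store the current key in keyMap and delete the entry that has gone unaccessed the longest
   * @param key the current key
   */
  private void cycleKeyList(Object key) {
    keyMap.put(key, key);
    if (eldestKey != null) {
      delegate.removeObject(eldestKey);
      eldestKey = null;
    }
  }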

The real cached data is stored in the decorated object. Although keyMap in the LruCache class is a LinkedHashMap, both the keys and the values it stores are the keys of the cached data; the values of the cached data are not stored in it. This is because the LinkedHashMap is introduced only to track the access order of the cached data, not to hold the data itself.
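The behavior of an access-ordered LinkedHashMap with an overridden removeEldestEntry can be seen in isolation in the short sketch below. The class name LruKeySketch and the limit of 2 are assumptions made for the demonstration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demonstration of the LinkedHashMap mechanism that LruCache relies on.
class LruKeySketch {
  // Returns the keys that remain, ordered from least to most recently used.
  static List<String> demo() {
    final int limit = 2; // assumed limit, for illustration only
    // The third constructor argument (true) selects access order instead of insertion order
    Map<String, String> keyMap = new LinkedHashMap<String, String>(limit, 0.75f, true) {
      private static final long serialVersionUID = 1L;

      @Override
      protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > limit;
      }
    };
    keyMap.put("a", "a");
    keyMap.put("b", "b");
    keyMap.get("a");      // touch "a", so "b" becomes the least recently used key
    keyMap.put("c", "c"); // exceeds the limit, so "b" is evicted
    return new ArrayList<>(keyMap.keySet());
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [a, c]
  }
}
```

Without the keyMap.get("a") call, "a" would be the eldest entry and would be evicted instead of "b".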

  3. WeakCache decorator
    The WeakCache decorator wraps cached data in weak references, which allows the JVM to reclaim the cached data.
  // List of strongly referenced objects
  private final Deque<Object> hardLinksToAvoidGarbageCollection;
  // Queue of weak references whose objects have been garbage collected
  private final ReferenceQueue<Object> queueOfGarbageCollectedEntries;
  // The decorated object
  private final Cache delegate;
  // Limit on the number of strongly referenced objects
  private int numberOfHardLinks;

The WeakCache class also provides a hardLinksToAvoidGarbageCollection attribute that holds strong references to cached objects, but the space this attribute offers is limited. Once a cache is wrapped by the WeakCache class, what is stored when data is written to the cache is a weak-reference wrapper around the data.

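  /**
   * Write one entry to the cache
   * @param key the key of the entry
   * @param value the value of the entry
   */
  @Override
  public void putObject(Object key, Object value) {
    // Remove the entries found in the garbage collection queue
    removeGarbageCollectedItems();
    // The value put into the decorated object is a weak-reference handle
    delegate.putObject(key, new WeakEntry(key, value, queueOfGarbageCollectedEntries));
  }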

When data is fetched from the cache, the weak reference wrapper class of the data is also fetched. The data itself may have been cleaned up by the JVM, so you need to judge this situation when retrieving the data.

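  /**
   * Read one entry from the cache
   * @param key the key of the entry
   * @return the value of the entry
   */
  @Override
  public Object getObject(Object key) {
    Object result = null;
    // Assumes the decorated object is controlled exclusively by this decorator
    WeakReference<Object> weakReference = (WeakReference<Object>) delegate.getObject(key);
    if (weakReference != null) {
      // Got the weak-reference handle; read the referenced object
      result = weakReference.get();
      if (result == null) {
        // The weakly referenced object has been reclaimed; remove this cache entry
        delegate.removeObject(key);
      } else {
        // The weakly referenced object still exists
        // Add it to the strong-reference list to keep it from being reclaimed
        hardLinksToAvoidGarbageCollection.addFirst(result);
        if (hardLinksToAvoidGarbageCollection.size() > numberOfHardLinks) {
          // The number of strongly referenced objects exceeds the limit
          // Remove the oldest entry from the strong-reference list
          hardLinksToAvoidGarbageCollection.removeLast();
        }
      }
    }
    return result;
  }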

Normally, data is stored in the cache in the form "data key: data value". After decoration by WeakCache, it is stored in the form "data key: weak reference wrapper <data value>". When the weakly referenced data value is reclaimed by the JVM, the entry in the cache becomes "data key: weak reference wrapper <null>".
If the cached data value has been reclaimed by the JVM, the whole entry "data key: weak reference wrapper <null>" is meaningless and should be removed.

For this purpose, WeakCache defines the WeakEntry inner class. As a weak reference wrapper class, WeakEntry adds a key attribute that stores the key of the data. This attribute is a strong reference, so the JVM will not reclaim it at will, and the stale entry can still be located by its key after the value is gone.

  private static class WeakEntry extends WeakReference<Object> {
    // This field is not cleaned up by the JVM; it stores the key of the target object
    private final Object key;

    private WeakEntry(Object key, Object value, ReferenceQueue<Object> garbageCollectionQueue) {
      super(value, garbageCollectionQueue);
      this.key = key;
    }
  }
  4. SoftCache decorator
    The SoftCache decorator and the WeakCache decorator are highly consistent in structure and function; the only difference is that SoftCache uses soft references instead of weak references.
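How a weak-reference wrapper that remembers its key helps with cleanup can be sketched as follows. KeyedWeakRef and removeCollected are hypothetical names for this illustration. Because real garbage collection is not deterministic, the sketch calls clear and enqueue by hand to simulate what the collector does once a value is only weakly reachable.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of key-carrying weak references; not the WeakCache class itself.
class WeakEntrySketch {
  // A weak reference to the value that also keeps a strong reference to the key.
  static class KeyedWeakRef extends WeakReference<Object> {
    final Object key;

    KeyedWeakRef(Object key, Object value, ReferenceQueue<Object> queue) {
      super(value, queue);
      this.key = key;
    }
  }

  // Removes every entry whose reference has arrived in the queue.
  static void removeCollected(Map<Object, KeyedWeakRef> store, ReferenceQueue<Object> queue) {
    KeyedWeakRef ref;
    while ((ref = (KeyedWeakRef) queue.poll()) != null) {
      store.remove(ref.key);
    }
  }

  // Returns the number of entries left after the simulated collection.
  static int demo() {
    ReferenceQueue<Object> queue = new ReferenceQueue<>();
    Map<Object, KeyedWeakRef> store = new HashMap<>();
    Object value = new Object();
    KeyedWeakRef ref = new KeyedWeakRef("user:1", value, queue);
    store.put("user:1", ref);

    // Simulate the garbage collector: clear the referent and enqueue the reference
    ref.clear();
    ref.enqueue();

    removeCollected(store, queue);
    return store.size();
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints 0
  }
}
```

The referent is gone by the time the reference reaches the queue, so the strongly held key is the only way left to find the stale entry.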

2.3.3.4 Blocking decorators

When MyBatis receives a database query request and the corresponding result is not in the cache, MyBatis queries the database. Now suppose MyBatis receives exactly the same query request before that database query completes. How should it be handled? There are two common options:

  • Because there is no corresponding result in the cache, issue another database query. The database then receives two identical queries within a short period of time.
  • Although there is no corresponding result in the cache, a request has already been sent to the database, so the cache blocks the second query request. When the database query finishes, its result is returned to both requests.

Obviously, the latter option is more reasonable.
The blocking decorator BlockingCache provides this functionality for the cache. Once a cache is decorated with the blocking decorator, when it receives multiple identical query requests it temporarily blocks the later ones and returns results to all of them once the database result is available.
The properties of the BlockingCache class are shown below. The locks attribute is a ConcurrentHashMap that stores each cache key together with its lock. A data query for a key can proceed only after the corresponding lock has been acquired; otherwise, it blocks.
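  // Maximum time allowed to wait when acquiring a lock
  private long timeout;
  // The decorated object
  private final Cache delegate;
  // Lock map. The key is the key of a cache record, and the value is the corresponding lock.
  private final ConcurrentHashMap<Object, ReentrantLock> locks;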

The following code shows methods related to lock acquisition and release. It should be noted that the key of each record has a corresponding lock, so the blocking decorator locks not the entire cache, but a certain record in the cache.

  /**
   * Find the lock for the given key
   * @param key the given key
   * @return the lock that corresponds to the key
   */
  private ReentrantLock getLockForKey(Object key) {
    return locks.computeIfAbsent(key, k -> new ReentrantLock());
  }

  /**
   * Acquire the lock for a key
   * @param key the key of the data
   */
  private void acquireLock(Object key) {
    // Find the lock for the given object
    Lock lock = getLockForKey(key);
    if (timeout > 0) {
      try {
        boolean acquired = lock.tryLock(timeout, TimeUnit.MILLISECONDS);
        if (!acquired) {
          throw new CacheException("Couldn't get a lock in " + timeout + " for the key " +  key + " at the cache " + delegate.getId());
        }
      } catch (InterruptedException e) {
        throw new CacheException("Got interrupted while trying to acquire lock for key " + key, e);
      }
    } else {
      // Lock, waiting indefinitely
      lock.lock();
    }
  }
  /**
   * Write one entry to the cache
   * @param key the key of the entry
   * @param value the value of the entry
   */
  @Override
  public void putObject(Object key, Object value) {
    try {
      // Put the data into the cache
      delegate.putObject(key, value);
    } finally {
      // The data has been written, so release the lock
      releaseLock(key);
    }
  }

  /**
   * Read one entry from the cache
   * @param key the key of the entry
   * @return the value of the entry
   */
  @Override
  public Object getObject(Object key) {
    // Acquire the lock
    acquireLock(key);
    // Read the result
    Object value = delegate.getObject(key);
    if (value != null) {
      // A result was read, so release the lock
      releaseLock(key);
    }
    // If no result was read from the cache, the lock is not released here. It is released in putObject after the result has been read from the database and written to the cache.

    // Return the cached result that was found
    return value;
  }

The code above shows the cache read and write methods of BlockingCache. Before data is read from the cache, the lock for that data must be acquired. If the data is found in the cache, the lock is released immediately. If it is not found, a database query will follow, and the lock on the data is not released until the database query ends and the data is written to the cache.
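The lock lifecycle described above can be traced in a single thread with the sketch below. PerKeyLockSketch is a hypothetical class for this note that omits the timeout handling, and isLocked is a helper added only so that the demo can observe the lock state.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of per-key blocking; not the BlockingCache class itself.
class PerKeyLockSketch {
  private final ConcurrentHashMap<Object, Object> store = new ConcurrentHashMap<>();
  private final ConcurrentHashMap<Object, ReentrantLock> locks = new ConcurrentHashMap<>();

  private ReentrantLock lockFor(Object key) {
    return locks.computeIfAbsent(key, k -> new ReentrantLock());
  }

  Object getObject(Object key) {
    ReentrantLock lock = lockFor(key);
    lock.lock();
    Object value = store.get(key);
    if (value != null) {
      lock.unlock(); // hit: release the lock at once
    }
    // miss: keep the lock; the caller is expected to load the value and call putObject
    return value;
  }

  void putObject(Object key, Object value) {
    try {
      store.put(key, value);
    } finally {
      ReentrantLock lock = lockFor(key);
      if (lock.isHeldByCurrentThread()) {
        lock.unlock();
      }
    }
  }

  // Helper for the demo only: reports whether any thread holds the lock for the key
  boolean isLocked(Object key) {
    return lockFor(key).isLocked();
  }

  public static void main(String[] args) {
    PerKeyLockSketch cache = new PerKeyLockSketch();
    System.out.println(cache.getObject("user:1")); // miss, prints null
    System.out.println(cache.isLocked("user:1"));  // prints true: the lock is still held
    cache.putObject("user:1", "Alice");            // writing the value releases the lock
    System.out.println(cache.isLocked("user:1"));  // prints false
    System.out.println(cache.getObject("user:1")); // hit, prints Alice
    System.out.println(cache.isLocked("user:1"));  // prints false
  }
}
```

While the lock for "user:1" is held after the miss, any other thread that calls getObject("user:1") waits, but requests for other keys are unaffected.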

2.3.3.5 Scheduled cleanup decorator

When the clear method of a cache is called, the data in the cache is cleared. However, this operation does not happen automatically. The scheduled cleanup decorator ScheduledCache cleans the data in the cache at a certain time interval, that is, it calls the clear method at a certain time interval.

  // The decorated object
  private final Cache delegate;
  // Interval between cleanups
  protected long clearInterval;
  // Time of the last cleanup
  protected long lastClear;
  /**
   * Clear the cache according to the configured cleanup interval
   * @return whether the cache was cleared
   */
  private boolean clearWhenStale() {
    if (System.currentTimeMillis() - lastClear > clearInterval) {
      clear();
      return true;
    }
    return false;
  }

The cleanup method clearWhenStale of the ScheduledCache class is called in getSize, putObject, getObject, and removeObject.
Note that the scheduled cleanup provided by ScheduledCache is not real-time. Even when the cleaning interval has elapsed, the clearWhenStale method is not triggered, and no cache cleaning takes place, until one of the four methods getSize, putObject, getObject, or removeObject is called.

This non-real-time design is also worth borrowing. A real-time approach would need a separate timer thread, which consumes extra resources; the non-real-time approach saves those resources without introducing much error.
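The lazy, access-triggered cleanup can be demonstrated without sleeping by injecting a clock. LazyExpirySketch, its LongSupplier clock parameter, and the rawSize helper are all assumptions of this sketch; the real class reads System.currentTimeMillis() directly.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch of access-triggered cleanup; not the ScheduledCache class itself.
class LazyExpirySketch {
  private final Map<Object, Object> store = new HashMap<>();
  private final long clearInterval;
  private final LongSupplier clock; // injectable clock, so the demo needs no real waiting
  private long lastClear;

  LazyExpirySketch(long clearIntervalMillis, LongSupplier clock) {
    this.clearInterval = clearIntervalMillis;
    this.clock = clock;
    this.lastClear = clock.getAsLong();
  }

  private boolean clearWhenStale() {
    if (clock.getAsLong() - lastClear > clearInterval) {
      store.clear();
      lastClear = clock.getAsLong();
      return true;
    }
    return false;
  }

  void putObject(Object key, Object value) {
    clearWhenStale();
    store.put(key, value);
  }

  Object getObject(Object key) {
    return clearWhenStale() ? null : store.get(key);
  }

  // Demo-only helper that bypasses the staleness check
  int rawSize() {
    return store.size();
  }

  public static void main(String[] args) {
    long[] now = {0L};
    LazyExpirySketch cache = new LazyExpirySketch(100, () -> now[0]);
    cache.putObject("k", "v");
    now[0] = 500;                             // the interval has long passed...
    System.out.println(cache.rawSize());      // ...but nothing was cleared yet: prints 1
    System.out.println(cache.getObject("k")); // the access triggers the cleanup: prints null
    System.out.println(cache.rawSize());      // prints 0
  }
}
```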

2.3.3.6 Serialization decorator

We know that after an object (that is, data) is placed in the cache, every read returns a reference to the same object. In other words, the object in the cache is shared among multiple references. This means that if the properties of the object are modified after it is read, the object in the cache changes as well.
In some scenarios, we do not want external references to pollute the objects in the cache. In that case, every external read must return a brand-new copy rather than a reference to the cached object. The serialization decorator SerializedCache adds this capability to the cache.

With SerializedCache, what is actually written to the cache is the serialized form (a byte array) of the object, and every read deserializes that byte array and returns the result. The round trip through serialization and deserialization ensures that the cache hands out a brand-new object each time, so modifying that object does not affect the object in the cache. Of course, this requires the cached data to be serializable; otherwise, SerializedCache throws an exception.

  /**
   * Write one entry to the cache
   * @param key the key of the entry
   * @param object the value of the entry
   */
  @Override
  public void putObject(Object key, Object object) {
    if (object == null || object instanceof Serializable) {
      // The data to cache must be serializable
      // Serialize the data and write it to the cache
      delegate.putObject(key, serialize((Serializable) object));
    } else {
      // The data to cache is not serializable, so throw an exception
      throw new CacheException("SharedCache failed to make a copy of a non-serializable object: " + object);
    }
  }

  /**
   * Read one entry from the cache
   * @param key the key of the entry
   * @return the value of the entry
   */
  @Override
  public Object getObject(Object key) {
    // Read the serialized bytes from the cache
    Object object = delegate.getObject(key);
    // Deserialize and return
    return object == null ? null : deserialize((byte[]) object);
  }
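The copy-on-read effect of a serialization round trip can be checked with standard Java serialization, as in the sketch below. SerializedCopySketch and its helper methods are hypothetical stand-ins written for this note; they are not the serialize and deserialize methods of SerializedCache.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical sketch of copy-on-read through serialization.
class SerializedCopySketch {
  static byte[] serialize(Serializable value) {
    try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
         ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new IllegalStateException("Error serializing object", e);
    }
  }

  static Object deserialize(byte[] data) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return in.readObject();
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Error deserializing object", e);
    }
  }

  // Returns the size of the cached list after one reader has modified its own copy.
  static int demo() {
    ArrayList<String> original = new ArrayList<>();
    original.add("Alice");
    byte[] cached = serialize(original); // what the cache would actually store

    @SuppressWarnings("unchecked")
    ArrayList<String> firstRead = (ArrayList<String>) deserialize(cached);
    firstRead.add("Bob"); // the reader changes its private copy

    @SuppressWarnings("unchecked")
    ArrayList<String> secondRead = (ArrayList<String>) deserialize(cached);
    return secondRead.size();
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints 1: the cached data was not polluted
  }
}
```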

2.4 Cache construction

The process of building a cache is to add various decorations to the basic implementation of the cache according to requirements. This process is completed in CacheBuilder. Let's learn how MyBatis builds a cache through the source code of CacheBuilder.
The entry method to build a cache is the build method in CacheBuilder:

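  /**
   * Build the cache
   * @return the cache object
   */
  public Cache build() {
    // Set the default implementation and default decorator of the cache (set only, not yet assembled)
    setDefaultImplementations();
    // Create the base cache
    Cache cache = newBaseCacheInstance(implementation, id);
    // Set the properties of the cache
    setCacheProperties(cache);
    if (PerpetualCache.class.equals(cache.getClass())) {
      // The cache implementation is PerpetualCache, that is, not a user-defined cache implementation
      // Wrap the cache level by level with the configured decorators
      for (Class<? extends Cache> decorator : decorators) {
        // Create the decorator instance and assemble it. Arguments: the decorator class and the cache to decorate
        cache = newCacheDecoratorInstance(decorator, cache);
        // Set the properties of the decorator
        setCacheProperties(cache);
      }
      // Add the standard decorators to the cache
      cache = setStandardDecorators(cache);
    } else if (!LoggingCache.class.isAssignableFrom(cache.getClass())) {
      // Add the logging decorator
      cache = new LoggingCache(cache);
    }
    // Return the fully wrapped cache
    return cache;
  }
  /**
   * Set the default implementation and default decorator of the cache
   */
  private void setDefaultImplementations() {
    if (implementation == null) {
      implementation = PerpetualCache.class;
      if (decorators.isEmpty()) {
        decorators.add(LruCache.class);
      }
    }
  }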

The setDefaultImplementations sub-method is responsible for setting the default implementation and default decorator of the cache. It can be seen that when no external implementation class is specified, the default implementation class of the cache is the PerpetualCache class, and the default cleanup decorator is LruCache. It should be noted that this method only puts the default implementation class into the implementation attribute and LruCache into the decorators attribute, but does not actually produce and assemble the cache.

Next, the cache implementation will be generated through the newBaseCacheInstance method, and user-defined decorators will be wrapped step by step. Finally, standard decorators will be added to the cache through the setStandardDecorators method. In the mapping file, we can specify the characteristics of the cache.

    <cache type="PERPETUAL"
           eviction="FIFO"
           flushInterval="60000"
           size="512"
           readOnly="true"
           blocking="true">
        <property name="timeout" value="20"/> <!-- property nodes can be added; they are used to set properties of the Cache object directly -->
    </cache>

The setStandardDecorators method uses these cache characteristics to decide which decorators are added to the cache.

  /**
   * Add the standard decorators to the cache
   * @param cache the cache to decorate
   * @return the decorated cache
   */
  private Cache setStandardDecorators(Cache cache) {
    try {
      MetaObject metaCache = SystemMetaObject.forObject(cache);
      // Set the cache size
      if (size != null && metaCache.hasSetter("size")) {
        metaCache.setValue("size", size);
      }
      // If a cleanup interval is defined, decorate the cache with the scheduled cleanup decorator
      if (clearInterval != null) {
        cache = new ScheduledCache(cache);
        ((ScheduledCache) cache).setClearInterval(clearInterval);
      }
      // If reading and writing are allowed, decorate the cache with the serialization decorator
      if (readWrite) {
        cache = new SerializedCache(cache);
      }
      // Decorate the cache with the logging decorator
      cache = new LoggingCache(cache);
      // Decorate the cache with the synchronization decorator
      cache = new SynchronizedCache(cache);
      // If blocking is enabled, decorate the cache with the blocking decorator
      if (blocking) {
        cache = new BlockingCache(cache);
      }
      // Return the cache wrapped in all its layers
      return cache;
    } catch (Exception e) {
      throw new CacheException("Error building standard cache decorators.  Cause: " + e, e);
    }
  }

From the source code of the CacheBuilder class, we can see that adding functionality to the cache is simply a matter of adding decorators. It also shows the power and flexibility of the decorator pattern.
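To make the nesting order visible, the sketch below replaces each decorator with a label and prints the resulting chain. NamedCache, layer, and build are hypothetical names for this illustration, and the chain assumes the default LruCache decorator with no flushInterval configured, so no scheduled cleanup layer appears.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch that only labels the layers; it performs no real caching.
class DecoratorChainSketch {
  interface NamedCache {
    String describe();
  }

  static NamedCache base() {
    return () -> "Perpetual";
  }

  // Builds a decorator that wraps an inner cache and prefixes its own name.
  static UnaryOperator<NamedCache> layer(String name) {
    return inner -> () -> name + "(" + inner.describe() + ")";
  }

  static String build(boolean readWrite, boolean blocking) {
    List<UnaryOperator<NamedCache>> layers = new ArrayList<>();
    layers.add(layer("Lru")); // default decorator when none is configured
    if (readWrite) {
      layers.add(layer("Serialized"));
    }
    layers.add(layer("Logging"));
    layers.add(layer("Synchronized"));
    if (blocking) {
      layers.add(layer("Blocking"));
    }
    NamedCache cache = base();
    for (UnaryOperator<NamedCache> decorate : layers) {
      cache = decorate.apply(cache); // each layer wraps everything built so far
    }
    return cache.describe();
  }

  public static void main(String[] args) {
    // prints Blocking(Synchronized(Logging(Serialized(Lru(Perpetual)))))
    System.out.println(build(true, true));
  }
}
```

The decorator applied last ends up outermost, so with blocking enabled a request passes through the blocking layer first.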

2.5 Transaction cache

In database operations, if no transaction is explicitly declared, each statement is a transaction by itself. After a query statement has been executed against the database, its result can be put into the cache immediately for later use.

So, after a statement inside a transaction has queried the database, can its result also be put into the cache immediately?
Obviously not. For example, in the transaction shown below, the result of the SELECT operation contains the row inserted by the preceding INSERT statement. If the result were put into the cache as soon as the SELECT completed, the cache would contain data from inside the transaction before the transaction is committed, which violates the definition of a transaction. And if the transaction were rolled back later, the data in the cache would be inconsistent with the data in the database.

start transaction;
insert into `user`(`name`,`email`,`age`,`sex`) values ('hzg','[email protected]','18','');
select * from `user`;
commit;

Therefore, the data generated in the transaction operation needs to be written to the cache when the transaction is committed, and directly destroyed when the transaction is rolled back. The TransactionalCache decorator provides this functionality for caching.

The attribute entriesToAddOnCommit of the TransactionalCache class temporarily saves the data generated in the transaction, submits it to the cache when the transaction is committed, and destroys it directly when the transaction is rolled back. The TransactionalCache class also supports limiting the scope of the cache to transactions, as long as the clearOnCommit attribute is set to true. In this way, as soon as the transaction ends, the temporarily saved data will be destroyed directly instead of being written to the cache.

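  // The decorated object
  private final Cache delegate;
  // Whether to clear the cache directly when the transaction is committed
  private boolean clearOnCommit;
  // Data that needs to be written to the cache when the transaction is committed
  private final Map<Object, Object> entriesToAddOnCommit;
  // Keys of the data that missed the cache on query
  private final Set<Object> entriesMissedInCache;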

The cache read operation of the TransactionalCache class is shown below. A read really does go to the underlying cache, whereas a write is only staged temporarily inside the TransactionalCache object.

  /**
   * Read one entry from the cache
   * @param key the key of the entry
   * @return the value of the entry
   */
  @Override
  public Object getObject(Object key) {
    // Read the corresponding data from the cache
    Object object = delegate.getObject(key);
    if (object == null) {
      // Cache miss: record the key that missed
      entriesMissedInCache.add(key);
    }
    if (clearOnCommit) {
      // If the cache is set to be cleared on commit, return null directly
      return null;
    } else {
      // Return the query result
      return object;
    }
  }

When a transaction is committed or rolled back, TransactionalCache will write the data it saves into the cache or destroy it directly according to the settings.

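  /**
   * Commit the transaction
   */
  public void commit() {
    if (clearOnCommit) {
      // If the cache is set to be cleared when the transaction is committed, clear it
      delegate.clear();
    }
    // Write the entries that have not yet been written into the cache
    flushPendingEntries();
    // Reset the state
    reset();
  }

  /**
   * Roll back the transaction
   */
  public void rollback() {
    // Unlock the entries that missed the cache
    unlockMissedEntries();
    reset();
  }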
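The staging behavior can be reduced to a few lines, as in the sketch below. TxStagingSketch is a hypothetical class for this note: a plain map stands in for the decorated cache, the pending map plays the role of entriesToAddOnCommit, and the clearOnCommit and missed-entry handling are left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of commit-time cache writes; not the TransactionalCache class itself.
class TxStagingSketch {
  private final Map<Object, Object> shared;                    // stands in for the decorated cache
  private final Map<Object, Object> pending = new HashMap<>(); // plays the role of entriesToAddOnCommit

  TxStagingSketch(Map<Object, Object> shared) {
    this.shared = shared;
  }

  void putObject(Object key, Object value) {
    pending.put(key, value); // staged only; the shared cache is untouched
  }

  Object getObject(Object key) {
    return shared.get(key); // reads go to the shared cache
  }

  void commit() {
    shared.putAll(pending); // publish the staged entries
    pending.clear();
  }

  void rollback() {
    pending.clear(); // discard the staged entries
  }

  public static void main(String[] args) {
    Map<Object, Object> shared = new HashMap<>();
    TxStagingSketch tx = new TxStagingSketch(shared);

    tx.putObject("user:1", "Alice");
    tx.rollback();
    System.out.println(shared.containsKey("user:1")); // prints false: the rollback discarded it

    tx.putObject("user:1", "Alice");
    System.out.println(shared.containsKey("user:1")); // prints false: not committed yet
    tx.commit();
    System.out.println(shared.get("user:1"));         // prints Alice
  }
}
```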

At this point, we have a clear understanding of the transaction cache TransactionalCache, especially the entriesToAddOnCommit attribute used to temporarily store data within the transaction. However, what is the purpose of entriesMissedInCache attribute? Why should query cache miss data be saved there?

This needs to be considered in combination with the blocking decorator BlockingCache. The cache used in the transaction cache may be decorated by BlockingCache, which means that if the result of the cache query is null, it will cause the data to be locked, thus blocking subsequent queries to the data. After the transaction is committed or rolled back, all the data in the cache should be unlocked. entriesMissedInCache saves the keys of these data and unlocks the data at the end of the transaction.

A transaction may involve multiple caches. TransactionalCacheManager is used to manage the caches involved in one transaction. Its transactionalCaches attribute maps each cache to the TransactionalCache that decorates it.

  // Map that manages multiple caches
  private final Map<Cache, TransactionalCache> transactionalCaches = new HashMap<>();
  /**
   * Commit the transaction
   */
  public void commit() {
    for (TransactionalCache txCache : transactionalCaches.values()) {
      txCache.commit();
    }
  }

  /**
   * Roll back the transaction
   */
  public void rollback() {
    for (TransactionalCache txCache : transactionalCaches.values()) {
      txCache.rollback();
    }
  }

When a transaction is committed or rolled back, TransactionalCacheManager triggers the commit or rollback of all the related transaction caches.

2.6 MyBatis caching mechanism

Source code is usually read package by package, because a package is itself a collection of classes with a certain structure and function. However, some functions are relatively complex and span multiple packages. It is therefore also necessary to read across several packages at once, taking a function as the main thread. This helps clarify how a function is implemented from start to finish.

This time, we will study the MyBatis caching mechanism across multiple packages.

We have already covered the source code of the cache package in detail and seen how MyBatis combines different decorators to obtain caches with different capabilities. However, the cache package does not cover how those caches are actually used.

In the executor package, MyBatis implements two levels of caching on top of the caches provided by the cache package. Before introducing this mechanism, let's first take a brief look at the Executor interface.

The Executor interface defines the executor, which is responsible for operations such as database queries. It has two direct implementing classes, CachingExecutor and BaseExecutor.

  • CachingExecutor is a decorator class that can add caching functionality to the executor implementation class.
  • BaseExecutor is the base class of all actual executor classes. It has four subclasses: SimpleExecutor, BatchExecutor, ReuseExecutor, and ClosedExecutor. ClosedExecutor has no actual function of its own, so we will ignore it for now.

[Figure: simplified class diagram of the Executor interface]

2.6.1 Level 1 cache

The MyBatis first-level cache is also called the local cache. Its structure and use are relatively simple, and two configuration items relate to it. The first is under the settings node of the configuration file, where the following setting changes the scope of the first-level cache. The allowed values are SESSION and STATEMENT, which correspond to one session and one statement respectively. The default scope is SESSION.

<setting name="localCacheScope" value="SESSION"/>

The second is the flushCache attribute on a database operation node in the mapping file. It can be set to true or false. When it is true, MyBatis clears the first-level and second-level caches before executing the operation. The default value is false for SELECT statements and true for INSERT, UPDATE, and DELETE statements.

<select id="queryUserBySchoolName" resultType="com.github.yeecode.mybatisdemo.User" flushCache="false">
    SELECT * FROM `user`
    <if test="schoolName != null">
        WHERE schoolName = #{schoolName}
    </if>
</select>

The first-level cache function is implemented by the BaseExecutor class. As the base class of actual executors, the BaseExecutor class provides some common basic functions for all actual executors. Adding cache here means that each actual executor has this level of cache.
In BaseExecutor, you can see two properties related to the first-level cache, namely localCache and localOutputParameterCache. These two properties use PerpetualCache objects that have not been decorated with any decorators.

  // Caches the results of query operations
  protected PerpetualCache localCache;
  // Caches the output parameters of CALLABLE queries
  protected PerpetualCache localOutputParameterCache;

Of these two variables, localCache caches the results of database query operations. For CALLABLE statements, the output parameters also have to be handed back to the caller, so localOutputParameterCache caches those output parameters directly.

The source code of the query operation in BaseExecutor is shown below. From it we can see in detail how the first-level cache works and how the localCacheScope and flushCache settings take effect.

  /**
   * Queries data from the database
   * @param ms the mapped statement
   * @param parameter the parameter object
   * @param rowBounds the pagination limits
   * @param resultHandler the result handler
   * @param key the cache key
   * @param boundSql the query statement
   * @param <E> the result type
   * @return the result list
   * @throws SQLException
   */
  @SuppressWarnings("unchecked")
  @Override
  public <E> List<E> query(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql) throws SQLException {
    ErrorContext.instance().resource(ms.getResource()).activity("executing a query").object(ms.getId());
    if (closed) {
      // The executor has already been closed
      throw new ExecutorException("Executor was closed.");
    }
    if (queryStack == 0 && ms.isFlushCacheRequired()) {
      // New query stack and the statement requires a flush: clear the first-level cache
      clearLocalCache();
    }
    List<E> list;
    try {
      queryStack++;
      // Try to get the result from the local cache
      list = resultHandler == null ? (List<E>) localCache.getObject(key) : null;
      if (list != null) {
        // The local cache has a result; for CALLABLE statements the cached output
        // parameters must also be bound to the OUT/INOUT parameters
        handleLocallyCachedOutputParameters(ms, key, parameter, boundSql);
      } else {
        // The local cache has no result, so query the database
        list = queryFromDatabase(ms, parameter, rowBounds, resultHandler, key, boundSql);
      }
    } finally {
      queryStack--;
    }
    if (queryStack == 0) {
      // Handle deferred (lazy) loads
      for (DeferredLoad deferredLoad : deferredLoads) {
        deferredLoad.load();
      }
      deferredLoads.clear();
      // If the local cache scope is STATEMENT, clear the local cache immediately
      if (configuration.getLocalCacheScope() == LocalCacheScope.STATEMENT) {
        clearLocalCache();
      }
    }
    return list;
  }

The INSERT, UPDATE, and DELETE operations all map to the update method in BaseExecutor. The update method clears the first-level cache before the operation is executed.

  /**
   * Updates data in the database; INSERT, UPDATE, and DELETE all call this method
   * @param ms the mapped statement
   * @param parameter the parameter object
   * @return the result of the database operation
   * @throws SQLException
   */
  @Override
  public int update(MappedStatement ms, Object parameter) throws SQLException {
    ErrorContext.instance().resource(ms.getResource())
            .activity("executing an update").object(ms.getId());
    if (closed) {
      // The executor has already been closed
      throw new ExecutorException("Executor was closed.");
    }
    // Clear the local cache
    clearLocalCache();
    // Delegate the actual operation to the subclass
    return doUpdate(ms, parameter);
  }

As we can see, the first-level cache consists of the two PerpetualCache attributes in BaseExecutor. Its scope is very limited, and it cannot be wrapped by the various decorators. As a result, it does not support capacity configuration, eviction policies, blocking, and similar settings.
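The behavior described in this section can be condensed into a small simulation. This is a sketch under simplifying assumptions, not MyBatis code: SketchExecutor is a hypothetical stand-in for BaseExecutor, the cache key is a plain string, the database is a function, and nested queries (queryStack) are ignored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class LocalCacheSketch {
    enum Scope { SESSION, STATEMENT }

    /** Hypothetical, heavily simplified stand-in for BaseExecutor. */
    static class SketchExecutor {
        private final Map<String, Object> localCache = new HashMap<>();
        private final Scope scope;
        private final Function<String, Object> database; // stands in for the real database query
        int databaseHits = 0;

        SketchExecutor(Scope scope, Function<String, Object> database) {
            this.scope = scope;
            this.database = database;
        }

        Object query(String key, boolean flushCache) {
            if (flushCache) {
                localCache.clear();           // flushCache="true": clear before executing
            }
            Object result = localCache.get(key);
            if (result == null) {
                databaseHits++;               // cache miss: go to the database
                result = database.apply(key);
                localCache.put(key, result);
            }
            if (scope == Scope.STATEMENT) {
                localCache.clear();           // STATEMENT scope: nothing survives the statement
            }
            return result;
        }

        int update(String key) {
            localCache.clear();               // any write invalidates the whole local cache
            return 1;                         // pretend one row was affected
        }
    }

    public static void main(String[] args) {
        SketchExecutor session = new SketchExecutor(Scope.SESSION, k -> "row:" + k);
        session.query("q1", false);
        session.query("q1", false);
        System.out.println(session.databaseHits);   // 1: the second query was served from the cache
        session.update("q1");
        session.query("q1", false);
        System.out.println(session.databaseHits);   // 2: the update cleared the cache

        SketchExecutor statement = new SketchExecutor(Scope.STATEMENT, k -> "row:" + k);
        statement.query("q1", false);
        statement.query("q1", false);
        System.out.println(statement.databaseHits); // 2: STATEMENT scope never reuses results
    }
}
```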

2.6.2 Second level cache

The scope of the second-level cache is a namespace (that is, a mapping file), and several namespaces can share one cache. Compared with the first-level cache, its scope is therefore wider and its configuration more flexible.

There are four configuration items related to the second-level cache.

  1. The first configuration item is under the settings node of the configuration file. The following setting enables or disables the second-level cache globally. Its default value is true, that is, the second-level cache is enabled by default.
<setting name="cacheEnabled" value="true"/>
  2. The second configuration item is in the mapping file. You can use the <cache> tag shown below to enable and configure a cache for the namespace, or use the <cache-ref> tag to declare that the namespace uses the cache of another namespace. If neither is configured, the namespace has no cache. This item only takes effect when the second-level cache is enabled by the first item.
<cache type="PERPETUAL"
       eviction="FIFO"
       flushInterval="60000"
       size="512"
       readOnly="true"
       blocking="true">
    <property name="timeout" value="20"/> <!-- property nodes can be added; they directly set properties on the Cache object -->
</cache>
  3. The third configuration item is the useCache attribute on a database operation node, as shown in the following code. It controls whether that operation uses the second-level cache. This item only takes effect when caching is enabled by both the first and the second items. For SELECT statements the default value of useCache is true; for other statement types the attribute has no meaning.
<select id="queryUserBySchoolName" resultType="com.github.yeecode.mybatisdemo.User" flushCache="false" useCache="true">
  SELECT * FROM `user`
   <if test="schoolName != null">
       WHERE schoolName = #{schoolName}
   </if>
</select>
  4. The fourth configuration item is the flushCache attribute on a database operation node. It is shared with the first-level cache and indicates whether the first-level and second-level caches should be cleared before the statement is executed.
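The dependency between the first three switches can be written down as a single condition. The helper below is purely illustrative: readsSecondLevelCache and its parameter names are hypothetical and do not exist in MyBatis. It only encodes the rule stated in the list, namely that each switch matters only if the ones before it are on.

```java
public class SecondLevelCacheSwitches {
    /**
     * Hypothetical helper, not part of MyBatis.
     *
     * @param cacheEnabled      the global cacheEnabled setting (item 1)
     * @param namespaceHasCache whether the namespace declares a cache or a cache reference (item 2)
     * @param useCache          the useCache attribute of the statement (item 3)
     * @return whether a SELECT statement would consult the second-level cache
     */
    static boolean readsSecondLevelCache(boolean cacheEnabled, boolean namespaceHasCache, boolean useCache) {
        return cacheEnabled && namespaceHasCache && useCache;
    }

    public static void main(String[] args) {
        System.out.println(readsSecondLevelCache(true, true, true));   // true
        System.out.println(readsSecondLevelCache(false, true, true));  // false: globally disabled
        System.out.println(readsSecondLevelCache(true, false, true));  // false: namespace has no cache
        System.out.println(readsSecondLevelCache(true, true, false));  // false: statement opts out
    }
}
```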

The second-level cache function is implemented by the CachingExecutor class, which is a decorator class that can add second-level cache functions to actual executors by decorating them. As shown in the following code, in the newExecutor method of Configuration, MyBatis will decorate the actual executor with the CachingExecutor class according to the second-level cache switch configuration in the configuration file.

  /**
   * Creates an executor
   * @param transaction the transaction
   * @param executorType the executor type
   * @return the executor
   */
  public Executor newExecutor(Transaction transaction, ExecutorType executorType) {
    executorType = executorType == null ? defaultExecutorType : executorType;
    executorType = executorType == null ? ExecutorType.SIMPLE : executorType;
    Executor executor;
    // Create the actual executor according to the executor type
    if (ExecutorType.BATCH == executorType) {
      executor = new BatchExecutor(this, transaction);
    } else if (ExecutorType.REUSE == executorType) {
      executor = new ReuseExecutor(this, transaction);
    } else {
      executor = new SimpleExecutor(this, transaction);
    }
    // Decide whether to enable the cache based on the cacheEnabled setting in the settings node
    if (cacheEnabled) {
      // Caching is enabled: decorate the actual executor with CachingExecutor
      executor = new CachingExecutor(executor);
    }
    // Add the interceptors (plugins) to the executor so that each of them takes effect
    executor = (Executor) interceptorChain.pluginAll(executor);
    return executor;
  }
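The decoration step in newExecutor can be reduced to the following sketch. The types here (a one-method Executor interface, PlainExecutor, CachingDecorator) are hypothetical simplifications rather than the MyBatis classes; the sketch only shows that callers receive the same interface whether or not the decorator is applied.

```java
public class ExecutorDecoratorSketch {
    /** Hypothetical one-method version of the executor interface. */
    interface Executor {
        String query(String statementId);
    }

    /** Stand-in for an actual executor such as SimpleExecutor. */
    static class PlainExecutor implements Executor {
        public String query(String statementId) {
            return "rows for " + statementId;
        }
    }

    /** Stand-in for CachingExecutor: same interface, wraps another executor. */
    static class CachingDecorator implements Executor {
        private final Executor delegate;

        CachingDecorator(Executor delegate) {
            this.delegate = delegate;
        }

        public String query(String statementId) {
            // A real decorator would consult the second-level cache before delegating
            return delegate.query(statementId);
        }
    }

    /** Mirrors the shape of newExecutor: build the actual executor, then optionally decorate it. */
    static Executor newExecutor(boolean cacheEnabled) {
        Executor executor = new PlainExecutor();
        if (cacheEnabled) {
            executor = new CachingDecorator(executor);
        }
        return executor;
    }

    public static void main(String[] args) {
        System.out.println(newExecutor(true) instanceof CachingDecorator);   // true
        System.out.println(newExecutor(false) instanceof CachingDecorator);  // false
        System.out.println(newExecutor(true).query("queryUserBySchoolName")); // rows for queryUserBySchoolName
    }
}
```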

Before reading the source code of the CachingExecutor class, let's discuss another concept: transactions. In database operations, multiple statements can be wrapped in one transaction; when no transaction is declared explicitly, the database opens a transaction for each individual statement. A transaction can therefore refer either to several statements packaged together or to a single ordinary statement.

The CachingExecutor class has two properties: delegate is the actual executor being decorated, and tcm is the transaction cache manager. Since a single statement is also a transaction, the transaction cache manager applies both when transactions are declared explicitly and when they are not.

  // The decorated executor
  private final Executor delegate;
  // The transaction cache manager
  private final TransactionalCacheManager tcm = new TransactionalCacheManager();

  /**
   * Queries data from the database
   * @param ms the mapped statement
   * @param parameterObject the parameter object
   * @param rowBounds the pagination limits
   * @param resultHandler the result handler
   * @param key the cache key
   * @param boundSql the query statement
   * @param <E> the result type
   * @return the result list
   * @throws SQLException
   */
  @Override
  public <E> List<E> query(MappedStatement ms, Object parameterObject, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql)
      throws SQLException {
    // Get the cache of the MappedStatement. Possible results: the cache of this namespace,
    // the shared cache of another namespace, or no cache
    Cache cache = ms.getCache();
    // If the mapping file declares neither <cache> nor <cache-ref>, cache is null here
    if (cache != null) {
      // A cache exists
      // Clear the second-level cache before execution if the statement requires it
      flushCacheIfRequired(ms);
      if (ms.isUseCache() && resultHandler == null) {
        // The statement uses the cache and has no result handler
        // The second-level cache does not support CALLABLE statements with output parameters, so check here
        ensureNoOutParams(ms, boundSql);
        // Read the result from the cache
        @SuppressWarnings("unchecked")
        List<E> list = (List<E>) tcm.getObject(cache, key);
        if (list == null) {
          // No result in the cache
          // Hand over to the wrapped executor
          list = delegate.query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
          // Cache the result returned by the wrapped executor
          tcm.putObject(cache, key, list); // issue #578 and #116
        }
        return list;
      }
    }
    // Hand over to the wrapped actual executor
    return delegate.query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
  }

  /**
   * Clears the second-level cache before execution if the statement requires it
   * Note: by default, isFlushCacheRequired returns true for non-SELECT statements
   * @param ms MappedStatement
   */
  private void flushCacheIfRequired(MappedStatement ms) {
    // Get the cache of the MappedStatement
    Cache cache = ms.getCache();
    if (cache != null && ms.isFlushCacheRequired()) {
      // A cache exists and the statement requires it to be cleared before execution
      // Clear the cache within the transaction
      tcm.clear(cache);
    }
  }

The update method of CachingExecutor (which handles the INSERT, UPDATE, and DELETE operations) also calls flushCacheIfRequired, and for these statements isFlushCacheRequired returns true by default. Unless flushCache is explicitly set to false, they therefore always cause the second-level cache to be cleared.

2.6.3 Two-level caching mechanism

Now we know that MyBatis has two levels of cache, the first level cache is provided by BaseExecutor through two PerpetualCache type properties, and the second level cache is provided by the CachingExecutor wrapper class.

So in a database query operation, which is accessed first, the first-level cache or the second-level cache? To facilitate the discussion, let's look again at the simplified class diagram of the Executor interface.
[Figure: simplified class diagram of the Executor interface]

The answer is not complicated. CachingExecutor as a decorator will run first, and then the actual executor will be called. Only then will the methods in BaseExecutor be executed. Therefore, during database query operations, MyBatis will access the second-level cache first and then the first-level cache.

From this we can draw the MyBatis two-level cache diagram.
[Figure: MyBatis two-level cache diagram]
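The lookup order can also be traced with a small simulation. This is a sketch, not MyBatis code: BaseLikeExecutor and CachingLikeExecutor are hypothetical stand-ins for BaseExecutor and CachingExecutor, and the second-level cache here is written immediately, whereas MyBatis stages such writes until the transaction commits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoLevelLookupSketch {
    /** Records which layer was consulted, in order. */
    static final List<String> trace = new ArrayList<>();

    interface Executor {
        Object query(String key);
    }

    /** Stand-in for BaseExecutor: owns the first-level cache. */
    static class BaseLikeExecutor implements Executor {
        private final Map<String, Object> localCache = new HashMap<>();

        public Object query(String key) {
            trace.add("first-level cache");
            Object value = localCache.get(key);
            if (value == null) {
                trace.add("database");
                value = "row:" + key;       // pretend this came from the database
                localCache.put(key, value);
            }
            return value;
        }
    }

    /** Stand-in for CachingExecutor: consults the second-level cache, then delegates. */
    static class CachingLikeExecutor implements Executor {
        private final Executor delegate;
        private final Map<String, Object> secondLevelCache = new HashMap<>();

        CachingLikeExecutor(Executor delegate) {
            this.delegate = delegate;
        }

        public Object query(String key) {
            trace.add("second-level cache");
            Object value = secondLevelCache.get(key);
            if (value == null) {
                value = delegate.query(key);
                // Simplification: written immediately instead of being staged until commit
                secondLevelCache.put(key, value);
            }
            return value;
        }
    }

    public static void main(String[] args) {
        Executor executor = new CachingLikeExecutor(new BaseLikeExecutor());
        executor.query("q1");
        System.out.println(trace); // [second-level cache, first-level cache, database]
        trace.clear();
        executor.query("q1");
        System.out.println(trace); // [second-level cache]
    }
}
```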

Origin blog.csdn.net/d495435207/article/details/130915755