Multi-level cache
Multi-level caching solution
Multi-level caching makes full use of every stage of request handling, adding a cache at each level to reduce the pressure on Tomcat and improve overall performance:
The Nginx used for caching here is the business Nginx, which should be deployed as a cluster, with a dedicated Nginx in front acting as a reverse proxy:
JVM process cache
1. Install MySQL
1.1. Prepare directory
In order to facilitate the later configuration of MySQL, we first prepare two directories for mounting the container's data and configuration file directories:
# enter the /tmp directory
cd /tmp
# create the folder
mkdir mysql
# enter the mysql directory
cd mysql
1.2. Run command
After entering the mysql directory, execute the following Docker command:
docker run \
-p 3306:3306 \
--name mysql \
-v $PWD/conf:/etc/mysql/conf.d \
-v $PWD/logs:/logs \
-v $PWD/data:/var/lib/mysql \
-e MYSQL_ROOT_PASSWORD=xcxc666 \
--privileged \
-d \
mysql:5.7.25
1.3. Modify configuration
Add a my.cnf file in the /tmp/mysql/conf directory as the mysql configuration file:
# create the file
touch /tmp/mysql/conf/my.cnf
The contents of the file are as follows:
[mysqld]
skip-name-resolve
character_set_server=utf8
datadir=/var/lib/mysql
server-id=1000
1.4. Restart
After the configuration is modified, the container must be restarted:
docker restart mysql
Local process cache
Caching plays a vital role in daily development. Because cached data lives in memory, reads are very fast, which greatly reduces database access and relieves pressure on the database. We divide caches into two categories:
- Distributed cache, such as Redis:
  - Advantages: larger storage capacity, better reliability, can be shared across a cluster
  - Disadvantages: accessing the cache has network overhead
  - Scenarios: large amounts of cached data, high reliability requirements, data that must be shared across a cluster
- Process-local cache, such as HashMap or GuavaCache:
  - Advantages: reads local memory, no network overhead, faster
  - Disadvantages: limited storage capacity, lower reliability, cannot be shared
  - Scenarios: high performance requirements, small amounts of cached data
Caffeine official website: https://github.com/ben-manes/caffeine/wiki/Home-zh-CN
Caffeine example
@Test
void testBasicOps() {
    // create the cache object
    Cache<String, String> cache = Caffeine.newBuilder().build();
    // store data
    cache.put("gf", "迪丽热巴");
    // read data; returns null if absent
    String gf = cache.getIfPresent("gf");
    System.out.println("gf = " + gf);
    // read data; if absent, fall back to the loader (e.g. a database query)
    String defaultGF = cache.get("defaultGF", key -> {
        // query the database for this key's value here
        return "柳岩";
    });
    System.out.println("defaultGF = " + defaultGF);
}
Caffeine provides three cache eviction strategies:
- Capacity-based: set an upper limit on the number of caches
// create the cache object
Cache<String, String> cache = Caffeine.newBuilder()
    .maximumSize(1) // cap the number of cached entries at 1
    .build();
- Time-based: Set the effective time of the cache
// create the cache object
Cache<String, String> cache = Caffeine.newBuilder()
    .expireAfterWrite(Duration.ofSeconds(1)) // entries expire 1 second after the last write
    .build();
- Reference-based: Set the cache to soft reference or weak reference, and use GC to recycle cached data. Poor performance, not recommended
By default, Caffeine does not automatically clean up and evict a cache element immediately when it expires. Instead, the eviction of invalid data is completed after a read or write operation, or during idle time.
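This lazy, read-time expiry can be illustrated with a toy stdlib cache (this is NOT Caffeine's actual implementation, just a sketch of the idea): each entry records its write time, and an expired entry is only discarded when a read touches it.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of lazy (read-time) expiry in the spirit of
// expireAfterWrite -- not Caffeine's real implementation.
public class LazyExpiryCache {
    private static class Entry {
        final String value;
        final long writeTime;
        Entry(String value, long writeTime) { this.value = value; this.writeTime = writeTime; }
    }

    private final Map<String, Entry> store = new HashMap<>();
    private final long ttlMillis;

    public LazyExpiryCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    public void put(String key, String value, long nowMillis) {
        store.put(key, new Entry(value, nowMillis));
    }

    // An expired entry is only removed when a read touches it.
    public String getIfPresent(String key, long nowMillis) {
        Entry e = store.get(key);
        if (e == null) return null;
        if (nowMillis - e.writeTime >= ttlMillis) {
            store.remove(key); // lazy eviction on read
            return null;
        }
        return e.value;
    }

    public static void main(String[] args) {
        LazyExpiryCache cache = new LazyExpiryCache(1000);
        cache.put("gf", "value", 0);
        System.out.println(cache.getIfPresent("gf", 500));  // within TTL -> value
        System.out.println(cache.getIfPresent("gf", 1500)); // expired -> null
    }
}
```

The time is passed in explicitly only to keep the sketch deterministic; a real cache would read the clock (or, like Caffeine, a pluggable Ticker).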
Case: Implementing local process cache for product query
Use Caffeine to achieve the following requirements:
- Initial cache capacity of 100
- Cache size limit of 10,000
Register the cache beans with Spring:
package com.heima.item.config;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.heima.item.pojo.Item;
import com.heima.item.pojo.ItemStock;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * @author xc
 * @date 2023/5/16 21:12
 */
@Configuration
public class CaffeineConfig {

    @Bean
    public Cache<Long, Item> itemCache(){
        return Caffeine.newBuilder()
                // initial cache capacity
                .initialCapacity(100)
                // maximum cache size
                .maximumSize(10_000)
                .build();
    }

    @Bean
    public Cache<Long, ItemStock> stockCache(){
        return Caffeine.newBuilder()
                .initialCapacity(100)
                .maximumSize(10_000)
                .build();
    }
}
- Add a cache to the business of querying products based on ID, and query the database when the cache misses
@GetMapping("/{id}")
public Item findById(@PathVariable("id") Long id) {
    return itemCache.get(id, key -> itemService.query()
            .ne("status", 3).eq("id", key)
            .one()
    );
}
- Add a cache to the business of querying commodity inventory based on id, and query the database when the cache misses
@GetMapping("/stock/{id}")
public ItemStock findStockById(@PathVariable("id") Long id) {
    return stockCache.get(id, key -> stockService.getById(key));
}
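Both endpoints follow the cache-aside pattern: the loader lambda runs only on a cache miss. The effect can be sketched without Spring or Caffeine (class and counter names are illustrative; a plain ConcurrentHashMap stands in for the cache):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Minimal cache-aside sketch: the loader (standing in for the database
// query) only runs when the key is missing from the cache.
public class CacheAsideDemo {
    static final Map<Long, String> cache = new ConcurrentHashMap<>();
    static final AtomicInteger dbHits = new AtomicInteger();

    static String findById(Long id, Function<Long, String> dbQuery) {
        return cache.computeIfAbsent(id, key -> {
            dbHits.incrementAndGet(); // only reached on a cache miss
            return dbQuery.apply(key);
        });
    }

    public static void main(String[] args) {
        Function<Long, String> db = id -> "item-" + id;
        findById(1L, db);
        findById(1L, db); // second call is served from the cache
        System.out.println("db hits = " + dbHits.get()); // prints: db hits = 1
    }
}
```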
Introduction to Lua syntax
First look at Lua
Lua is a lightweight and compact scripting language. Official website: https://www.lua.org/
HelloWorld
Data types
Variables
When declaring a variable in Lua, you do not need to specify the data type:
-- declare a string
local str = 'hello'
-- declare a number
local num = 21
-- declare a boolean
local flag = true
-- declare an array (a table with numeric index keys)
local arr = {'java', 'python', 'lua'}
-- declare a table, similar to a Java map
local map = {name='Jack', age=21}
Accessing a table:
-- access the array (indexes start at 1)
print(arr[1])
-- access the table
print(map['name'])
print(map.name)
Loops
Both arrays and tables can be traversed using for loops:
- Iterate over the array:
-- declare an array (a table with numeric index keys)
local arr = {'java', 'python', 'lua'}
-- iterate over the array
for index,value in ipairs(arr) do
    print(index,value)
end
- Traversing the table:
-- declare a map
local map = {name='Jack', age=21}
-- iterate over the table
for key,value in pairs(map) do
    print(key,value)
end
Functions
Syntax for defining a function:
function function_name(arg1, arg2, ...)
    -- function body
    return return_value
end
For example, define a function to print an array:
function printArr(arr)
    for index,value in ipairs(arr) do
        print(index,value)
    end
end
Conditional control
Conditional control in Lua is similar to Java's, with if/else syntax:
if(boolean_expression)
then
    -- statements executed when true
else
    -- statements executed when false
end
Unlike Java, the logical operators in boolean expressions are English words: and, or, not.
Case: Custom function, print table
Requirement: Customize a function that can print the table and print an error message when the parameter is nil.
-- the parameter is named t to avoid shadowing Lua's built-in table library
function printTable(t)
    if (not t)
    then
        print('arg is nil')
        return nil
    else
        for k,v in pairs(t) do
            print(k,v)
        end
    end
end

local map = {name='Jack', age=21}
printTable(map)
print('-------')
printTable(nil)
Running it prints the map's key-value pairs, then the divider, then arg is nil for the nil argument.
Multi-level cache
First look at OpenResty
OpenResty is a high-performance web platform based on Nginx, used to easily build dynamic web applications, web services, and dynamic gateways capable of handling very high concurrency with high scalability. It has the following characteristics:
- Has the full functionality of Nginx
- Extends Nginx with the Lua language, integrating a large number of high-quality Lua libraries and third-party modules
- Allows custom business logic and custom libraries to be written in Lua
Official website: https://openresty.org/cn/
Install OpenResty
1.Installation
First, your Linux virtual machine must be connected to the Internet
1) Install development library
First, install the dependency development library of OpenResty and execute the command:
yum install -y pcre-devel openssl-devel gcc --skip-broken
2) Install the OpenResty repository
You can add the openresty repository to your CentOS system so that packages can be installed or updated later (e.g. via the yum check-update command). Run the following command to add the repository:
yum-config-manager --add-repo https://openresty.org/package/centos/openresty.repo
If it says the command does not exist, run:
yum install -y yum-utils
Then repeat the above command
3) Install OpenResty
Then you can install software packages, such as openresty, as follows:
yum install -y openresty
4) Install the opm tool
opm is OpenResty's package management tool, which helps us install third-party Lua modules.
If you want to install the command line tool opm, you can install the openresty-opm package as follows:
yum install -y openresty-opm
5) Directory structure
By default, the directory where OpenResty is installed is: /usr/local/openresty
Note the nginx directory inside: OpenResty integrates its Lua modules on top of Nginx.
6) Configure nginx environment variables
Open the configuration file:
vi /etc/profile
Add two lines at the bottom:
export NGINX_HOME=/usr/local/openresty/nginx
export PATH=${NGINX_HOME}/sbin:$PATH
NGINX_HOME points to the nginx directory under the OpenResty installation directory.
Then let the configuration take effect:
source /etc/profile
2. Get up and running
The bottom layer of OpenResty is based on Nginx. View the nginx directory of the OpenResty directory. The structure is basically the same as that of nginx installed in Windows:
So the running method is basically the same as nginx:
# start nginx
nginx
# reload the configuration
nginx -s reload
# stop
nginx -s stop
The default configuration file of nginx has too many comments, which will affect our subsequent editing. Here, delete the comments in nginx.conf and keep the valid parts.
Modify the /usr/local/openresty/nginx/conf/nginx.conf file as follows:
#user  nobody;
worker_processes  1;
error_log  logs/error.log;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;
    sendfile      on;
    keepalive_timeout  65;

    server {
        listen       8081;
        server_name  localhost;
        location / {
            root   html;
            index  index.html index.htm;
        }
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }
    }
}
Enter the command in the Linux console to start nginx:
nginx
Then visit the page: http://192.168.72.133:8081, pay attention to replace the ip address with your own virtual machine IP:
3.Remarks
Load OpenResty's lua module:
# Lua modules
lua_package_path "/usr/local/openresty/lualib/?.lua;;";
# C modules
lua_package_cpath "/usr/local/openresty/lualib/?.so;;";
OpenResty gets request parameters
Multi-level caching requirements
Case: Obtain the product id information in the request path, and query the product information from Tomcat based on the id
Here we need to modify item.lua to meet the following requirements:
1. Get the id in the request parameter
2. Send a request to the Tomcat service based on the ID to query product information.
3. Send a request to the Tomcat service based on the ID to query inventory information
4. Assemble product information and inventory information, serialize and return in JSON format
Sending HTTP requests inside nginx
nginx provides an internal API for sending http requests:
local resp = ngx.location.capture("/path", {
    method = ngx.HTTP_GET, -- request method
    args = {a=1, b=2},     -- parameters passed GET-style
    body = "c=3&d=4"       -- parameters passed POST-style (request body)
})
The response content returned includes:
- resp.status: response status code
- resp.header: response header, which is a table
- resp.body: response body, which is the response data
**Note:** The path here does not include the IP and port. The request will be handled by a server block inside nginx itself.
When we want this request to be sent to the Tomcat server, we also need to write a server to reverse proxy the path:
location /path {
    # make sure the Windows firewall is off
    proxy_pass http://192.168.72.1:8081;
}
Function that encapsulates http query
common.lua
-- encapsulate a function that sends an http request and parses the response
local function read_http(path, params)
    local resp = ngx.location.capture(path, {
        method = ngx.HTTP_GET,
        args = params,
    })
    if not resp then
        -- log the error and return 404 (the original logged an undefined
        -- variable args; it should log the params argument)
        ngx.log(ngx.ERR, "http not found, path: ", path, ", args: ", params)
        ngx.exit(404)
    end
    return resp.body
end

-- export the function
local _M = {
    read_http = read_http
}
return _M
JSON result processing
OpenResty provides a cjson module to handle JSON serialization and deserialization
Official website: https://github.com/openresty/lua-cjson
Load balancing of Tomcat cluster
Adding a Redis cache
Cold start and cache warm-up
**Cold start:** When the service is just started, there is no cache in Redis. If all product data is cached during the first query, it may put greater pressure on the database.
**Cache warm-up:** In actual development, we can use big data to count the hot data accessed by users, and query these hot data in advance and save them to Redis when the project starts.
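A warm-up routine can be sketched in plain Java (a Map stands in for Redis, and the hot-item ids are hypothetical): at startup, hot items are serialized and written into the cache before any request arrives. In the real project this logic would run in an ApplicationRunner/InitializingBean and write through RedisTemplate.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Cache warm-up sketch: at startup, push pre-selected hot items into the
// cache. A Map stands in for Redis here.
public class CacheWarmUp {
    // "Redis": key -> serialized value
    static final Map<String, String> redis = new LinkedHashMap<>();

    // Hypothetical hot-item ids, e.g. produced by big-data access statistics.
    static List<Long> hotItemIds() { return List.of(10001L, 10002L, 10003L); }

    // Stand-in for the database query plus JSON serialization.
    static String loadItemAsJson(Long id) {
        return "{\"id\":" + id + ",\"name\":\"item-" + id + "\"}";
    }

    static void warmUp() {
        for (Long id : hotItemIds()) {
            redis.put("item:id:" + id, loadItemAsJson(id));
        }
    }

    public static void main(String[] args) {
        warmUp();
        System.out.println(redis.size() + " hot items preloaded");
    }
}
```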
Cache warm-up
Query Redis cache
Redis module for OpenResty
OpenResty provides a module for operating Redis. We can use it directly as long as we introduce this module:
- Introduce the Redis module and initialize the Redis object
-- load the redis module
local redis = require("resty.redis")
-- initialize the Redis object
local red = redis:new()
-- set the connect, send, and read timeouts (milliseconds)
red:set_timeouts(1000, 1000, 1000)
- Encapsulate a function to release the Redis connection (it is actually returned to the connection pool):
-- utility function that closes the redis connection (actually returns it to the connection pool)
local function close_redis(red)
    local pool_max_idle_time = 10000 -- max idle time of a connection, in milliseconds
    local pool_size = 100 -- connection pool size
    local ok, err = red:set_keepalive(pool_max_idle_time, pool_size)
    if not ok then
        ngx.log(ngx.ERR, "failed to return the connection to the redis pool: ", err)
    end
end
- Encapsulate a function that reads data from Redis and returns it:
-- query redis; ip and port are the redis address, key is the key to look up
local function read_redis(ip, port, key)
    -- get a connection
    local ok, err = red:connect(ip, port)
    if not ok then
        ngx.log(ngx.ERR, "failed to connect to redis: ", err)
        return nil
    end
    -- query redis
    local resp, err = red:get(key)
    -- handle query failure
    if not resp then
        ngx.log(ngx.ERR, "failed to query redis: ", err, ", key = ", key)
    end
    -- handle an empty result
    if resp == ngx.null then
        resp = nil
        ngx.log(ngx.ERR, "redis returned no data, key = ", key)
    end
    close_redis(red)
    return resp
end
Adding a Redis cache: requirements
nginx local cache
OpenResty provides the shared dict feature for Nginx, which shares data across nginx worker processes and can be used to implement a caching function.
- To enable shared dictionaries:
# shared dictionary (i.e. local cache) named item_cache, 150MB in size
lua_shared_dict item_cache 150m;
- To operate a shared dictionary:
-- get the shared dictionary
local item_cache = ngx.shared.item_cache
-- store data; the third argument is the expiry time in seconds (0 means never expire)
item_cache:set('key', 'value', 1000)
-- read data
local val = item_cache:get('key')
Cache synchronization strategy
There are three common ways to synchronize cached data:
- Set a validity period: give the cache a TTL so it is deleted automatically after expiring and refreshed on the next query
  - Advantages: simple and convenient
  - Disadvantages: poor timeliness; the cache may be inconsistent until it expires
  - Scenarios: business with a low update frequency and low timeliness requirements
- Synchronous double write: update the cache directly in the same code that modifies the database
  - Advantages: strong timeliness; cache and database stay strongly consistent
  - Disadvantages: code intrusion and high coupling
  - Scenarios: cached data with high consistency and timeliness requirements
- Asynchronous notification: send an event notification when the database is modified; the relevant services update their cached data when they receive the notification
  - Advantages: low coupling; multiple cache services can be notified at once
  - Disadvantages: average timeliness; there may be a window of inconsistency
  - Scenarios: average timeliness requirements, with multiple services that need to be kept in sync
MQ-based asynchronous notification:
Canal-based asynchronous notification:
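Whether the notification travels through MQ or Canal, the flow is the same; it can be sketched with an in-process queue standing in for the message channel (names and the event shape are illustrative): the writer publishes a change event, and the cache service consumes it and updates its copy.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Async cache-sync sketch: a BlockingQueue stands in for MQ/Canal. The
// database writer publishes a change event; the cache service consumes it
// and refreshes its local copy.
public class AsyncCacheSync {
    static class ItemChanged {
        final long id;
        final String newValue;
        ItemChanged(long id, String newValue) { this.id = id; this.newValue = newValue; }
    }

    static final BlockingQueue<ItemChanged> queue = new LinkedBlockingQueue<>();
    static final Map<Long, String> cache = new ConcurrentHashMap<>();

    // Writer side: update the database, then publish a notification.
    static void updateItem(long id, String value) {
        // ... write to the database here ...
        queue.add(new ItemChanged(id, value));
    }

    // Cache-service side: consume one pending event and refresh the cache.
    static void consumeOne() {
        ItemChanged event = queue.poll();
        if (event != null) {
            cache.put(event.id, event.newValue);
        }
    }

    public static void main(String[] args) {
        cache.put(1L, "old");
        updateItem(1L, "new");
        consumeOne();
        System.out.println(cache.get(1L)); // prints: new
    }
}
```

The "window of inconsistency" mentioned above is the gap between updateItem returning and consumeOne running.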
First look at Canal
Canal provides incremental data subscription and consumption based on parsing the database's incremental logs.
Official website: https://github.com/alibaba/canal
Canal is implemented based on mysql's master-slave synchronization. The principle of MySQL master-slave synchronization is as follows:
- MySQL master writes data changes to the binary log, and the data recorded in it is called binary log events
- MySQL slave copies the master's binary log events to its relay log
- The MySQL slave replays the events in its relay log, applying the data changes to its own data
Canal disguises itself as a MySQL slave node to monitor the master's binary log changes. It then notifies the Canal client of the changes it obtains, and the client completes the synchronization of other stores such as caches.
Install and configure Canal
Next, we will enable MySQL's master-slave synchronization mechanism and let Canal pose as a slave.
1. Start MySQL master-slave
Canal is based on the master-slave synchronization function of MySQL, so the master-slave function of MySQL must be enabled first.
Here is an example of mysql running with Docker:
1.1. Enable binlog
Open the configuration file mounted by the mysql container. Mine is in the /tmp/mysql/conf directory:
Modify file:
vi /tmp/mysql/conf/my.cnf
Add content:
log-bin=/var/lib/mysql/mysql-bin
binlog-do-db=heima
Interpretation of configuration:
- log-bin=/var/lib/mysql/mysql-bin: Set the storage address and file name of the binary log file, called mysql-bin
- binlog-do-db=heima: Specify which database to record binary log events. The heima library is recorded here.
final effect:
[mysqld]
skip-name-resolve
character_set_server=utf8
datadir=/var/lib/mysql
server-id=1000
log-bin=/var/lib/mysql/mysql-bin
binlog-do-db=heima
1.2. Set user permissions
Next, add an account only for data synchronization. For security reasons, only the operation permissions for the heima library are provided here.
create user canal@'%' IDENTIFIED by 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT,SUPER ON *.* TO 'canal'@'%' identified by 'canal';
FLUSH PRIVILEGES;
Then restart the mysql container:
docker restart mysql
Test whether the setting is successful: In the mysql console or Navicat, enter the command:
show master status
2. Install Canal
2.1. Create a network
We need to create a network and put MySQL, Canal, and MQ into the same Docker network:
docker network create heima
Let mysql join this network:
docker network connect heima mysql
2.2. Install Canal
Canal.tar can be uploaded to the virtual machine and then imported via the command:
docker load -i canal.tar
Then run the command to create the Canal container:
docker run -p 11111:11111 --name canal \
-e canal.destinations=heima \
-e canal.instance.master.address=mysql:3306 \
-e canal.instance.dbUsername=canal \
-e canal.instance.dbPassword=canal \
-e canal.instance.connectionCharset=UTF-8 \
-e canal.instance.tsdb.enable=true \
-e canal.instance.gtidon=false \
-e canal.instance.filter.regex=heima\\..* \
--network heima \
-d canal/canal-server:latest
Explanation:
- -p 11111:11111: canal's default listening port
- -e canal.instance.master.address=mysql:3306: the database address and port. If you do not know the mysql container's address, look it up with docker inspect <container-id>
- -e canal.instance.dbUsername=canal: the database username
- -e canal.instance.dbPassword=canal: the database password
- -e canal.instance.filter.regex=: the tables to monitor
The table filter supports the following syntax:
The tables canal watches for data parsing, written as Perl regular expressions.
Separate multiple regexes with commas (,); the escape character requires a double backslash (\\)
Common examples:
1. All tables: .* or .*\\..*
2. All tables under the canal schema: canal\\..*
3. Tables under canal whose names start with canal: canal\\.canal.*
4. A single table under the canal schema: canal.test1
5. Multiple rules combined, separated by commas: canal\\..*,mysql.test1,mysql.test2
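Since these filters are Perl-compatible regexes, their behavior can be sanity-checked with Java's Pattern class (the table names below are examples):

```java
import java.util.regex.Pattern;

// Check the Canal table filter heima\..* (written as heima\\..* in the
// docker -e flag, because the backslash itself must be escaped there).
public class CanalFilterRegexDemo {
    static boolean monitored(String regex, String table) {
        // Pattern.matches requires the whole string to match, like Canal's filter
        return Pattern.matches(regex, table);
    }

    public static void main(String[] args) {
        String filter = "heima\\..*"; // every table in the heima schema
        System.out.println(monitored(filter, "heima.tb_item"));  // true
        System.out.println(monitored(filter, "other.tb_item"));  // false
    }
}
```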
Canal client
Canal provides clients in various languages. When Canal detects changes in the binlog, it will notify the Canal client.
Use the third-party open source canal-starter: https://github.com/NormanGyllenhaal/canal-client
Import dependencies:
<dependency>
<groupId>top.javatool</groupId>
<artifactId>canal-spring-boot-starter</artifactId>
<version>1.2.1-RELEASE</version>
</dependency>
Write the configuration (note the port is 11111, the one mapped when the Canal container was created):
canal:
  destination: heima
  server: 192.168.72.133:11111
Write a listener to listen for Canal messages: when Canal pushes a modified row to the canal-client, the starter we imported decodes the row into the Item entity class for us.
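The shape of such a listener can be sketched with stdlib types. With the canal-spring-boot-starter it is typically a class annotated with @CanalTable implementing an EntryHandler with insert/update/delete callbacks; the interface below is a simplified stand-in written for this sketch, not the starter's real definition.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for a Canal entry handler: the real starter invokes
// callbacks like these with the decoded entity whenever Canal reports a
// binlog change on the watched table.
public class CanalListenerSketch {
    interface EntryHandler<T> {
        void insert(T t);
        void update(T before, T after);
        void delete(T t);
    }

    static class Item {
        final long id;
        final String name;
        Item(long id, String name) { this.id = id; this.name = name; }
    }

    // A handler that keeps a local cache in sync with binlog changes.
    static class ItemHandler implements EntryHandler<Item> {
        final Map<Long, Item> cache = new ConcurrentHashMap<>();
        public void insert(Item item) { cache.put(item.id, item); }
        public void update(Item before, Item after) { cache.put(after.id, after); }
        public void delete(Item item) { cache.remove(item.id); }
    }

    public static void main(String[] args) {
        ItemHandler handler = new ItemHandler();
        handler.insert(new Item(1L, "phone"));
        handler.update(new Item(1L, "phone"), new Item(1L, "phone-v2"));
        System.out.println(handler.cache.get(1L).name); // prints: phone-v2
        handler.delete(new Item(1L, "phone-v2"));
        System.out.println(handler.cache.size()); // prints: 0
    }
}
```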