Nearby shops, user check-in, UV statistics
1. Nearby shops
Nearby-shop search is built on geographic coordinates. Many technologies support geospatial queries, and Redis is one of them.
1.1 GEO data structure
GEO is short for Geolocation, that is, geographic coordinates.
Redis added GEO support in version 3.2. It lets us store geographic coordinates and retrieve data by longitude and latitude. Common commands are:
- GEOADD: Add a geospatial entry consisting of a longitude, a latitude, and a value (member)
  The member can be anything, such as a store name or a field from the database
- GEODIST: Calculate and return the distance between two specified members
  You can choose the unit of the result: m, km, and so on
- GEOHASH: Convert the coordinates of the specified member into a hash string and return it
- GEOPOS: Return the coordinates of the specified member
- GEORADIUS: Specify a center and a radius, find all members inside that circle, and return them sorted by distance from the center. Deprecated since 6.2
- GEOSEARCH: Search for members within the specified range; the results can be sorted by distance from the specified point. The range can be circular or rectangular. New in 6.2
  The center can be a member of the key (FROMMEMBER) or a longitude and latitude given directly (FROMLONLAT)
  BYRADIUS searches within a circle of the given radius
  BYBOX searches within a rectangle (specify its width and height)
  COUNT limits how many items are returned
  WITHDIST includes the distance in the result
- GEOSEARCHSTORE: Works like GEOSEARCH, but stores the result in a specified key. New in 6.2
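To make the GEOHASH idea more concrete, here is a minimal sketch of the textbook geohash encoding: the longitude and latitude ranges are halved repeatedly, the resulting bits are interleaved (longitude first), and every five bits become one base-32 character. The class and method names are hypothetical, and this illustrates the general algorithm rather than Redis's internal implementation.

```java
final class GeoHashSketch {
    // Standard geohash base-32 alphabet (no a, i, l, o)
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a coordinate as a base-32 geohash string of the given length. */
    static String encode(double longitude, double latitude, int length) {
        double lonMin = -180, lonMax = 180;
        double latMin = -90, latMax = 90;
        StringBuilder hash = new StringBuilder(length);
        boolean useLongitude = true; // bits alternate, starting with longitude
        int bitsInChar = 0;
        int charValue = 0;
        while (hash.length() < length) {
            if (useLongitude) {
                double mid = (lonMin + lonMax) / 2;
                if (longitude >= mid) {
                    charValue = (charValue << 1) | 1; // upper half: emit 1
                    lonMin = mid;
                } else {
                    charValue <<= 1;                  // lower half: emit 0
                    lonMax = mid;
                }
            } else {
                double mid = (latMin + latMax) / 2;
                if (latitude >= mid) {
                    charValue = (charValue << 1) | 1;
                    latMin = mid;
                } else {
                    charValue <<= 1;
                    latMax = mid;
                }
            }
            useLongitude = !useLongitude;
            if (++bitsInChar == 5) {
                // Five bits collected: map them to one base-32 character
                hash.append(BASE32.charAt(charValue));
                bitsInChar = 0;
                charValue = 0;
            }
        }
        return hash.toString();
    }
}
```

Nearby points tend to share a common prefix, which is why an interleaved encoding like this can be stored as a sortable number and searched by range.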
Requirements
1. Add the following entries
- Beijing South Railway Station (116.378248 39.865275)
- Beijing Railway Station (116.42803 39.903738)
- Beijing West Railway Station (116.322287 39.893729)
GEOADD g1 116.378248 39.865275 bjn 116.42803 39.903738 bjz 116.322287 39.893729 bjx
After adding the entries, we can see that the underlying data structure is a ZSET (SortedSet).
The value of each element is the member we supplied. The longitude and latitude are encoded into a single number, which is stored as the element's score.
2. Calculate the distance from Beijing South Railway Station to Beijing West
GEODIST g1 bjn bjx
The result is in meters by default. To specify a unit:
GEODIST g1 bjn bjx km
3. Search for all train stations within 10 km of Tiananmen Square (116.397904 39.909005), sorted by distance in ascending order
GEOSEARCH g1 FROMLONLAT 116.397904 39.909005 BYRADIUS 10 km ASC WITHDIST
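The GEODIST result from requirement 2 can be cross-checked by hand with the haversine great-circle formula. The sketch below is an independent approximation, not Redis's code: the class name, method name, and Earth radius constant are assumptions made for illustration.

```java
final class HaversineSketch {
    // Assumed mean Earth radius in meters; a different constant shifts the result slightly
    static final double EARTH_RADIUS_METERS = 6372797.560856;

    /** Great-circle distance in meters between two (longitude, latitude) points in degrees. */
    static double distanceMeters(double lon1, double lat1, double lon2, double lat2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        // Haversine term: sin^2(dPhi/2) + cos(phi1) * cos(phi2) * sin^2(dLambda/2)
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                + Math.cos(phi1) * Math.cos(phi2)
                * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);
        return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // Beijing South (bjn) to Beijing West (bjx): roughly 5.7 km
        double meters = distanceMeters(116.378248, 39.865275, 116.322287, 39.893729);
        System.out.println(meters + " m, " + meters / 1000 + " km");
    }
}
```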
1.2 Import store data to GEO
The shop data lives in the tb_shop table.
When importing data into GEO, we do not import every column. We only need the longitude, the latitude, and the shop id, with the shop id acting as the member in the GEO commands.
There is one restriction when searching: results must be filtered by shop type. Since the shop type is not stored in GEO, we cannot filter on it directly.
To solve this problem, we can do the following:
Group the shops by type and store each group in its own GEO collection, using the typeId as part of the key.
@Resource
private StringRedisTemplate stringRedisTemplate;

@Test
void loadShopData() {
    // 1. Query all shops (shopService is assumed to be injected elsewhere in this test class)
    List<Shop> list = shopService.list();
    // 2. Group the shops by typeId; shops with the same typeId go into the same list
    Map<Long, List<Shop>> map = list.stream().collect(Collectors.groupingBy(Shop::getTypeId));
    // 3. Write to Redis one group at a time
    for (Map.Entry<Long, List<Shop>> entry : map.entrySet()) {
        // 3.1 Get the type id
        Long typeId = entry.getKey();
        String key = "shop:geo:" + typeId;
        // 3.2 Get the shops of this type
        List<Shop> value = entry.getValue();
        // 3.3 Write to Redis: GEOADD key longitude latitude member
        // Option 1 (not used, one request per shop is inefficient):
        // for (Shop shop : value) {
        //     // Coordinates can be passed one by one or wrapped in a Point
        //     stringRedisTemplate.opsForGeo().add(key, new Point(shop.getX(), shop.getY()), shop.getId().toString());
        // }
        // Option 2: collect all locations and add them in a single call
        List<RedisGeoCommands.GeoLocation<String>> locations = new ArrayList<>(value.size());
        for (Shop shop : value) {
            // The generic type parameter is the type of the member
            locations.add(new RedisGeoCommands.GeoLocation<>(
                    shop.getId().toString(),
                    new Point(shop.getX(), shop.getY())
            ));
        }
        // Batch write
        stringRedisTemplate.opsForGeo().add(key, locations);
    }
}
1.3 Implement the nearby shops feature
The Spring Boot version we use is not the latest, so the corresponding Spring Data Redis version is not the latest either.
Spring Data Redis 2.3.9 does not support the GEOSEARCH command introduced in Redis 6.2, so we need to upgrade it by modifying the pom file.
A plugin such as Dependency Analyzer can be installed to inspect the dependency versions.
<!-- Exclude the bundled versions, then declare newer ones explicitly -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.data</groupId>
            <artifactId>spring-data-redis</artifactId>
        </exclusion>
        <exclusion>
            <groupId>io.lettuce</groupId>
            <artifactId>lettuce-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-redis</artifactId>
    <!-- 2.6.2 also works -->
    <version>2.7.11</version>
</dependency>
<dependency>
    <groupId>io.lettuce</groupId>
    <artifactId>lettuce-core</artifactId>
    <version>6.1.10.RELEASE</version>
</dependency>
Controller layer
/**
 * Query shops by type, with paging
 * @param typeId  shop type
 * @param current page number
 * @param x       longitude of the user (optional)
 * @param y       latitude of the user (optional)
 * @return list of shops
 */
@GetMapping("/of/type")
public Result queryShopByType(
        @RequestParam("typeId") Integer typeId,
        @RequestParam(value = "current", defaultValue = "1") Integer current,
        @RequestParam(value = "x", required = false) Double x,
        @RequestParam(value = "y", required = false) Double y) {
    return shopService.queryShopByType(typeId, current, x, y);
}
Service layer
@Override
public Result queryShopByType(Integer typeId, Integer current, Double x, Double y) {
    // 1. Check whether we need to query by coordinates
    if (x == null || y == null) {
        // No coordinates: run a plain paged database query
        Page<Shop> page = query()
                .eq("type_id", typeId)
                // SystemConstants.DEFAULT_PAGE_SIZE == 5
                .page(new Page<>(current, SystemConstants.DEFAULT_PAGE_SIZE));
        return Result.ok(page.getRecords());
    }
    // 2. Compute the paging parameters
    // Index of the first item on this page
    int from = (current - 1) * SystemConstants.DEFAULT_PAGE_SIZE;
    // Index just past the last item on this page
    int end = current * SystemConstants.DEFAULT_PAGE_SIZE;
    // 3. Query Redis, sorted by distance and paged. Result: shopId and distance
    String key = "shop:geo:" + typeId;
    GeoResults<RedisGeoCommands.GeoLocation<String>> results = stringRedisTemplate.opsForGeo()
            // GEOSEARCH key FROMLONLAT x y BYRADIUS 5000 m WITHDIST COUNT end
            // Arguments: the key, the center point, and the radius (here 5000 meters)
            .search(key, GeoReference.fromCoordinate(x, y), new Distance(5000),
                    RedisGeoCommands.GeoSearchCommandArgs.newGeoSearchArgs()
                            // Equivalent to WITHDIST
                            .includeDistance()
                            // Return the first item through the end-th item
                            .limit(end));
    // 4. Extract the shop ids
    if (results == null) {
        return Result.ok();
    }
    List<GeoResult<RedisGeoCommands.GeoLocation<String>>> list = results.getContent();
    if (list.size() <= from) {
        // Nothing left for this page. Without this check, skip(from) would leave ids empty
        // and the SQL below (IN () and FIELD(id,)) would be invalid
        return Result.ok(Collections.emptyList());
    }
    // 4.1 Keep only the items from from to end
    List<Long> ids = new ArrayList<>(list.size());
    Map<String, Distance> distanceMap = new HashMap<>(list.size());
    list.stream().skip(from).forEach(result -> {
        // 4.2 Get the shop id
        String shopIdStr = result.getContent().getName();
        ids.add(Long.valueOf(shopIdStr));
        // 4.3 Get the distance
        Distance distance = result.getDistance();
        distanceMap.put(shopIdStr, distance);
    });
    // 5. Query the shops by id, preserving the distance order
    String idStr = StrUtil.join(",", ids);
    List<Shop> shops = query().in("id", ids)
            .last("order by FIELD(id," + idStr + ")")
            .list();
    for (Shop shop : shops) {
        shop.setDistance(distanceMap.get(shop.getId().toString()).getValue());
    }
    // 6. Return the result
    return Result.ok(shops);
}
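The paging step relies on Redis returning only the nearest `end` items, after which the application skips the first `from` of them. Here is a minimal in-memory sketch of that slice, assuming the input list is already sorted nearest first. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

final class GeoPageSketch {
    /**
     * Returns the items of the given 1-based page.
     * limit(end) mirrors the COUNT sent to Redis; skip(from) drops earlier pages.
     */
    static <T> List<T> page(List<T> nearestFirst, int current, int pageSize) {
        int from = (current - 1) * pageSize;
        int end = current * pageSize;
        return nearestFirst.stream()
                .limit(end)
                .skip(from)
                .collect(Collectors.toList());
    }
}
```

With a page size of 5 and seven matching shops, page 2 contains the sixth and seventh shops, and page 3 is empty. The empty case is the one the service has to return early on, before building the SQL.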
2. User sign-in
2.1 BitMap
Suppose we store user sign-in information in a database table, one row per check-in.
With 10 million users and an average of 10 check-ins per person per year, this table grows by 100 million rows a year, which is a very large volume.
Instead, we can record check-ins month by month: a day with a check-in is recorded as 1 and a day without one as 0. Starting from the first day of the month, we write the 0s and 1s in order, so a whole month of sign-in status is a single string of binary digits.
Each bit corresponds to one day of the month, forming a mapping.
Using 0 and 1 bits to represent business state in this way is called a bitmap (BitMap).
In Redis, a BitMap is implemented on top of the String type, so its maximum size is 512 MB, which is 2^32 bits.
The BitMap commands are:
- SETBIT: Store a 0 or 1 at the specified position (offset, starting from 0)
  Unset bits default to 0, so we only need to write a 1 on sign-in and nothing otherwise
- GETBIT: Get the bit value at the specified position (offset)
- BITCOUNT: Count the number of bits set to 1 in the BitMap
- BITFIELD: Operate on (query, modify, increment) the value at the specified position (offset) in the BitMap's bit array
  In GET type offset, the type says how many bits to read and whether to interpret them as signed or unsigned, because the result is returned as a decimal number
  If signed, the first bit is the sign bit
  Signed: GET i<bits>
  Unsigned: GET u<bits>
  For example, to read two bits as an unsigned number starting at offset 0: GET u2 0
- BITFIELD_RO: Read-only variant of BITFIELD; reads values from the bit array and returns them in decimal form
- BITOP: Perform bitwise operations (AND, OR, XOR, NOT) on multiple BitMaps
- BITPOS: Find the position of the first 0 or 1 within the specified range of the bit array
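The semantics of these commands can be tried out without a Redis server. The sketch below emulates SETBIT, GETBIT, BITCOUNT, and BITFIELD GET on a plain byte array, assuming the Redis convention that offset 0 is the most significant bit of the first byte. The class and its method names are hypothetical and only meant to illustrate the bit layout.

```java
final class BitmapSketch {
    private final byte[] bytes;

    BitmapSketch(int byteLength) {
        this.bytes = new byte[byteLength];
    }

    /** SETBIT-like: offset 0 is the most significant bit of the first byte. */
    void setBit(int offset, boolean value) {
        int mask = 0x80 >>> (offset % 8);
        if (value) {
            bytes[offset / 8] |= mask;
        } else {
            bytes[offset / 8] &= ~mask;
        }
    }

    /** GETBIT-like: unset bits read as false (0). */
    boolean getBit(int offset) {
        return (bytes[offset / 8] & (0x80 >>> (offset % 8))) != 0;
    }

    /** BITCOUNT-like: number of bits set to 1. */
    int bitCount() {
        int count = 0;
        for (byte b : bytes) {
            count += Integer.bitCount(b & 0xFF);
        }
        return count;
    }

    /** BITFIELD GET u<bits> <offset>-like: reads bits left to right as an unsigned number. */
    long getUnsigned(int bits, int offset) {
        long value = 0;
        for (int i = 0; i < bits; i++) {
            value = (value << 1) | (getBit(offset + i) ? 1 : 0);
        }
        return value;
    }

    /** BITFIELD GET i<bits> <offset>-like: the first bit read is the sign bit (two's complement). */
    long getSigned(int bits, int offset) {
        long unsigned = getUnsigned(bits, offset);
        return getBit(offset) ? unsigned - (1L << bits) : unsigned;
    }
}
```

For example, after setting offsets 0, 1, and 3, reading four unsigned bits from offset 0 gives binary 1101, which is 13. Reading the first two bits gives 3 as u2 but -1 as i2, which is why the signedness has to be specified.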
2.2 Check-in function
Requirement: Implement the sign-in interface and save the current user's sign-in for today to Redis
Tip: Because BitMap is built on the String data structure, its operations are exposed through the string-related operations (opsForValue)
Controller layer
@PostMapping("/sign")
public Result sign() {
    return userService.sign();
}
Service layer
// User sign-in
@Override
public Result sign() {
    // 1. Get the currently logged-in user
    Long userId = UserHolder.getUser().getId();
    // 2. Get the current date
    LocalDateTime now = LocalDateTime.now();
    // 3. Build the key: sign:<userId>:<yyyyMM>
    String keySuffix = now.format(DateTimeFormatter.ofPattern(":yyyyMM"));
    String key = "sign:" + userId + keySuffix;
    // 4. Get the day of the month; it determines which bit to set
    int dayOfMonth = now.getDayOfMonth();
    // 5. Write to Redis: SETBIT key offset 1 (offsets start at 0, so day n uses offset n - 1)
    stringRedisTemplate.opsForValue().setBit(key, dayOfMonth - 1, true);
    return Result.ok();
}
Bits are counted from left to right, so if today is the 9th, the ninth bit is set to 1.
2.3 Count continuous check-ins
2.3.1 Analysis
Counting the total number of check-ins is simple, but counting the number of consecutive check-ins as of today takes more work.
Consecutive sign-in days: starting from the most recent sign-in, count backward until the first day without a sign-in. The number of sign-ins counted is the number of consecutive sign-in days.
How do we get all the check-in data from the start of this month up to today?
BITFIELD key GET u[dayOfMonth] 0
This command returns all the bits from offset 0 up to today's bit as a single decimal number.
How do we traverse the bits one by one from back to front?
AND the number with 1 to get its last bit.
The result of the AND is 1 only if the last bit is 1.
Then shift the number one bit to the right, so the next bit becomes the last bit, and repeat.
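The AND-and-shift loop described above can be sketched as a small pure function, independent of Redis. The class and method names are hypothetical; the input is assumed to be the decimal number returned by BITFIELD, whose lowest bit represents today.

```java
final class SignStreakSketch {
    /** Counts consecutive 1 bits starting from the lowest bit (today) and moving backward. */
    static int consecutiveDays(long bits) {
        int count = 0;
        // (bits & 1) isolates the last bit; stop at the first 0
        while ((bits & 1) == 1) {
            count++;
            // Unsigned right shift drops the bit we just inspected
            bits >>>= 1;
        }
        return count;
    }
}
```

For the bit string 1110111 (seven days, with the fourth day missed), the function stops at the 0 and returns 3. If today's bit is 0, the streak is 0 regardless of earlier days.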
2.3.2 Code implementation
Requirement: Implement the following interface to count the current user's consecutive sign-in days this month, as of today
Controller layer
@GetMapping("/sign/count")
public Result signCount() {
    return userService.signCount();
}
Service layer
// Count consecutive sign-in days
@Override
public Result signCount() {
    // 1. Get the currently logged-in user
    Long userId = UserHolder.getUser().getId();
    // 2. Get the current date
    LocalDateTime now = LocalDateTime.now();
    // 3. Build the key
    String keySuffix = now.format(DateTimeFormatter.ofPattern(":yyyyMM"));
    String key = "sign:" + userId + keySuffix;
    // 4. Get the day of the month; it is the number of bits to read
    int dayOfMonth = now.getDayOfMonth();
    // 5. Get all sign-in records of this month up to today (returned as a decimal number)
    // BITFIELD can run several query, modify, and increment subcommands at once,
    // so the return value is a list with one entry per subcommand
    List<Long> result = stringRedisTemplate.opsForValue().bitField(key,
            // BitFieldSubCommands.create() builds the subcommand list
            BitFieldSubCommands.create()
                    // Unsigned; dayOfMonth is the number of bits to read
                    .get(BitFieldSubCommands.BitFieldType.unsigned(dayOfMonth))
                    // Start reading at offset 0
                    .valueAt(0)
    );
    if (result == null || result.isEmpty()) {
        return Result.ok(0);
    }
    // We only ran one GET, so the list has a single element
    Long num = result.get(0);
    if (num == null || num == 0) {
        return Result.ok(0);
    }
    // 6. Walk through the bits from the last one backward
    int count = 0;
    while (true) {
        // 6.1 AND the number with 1 to get its last bit
        // 6.2 Check whether that bit is 0
        if ((num & 1) == 0) {
            // 6.3 If it is 0, the user did not sign in that day: stop
            break;
        } else {
            // 6.4 If it is 1, the user signed in: increment the counter
            count++;
        }
        // 6.5 Shift right by one bit to drop the last bit and move on to the next one
        num >>>= 1;
    }
    return Result.ok(count);
}
3. UV statistics
First, two concepts:
UV: Short for Unique Visitor, the number of distinct people who visit the site over the Internet. If the same user visits the website multiple times in one day, they are counted only once.
PV: Short for Page View, also called page views or clicks. Every time a user visits a page of the website, one PV is recorded, so opening the page multiple times records multiple PVs. It is often used to measure website traffic.
PV is usually much larger than UV.
The PV/UV ratio indicates how sticky the website is for its users.
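As a small illustration of the two metrics, the sketch below computes exact PV and UV from one day's visit log held in memory. The class name, method names, and the idea of a log as a list of user ids are assumptions for this example.

```java
import java.util.HashSet;
import java.util.List;

final class PvUvSketch {
    /** PV: every visit counts, including repeat visits by the same user. */
    static long pageViews(List<String> visitorIds) {
        return visitorIds.size();
    }

    /** UV: each distinct user counts once, which requires remembering every user seen. */
    static long uniqueVisitors(List<String> visitorIds) {
        return new HashSet<>(visitorIds).size();
    }
}
```

For the log u1, u2, u1, u3, u1 this gives PV = 5 and UV = 3. Note that the exact UV count keeps one set entry per distinct user, so its memory use grows with the number of visitors.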
UV statistics are more troublesome to do on the server side: to decide whether a user has already been counted, we have to keep the information of every user counted so far. If every visiting user is saved in Redis, the data volume becomes enormous.
A better fit is HyperLogLog.
3.1 HyperLogLog usage
HyperLogLog (HLL) is a probabilistic algorithm derived from the LogLog algorithm. It estimates the cardinality of very large sets without storing all their values.
For the algorithm's principles, see: https://juejin.cn/post/6844903785744056333#heading-0
The HLL in Redis is implemented on top of the string structure. A single HLL always uses less than 16 KB of memory (the Redis documentation gives an upper bound of 12 KB), which is extremely low.
As a tradeoff, its counts are probabilistic, with a standard error of about 0.81%. For UV statistics, this is negligible.
HyperLogLog has three commands:
- **PFADD key element [element ...]**: insert elements
  No matter how many times the same element is added, it is only counted once, which makes HLL a natural fit for UV statistics
  For example, insert five elements:
  PFADD hl1 e1 e2 e3 e4 e5
- **PFCOUNT key [key ...]**: return the estimated count
  PFCOUNT hl1
- **PFMERGE destkey sourcekey [sourcekey ...]**: merge several HyperLogLogs into destkey
3.2 Testing with one million records
Use a unit test to send 1 million records to a HyperLogLog and check the memory usage and the accuracy of the count:
@Test
void testHyperLogLog() {
    String[] values = new String[1000];
    int j = 0;
    for (int i = 0; i < 1000000; i++) {
        j = i % 1000;
        values[j] = "user_" + i;
        if (j == 999) {
            // The batch of 1000 is full: send it to Redis
            stringRedisTemplate.opsForHyperLogLog().add("hl2", values);
        }
    }
    // Get the estimated count
    Long count = stringRedisTemplate.opsForHyperLogLog().size("hl2");
    System.out.println("count = " + count);
}
The returned count is 997593 rather than 1,000,000, an error of about 0.24%, which is acceptable.
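The size of that error can be checked with a one-line calculation. The helper below is a hypothetical utility, not part of the test above; it computes the relative error of an estimate against the known true count.

```java
final class HllErrorSketch {
    /** Relative error of an estimate: |estimated - actual| / actual. */
    static double relativeError(long estimated, long actual) {
        return Math.abs(estimated - actual) / (double) actual;
    }

    public static void main(String[] args) {
        // 997593 reported for 1,000,000 distinct users: 2407 / 1000000
        System.out.println(relativeError(997593L, 1_000_000L));
    }
}
```

Here the difference is 2407 out of 1,000,000, so the relative error is roughly 0.0024, or about 0.24%.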