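Requirements background
Suppose there is a dynamic ID filtering rule. The rule contains multiple IDs to filter (separated by commas) as well as ID ranges (written with a hyphen, -);
Implement a global filter that decides whether an incoming ID matches the filtering rule;
IDs are of type long, while the rule returned by the configuration center is a string.
Rule example:
1,3,5,7-10
This rule means the IDs 1, 3, 5, 7, 8, 9, 10 must be filtered.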
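To make the rule semantics concrete, here is a minimal sketch that expands a rule string into the explicit set of IDs it covers. The names `RuleExpansionSketch` and `expand` are hypothetical and do not come from the original code. The sketch assumes well-formed input with small ranges; enumerating IDs is for illustration only, since real ranges could be far too large to expand.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the original filter.
class RuleExpansionSketch {

    // Expands a rule such as "1,3,5,7-10" into every ID it covers.
    // Assumes well-formed input: numeric tokens, floor <= ceiling, small ranges.
    static Set<Long> expand(String rule) {
        Set<Long> ids = new TreeSet<>();
        for (String token : rule.replaceAll("\\s", "").split(",")) {
            if (token.isEmpty()) {
                continue; // a blank rule yields a single empty token
            }
            if (token.contains("-")) {
                String[] bounds = token.split("-", 2);
                long floor = Long.parseLong(bounds[0]);
                long ceiling = Long.parseLong(bounds[1]);
                for (long id = floor; id <= ceiling; id++) {
                    ids.add(id);
                }
            } else {
                ids.add(Long.parseLong(token));
            }
        }
        return ids;
    }
}
```

Calling `RuleExpansionSketch.expand("1,3,5,7-10")` returns a set containing 1, 3, 5, 7, 8, 9, 10, matching the example above.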
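Implementation
At first glance the requirement looks simple, and the implementation below is equally direct:
- Read the configuration
- Parse the configuration
- Evaluate the rule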
// Configuration center client, used to read the rule configuration
private Config config = ConfigService.getConfig();

/**
 * Main filter method
 */
public boolean isMatch(long id) {
    if (id <= 0)
        return false;
    String configStr = config.getProperty("idRule", "");
    if (StringUtils.isBlank(configStr))
        return false;
    // Strip whitespace, split on commas, then test each token against the ID
    return Stream.of(configStr.replaceAll("\\s", "").split(","))
            .anyMatch(range -> {
                if (range.contains("-")) {
                    // Range token such as "7-10": match if floor <= id <= ceiling
                    String[] boundaries = range.split("-", 2);
                    if (NumberUtils.isDigits(boundaries[0]) && NumberUtils.isDigits(boundaries[1])) {
                        long floor = NumberUtils.toLong(boundaries[0]);
                        long ceiling = NumberUtils.toLong(boundaries[1]);
                        return floor <= id && id <= ceiling;
                    }
                }
                // Single-ID token: compare as strings
                return range.equals(String.valueOf(id));
            });
}
Benchmark the implementation above. The test case mocks the following rule: "1, 3, 5, 7, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100-120, 200-220, 300-320"
@BenchmarkMode(Mode.SampleTime)
@Warmup(iterations = 2, time = 2)
@Measurement(iterations = 5, time = 5)
@Threads(1)
@Fork(1)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Benchmark
public void testMatchByStr() {
    isMatch(RandomUtils.nextInt(1, 10000));
}
Test results
Benchmark Mode Cnt Score Error Units
BenchmarkTest.testMatchByStr sample 867795 1843.805 ± 21.407 ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.00 sample 1000.000 ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.50 sample 1800.000 ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.90 sample 1800.000 ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.95 sample 1900.000 ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.99 sample 2900.000 ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.999 sample 4696.000 ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.9999 sample 24021.158 ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p1.00 sample 1243136.000 ns/op
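The results look decent: 99.99% of calls finish within roughly 0.024 ms (24,021 ns).
Since this is a global filter that handles every request on the site, and the business is highly concurrent, the computation should be as fast as conditions allow. That raises the question: is there a better implementation? The computation is already simple, and there seems to be nothing left to cut. On closer inspection, though, the string parsing can be done ahead of time rather than on every filter check. The following attempt separates the two: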
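- At initialization, parse the configuration into a Set + Map and listen for configuration changes
- The filter evaluates the rule against the already parsed collections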
// In practice this relies on Apollo's configuration change event to convert the string into collections ahead of time.
// volatile: the change listener may run on a different thread from the request threads.
private volatile Set<Long> configArray = new HashSet<>();
private volatile Map<Long, Long> configRange = new HashMap<>();

/**
 * Read the configuration at initialization, listen for changes, and parse it ahead of time
 */
public void init() {
    String configStr = config.getProperty("idRule", "");
    resolveConfig(configStr);
    config.addChangeListener(configChangeEvent -> {
        if (configChangeEvent.isChanged("idRule")) {
            resolveConfig(config.getProperty("idRule", ""));
        }
    });
}

/**
 * Parse the Apollo configuration
 */
private void resolveConfig(String configStr) {
    // Build into temporary collections, then swap them in once parsing is done
    Set<Long> tmpConfigArray = new HashSet<>();
    Map<Long, Long> tmpConfigRange = new HashMap<>();
    if (StringUtils.isNotBlank(configStr)) {
        Stream.of(configStr.replaceAll("\\s", "").split(","))
                .forEach(range -> {
                    if (range.contains("-")) {
                        String[] boundaries = range.split("-", 2);
                        if (NumberUtils.isDigits(boundaries[0]) && NumberUtils.isDigits(boundaries[1])) {
                            long floor = NumberUtils.toLong(boundaries[0]);
                            long ceiling = NumberUtils.toLong(boundaries[1]);
                            // Keyed by lower bound: a later range with the same floor replaces the earlier one
                            tmpConfigRange.put(floor, ceiling);
                        }
                    } else {
                        // NumberUtils.toLong returns 0 for non-numeric tokens; 0 never matches because isMatch rejects id <= 0
                        tmpConfigArray.add(NumberUtils.toLong(range));
                    }
                });
    }
    this.configArray = tmpConfigArray;
    this.configRange = tmpConfigRange;
}

/**
 * Main filter method
 */
public boolean isMatch(long id) {
    if (id <= 0)
        return false;
    if (configArray.size() == 0 && configRange.size() == 0) {
        return false;
    }
    if (configArray.size() > 0) {
        if (configArray.contains(id))
            return true;
    }
    if (configRange.size() > 0) {
        return configRange.entrySet().stream()
                .anyMatch(entry -> entry.getKey() <= id && id <= entry.getValue());
    }
    return false;
}
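One thing the code above leaves open is how a configuration change becomes visible to request threads: the listener usually runs on a different thread, and the two collections are replaced by two separate assignments, so a request could briefly see the new set together with the old ranges. A minimal sketch of one way to handle this is to bundle both collections into an immutable snapshot and publish it through a single `volatile` reference. The names `IdRuleSnapshot`, `IdRuleHolder`, `parse`, and `reload` are hypothetical, and the sketch assumes nothing about Apollo: the listener would simply call `reload` with the new string. Ranges are kept as an array of {floor, ceiling} pairs, so two ranges sharing a lower bound do not overwrite each other.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical immutable view of one parsed rule; never mutated after construction.
final class IdRuleSnapshot {
    static final IdRuleSnapshot EMPTY = new IdRuleSnapshot(new HashSet<>(), new long[0][]);

    private final Set<Long> singles;
    private final long[][] ranges; // each element is {floor, ceiling}

    private IdRuleSnapshot(Set<Long> singles, long[][] ranges) {
        this.singles = singles;
        this.ranges = ranges;
    }

    // Parses a rule such as "1,3,5,7-10"; empty or non-numeric tokens are skipped.
    static IdRuleSnapshot parse(String rule) {
        Set<Long> singles = new HashSet<>();
        List<long[]> ranges = new ArrayList<>();
        if (rule != null) {
            for (String token : rule.replaceAll("\\s", "").split(",")) {
                try {
                    int dash = token.indexOf('-');
                    if (dash < 0) {
                        singles.add(Long.parseLong(token));
                    } else {
                        long floor = Long.parseLong(token.substring(0, dash));
                        long ceiling = Long.parseLong(token.substring(dash + 1));
                        ranges.add(new long[] {floor, ceiling});
                    }
                } catch (NumberFormatException ignored) {
                    // malformed token: ignore it and keep the rest of the rule
                }
            }
        }
        return new IdRuleSnapshot(singles, ranges.toArray(new long[0][]));
    }

    boolean isMatch(long id) {
        if (id <= 0) {
            return false;
        }
        if (singles.contains(id)) {
            return true;
        }
        for (long[] range : ranges) {
            if (range[0] <= id && id <= range[1]) {
                return true;
            }
        }
        return false;
    }
}

// Hypothetical holder: one volatile write publishes singles and ranges together.
final class IdRuleHolder {
    private volatile IdRuleSnapshot current = IdRuleSnapshot.EMPTY;

    // Intended to be called from init() and from the config change listener.
    void reload(String rule) {
        current = IdRuleSnapshot.parse(rule);
    }

    boolean isMatch(long id) {
        return current.isMatch(id); // a single read, so one call sees one consistent snapshot
    }
}
```

Whether this is worth adopting depends on how often the rule changes and how much a momentarily inconsistent view matters; the request-path cost stays a hash lookup plus a short loop over the ranges.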
Benchmark the implementation above, mocking the same rule: "1, 3, 5, 7, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100-120, 200-220, 300-320"
Benchmark configuration
@BenchmarkMode(Mode.SampleTime)
@Warmup(iterations = 2, time = 2)
@Measurement(iterations = 5, time = 5)
@Threads(1)
@Fork(1)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Benchmark
public void testMatchByList() {
    isMatch(RandomUtils.nextInt(1, 10000));
}
Test results
Benchmark Mode Cnt Score Error Units
BenchmarkTest.testMatchByList sample 581645 67.503 ± 5.858 ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.00 sample ≈ 0 ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.50 sample 100.000 ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.90 sample 100.000 ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.95 sample 100.000 ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.99 sample 100.000 ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.999 sample 300.000 ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.9999 sample 3283.540 ns/op
BenchmarkTest.testMatchByList:testMatchByList·p1.00 sample 815104.000 ns/op
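Judging by the benchmark, the mean time dropped from 1843.805 ns/op to 67.503 ns/op, roughly 27 times faster (about 18 times at the median, 1800 ns versus 100 ns). The original cost was already low, and the saving is only about 1,800 ns/op, but as the saying goes, a journey of a thousand miles is built from small steps (不积跬步,无以至千里). If every piece of business code is written with a little thought about saving a bit of work, the gain for a highly concurrent system is well worth having.
Summary: in a highly concurrent system, keep the computation on the user request path as lean as possible, while still weighing the effort against the payoff.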