A Performance Optimization Exercise for a Global ID Filter

Background

Suppose there is a dynamic ID filtering rule. The rule contains multiple IDs to be filtered (separated by commas) as well as ID ranges (a hyphen denotes a range).
Implement a global filter that decides whether an incoming ID matches the rule.
The ID is of type long; the rule returned by the configuration center is a string.

Rule example:
1,3,5,7-10

This rule means that IDs 1, 3, 5, 7, 8, 9, and 10 should all be filtered.
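To make the rule semantics concrete, here is a minimal, self-contained sketch (the class and method names are illustrative, not part of the actual filter) that parses the example rule with plain JDK calls and lists every ID it covers:

```java
import java.util.Arrays;
import java.util.stream.LongStream;

public class RuleDemo {
    // Returns true if id is covered by the comma-separated rule,
    // where "a-b" denotes the inclusive range [a, b].
    public static boolean matches(String rule, long id) {
        for (String part : rule.replaceAll("\\s", "").split(",")) {
            if (part.contains("-")) {
                String[] bounds = part.split("-", 2);
                long floor = Long.parseLong(bounds[0]);
                long ceiling = Long.parseLong(bounds[1]);
                if (floor <= id && id <= ceiling) return true;
            } else if (part.equals(String.valueOf(id))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] covered = LongStream.rangeClosed(1, 12)
                .filter(id -> matches("1,3,5,7-10", id))
                .toArray();
        // Prints [1, 3, 5, 7, 8, 9, 10]
        System.out.println(Arrays.toString(covered));
    }
}
```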

Implementation

At first glance the requirement looks simple, and the implementation below is equally direct:

  1. Read the configuration
  2. Parse the configuration
  3. Evaluate the rule
// Configuration center client, used to read the rule configuration
private Config config = ConfigService.getConfig();

/**
 * Main filter method
 */
public boolean isMatch(long id) {
    if (id <= 0)
        return false;

    String configStr = config.getProperty("idRule", "");
    if (StringUtils.isBlank(configStr))
        return false;

    return Stream.of(configStr.replaceAll("\\s", "").split(","))
            .anyMatch(range -> {
                if (range.contains("-")) {
                    String[] boundaries = range.split("-", 2);
                    if (NumberUtils.isDigits(boundaries[0]) && NumberUtils.isDigits(boundaries[1])) {
                        long floor = NumberUtils.toLong(boundaries[0]);
                        long ceiling = NumberUtils.toLong(boundaries[1]);
                        return floor <= id && id <= ceiling;
                    }
                }
                return range.equals(String.valueOf(id));
            });
}

Benchmark the implementation above; the mocked rule for the test case is: “1, 3, 5, 7, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100-120, 200-220, 300-320”

    @BenchmarkMode(Mode.SampleTime)
    @Warmup(iterations = 2, time = 2)
    @Measurement(iterations = 5, time = 5)
    @Threads(1)
    @Fork(1)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @Benchmark
    public void testMatchByStr() {
        isMatch(RandomUtils.nextInt(1, 10000));
    }

Test results

Benchmark                                            Mode     Cnt        Score    Error  Units
BenchmarkTest.testMatchByStr                         sample  867795     1843.805 ± 21.407  ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.00    sample             1000.000           ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.50    sample             1800.000           ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.90    sample             1800.000           ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.95    sample             1900.000           ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.99    sample             2900.000           ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.999   sample             4696.000           ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p0.9999  sample            24021.158           ns/op
BenchmarkTest.testMatchByStr:testMatchByStr·p1.00    sample          1243136.000           ns/op

The results look decent: 99.99% of calls complete within 0.02 ms.

But this is a global filter that sits on every request across the site, in a high-concurrency business, so the computation should be as fast as the constraints allow. That raises the question: is there a better implementation? The matching logic itself is already simple, with seemingly nothing left to trim. Looking closer, though, the string-parsing work does not have to happen on every filter call; it can be done ahead of time. The next version separates the two:

  1. Parse the configuration into a Set + Map at initialization time, and listen for configuration changes
  2. Have the filter evaluate the rule against the pre-parsed collections
// Uses Apollo's configuration-change listener to convert the rule string into collections ahead of time
private Set<Long> configArray = new HashSet<>();
private Map<Long, Long> configRange = new HashMap<>();

/**
 * Read the configuration at initialization, parse it up front, and listen for changes
 */
public void init() {
    String configStr = config.getProperty("idRule", "");
    resolveConfig(configStr);

    config.addChangeListener(configChangeEvent -> {
        if (configChangeEvent.isChanged("idRule")) {
            resolveConfig(config.getProperty("idRule", ""));
        }
    });
}

/**
 * Parse the Apollo configuration
 */
private void resolveConfig(String configStr) {
    Set<Long> tmpConfigArray = new HashSet<>();
    Map<Long, Long> tmpConfigRange = new HashMap<>();

    if (StringUtils.isNotBlank(configStr)) {
        Stream.of(configStr.replaceAll("\\s", "").split(","))
                .forEach(range -> {
                    if (range.contains("-")) {
                        String[] boundaries = range.split("-", 2);
                        if (NumberUtils.isDigits(boundaries[0]) && NumberUtils.isDigits(boundaries[1])) {
                            long floor = NumberUtils.toLong(boundaries[0]);
                            long ceiling = NumberUtils.toLong(boundaries[1]);
                            tmpConfigRange.put(floor, ceiling);
                        }
                    } else if (NumberUtils.isDigits(range)) {
                        tmpConfigArray.add(NumberUtils.toLong(range));
                    }
                });
    }

    // Swap in the freshly parsed collections via reference assignment
    this.configArray = tmpConfigArray;
    this.configRange = tmpConfigRange;
}

/**
 * Main filter method
 */
public boolean isMatch(long id) {
    if (id <= 0)
        return false;

    if (configArray.isEmpty() && configRange.isEmpty())
        return false;

    if (configArray.contains(id))
        return true;

    // Linear scan over the configured ranges
    return configRange.entrySet().stream()
            .anyMatch(entry -> entry.getKey() <= id && id <= entry.getValue());
}
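The pre-parsed version still scans every configured range linearly on each miss. If the ranges are known not to overlap, one further refinement (a sketch of my own, not from the original post) is to key them by lower bound in a TreeMap and resolve a lookup with floorEntry in O(log n):

```java
import java.util.Map;
import java.util.TreeMap;

public class RangeLookup {
    // Ranges keyed by their lower bound; assumes configured ranges never overlap.
    private final TreeMap<Long, Long> ranges = new TreeMap<>();

    public void addRange(long floor, long ceiling) {
        ranges.put(floor, ceiling);
    }

    // O(log n): find the range with the greatest lower bound <= id,
    // then check whether id is still under that range's upper bound.
    public boolean covers(long id) {
        Map.Entry<Long, Long> entry = ranges.floorEntry(id);
        return entry != null && id <= entry.getValue();
    }

    public static void main(String[] args) {
        RangeLookup lookup = new RangeLookup();
        lookup.addRange(100, 120);
        lookup.addRange(200, 220);
        // Prints true false
        System.out.println(lookup.covers(115) + " " + lookup.covers(150));
    }
}
```

With only 17 entries in the mocked rule the linear scan is already cheap, so this only pays off as the number of ranges grows; the HashSet branch for single IDs stays unchanged.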

Benchmark this implementation with the same mocked rule: “1, 3, 5, 7, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100-120, 200-220, 300-320”

Benchmark code

    @BenchmarkMode(Mode.SampleTime)
    @Warmup(iterations = 2, time = 2)
    @Measurement(iterations = 5, time = 5)
    @Threads(1)
    @Fork(1)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @Benchmark
    public void testMatchByList() {
        isMatch(RandomUtils.nextInt(1, 10000));
    }

Test results

Benchmark                                              Mode     Cnt       Score   Error  Units
BenchmarkTest.testMatchByList                          sample  581645      67.503 ± 5.858  ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.00    sample                 ≈ 0          ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.50    sample             100.000          ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.90    sample             100.000          ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.95    sample             100.000          ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.99    sample             100.000          ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.999   sample             300.000          ns/op
BenchmarkTest.testMatchByList:testMatchByList·p0.9999  sample            3283.540          ns/op
BenchmarkTest.testMatchByList:testMatchByList·p1.00    sample          815104.000          ns/op

The benchmark shows the pre-parsed version is roughly 27x faster, dropping from about 1,844 ns/op to about 68 ns/op. Even at these already-low latencies, that is only about 1,800 ns saved per call, but small steps add up: if every piece of business code shaves off a little at development time, the cumulative gain for a high-concurrency system is well worth having.

Summary: in a high-concurrency system, keep the computation on the user request path as lean as possible, while of course weighing the return against the effort.


Reposted from blog.csdn.net/weixin_43983762/article/details/105737496