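Recently a project of ours needed rate limiting at the Zuul gateway. After looking at several approaches, the best fit seemed to be distributed rate limiting backed by Redis: Redis is fast, and because it executes commands on a single thread, a single command such as INCR is atomic, which takes care of concurrent updates to the counter. The open-source Spring Cloud Zuul RateLimit project on GitHub covers exactly these needs: https://github.com/marcosbarbero/spring-cloud-zuul-ratelimit, so we introduced it into the project. During load testing, however, its performance turned out to be disappointing, so the only option was to find the cause and fix or optimize it.
Pulling the source and reading it is the most direct way. To get to the point: when Redis is used for rate limiting, the class that matters is RedisRateLimiter, which extends the abstract class AbstractCacheRateLimiter. Here is the code: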
public abstract class AbstractCacheRateLimiter implements RateLimiter {

    // Note: consume is synchronized, so within one JVM only one request
    // at a time passes through the limiter.
    @Override
    public synchronized Rate consume(Policy policy, String key, Long requestTime) {
        final Long refreshInterval = policy.getRefreshInterval();
        // The quota is configured in seconds and tracked in milliseconds.
        final Long quota = policy.getQuota() != null ? SECONDS.toMillis(policy.getQuota()) : null;
        final Rate rate = new Rate(key, policy.getLimit(), quota, null, null);
        calcRemainingLimit(policy.getLimit(), refreshInterval, requestTime, key, rate);
        calcRemainingQuota(quota, refreshInterval, requestTime, key, rate);
        return rate;
    }

    protected abstract void calcRemainingLimit(Long limit, Long refreshInterval, Long requestTime, String key, Rate rate);

    protected abstract void calcRemainingQuota(Long quota, Long refreshInterval, Long requestTime, String key, Rate rate);
}
RedisRateLimiter
public class RedisRateLimiter extends AbstractCacheRateLimiter {

    private final RateLimiterErrorHandler rateLimiterErrorHandler;
    private final RedisTemplate redisTemplate;

    @Override
    protected void calcRemainingLimit(Long limit, Long refreshInterval,
                                      Long requestTime, String key, Rate rate) {
        long begTime = System.currentTimeMillis();
        if (limit != null) {
            handleExpiration(key, refreshInterval, rate);
            long usage = requestTime == null ? 1L : 0L;
            Long current = 0L;
            try {
                current = redisTemplate.opsForValue().increment(key, usage);
                // current = this.redisTemplate.boundValueOps(key).increment(usage);
                if (1 == current) {
                    this.redisTemplate.expire(key, refreshInterval, SECONDS);
                }
            } catch (RuntimeException e) {
                System.err.println("increment error ====>" + e.getMessage());
                String msg = "Failed retrieving rate for " + key + ", will return limit";
                rateLimiterErrorHandler.handleError(msg, e);
            }
            System.err.println("current===>" + current);
            rate.setRemaining(Math.max(-1, limit - current));
        }
        System.err.println("use time:" + (System.currentTimeMillis() - begTime) + "ms");
    }

    @Override
    protected void calcRemainingQuota(Long quota, Long refreshInterval,
                                      Long requestTime, String key, Rate rate) {
        if (quota != null) {
            String quotaKey = key + QUOTA_SUFFIX;
            handleExpiration(quotaKey, refreshInterval, rate);
            Long usage = requestTime != null ? requestTime : 0L;
            Long current = 0L;
            try {
                current = this.redisTemplate.boundValueOps(quotaKey).increment(usage);
            } catch (RuntimeException e) {
                String msg = "Failed retrieving rate for " + quotaKey + ", will return quota limit";
                rateLimiterErrorHandler.handleError(msg, e);
            }
            rate.setRemainingQuota(Math.max(-1, quota - current));
        }
    }

    private void handleExpiration(String key, Long refreshInterval, Rate rate) {
        Long expire = null;
        try {
            expire = this.redisTemplate.getExpire(key);
            if (expire == null || expire == -1) {
                this.redisTemplate.expire(key, refreshInterval, SECONDS);
                expire = refreshInterval;
            }
        } catch (RuntimeException e) {
            String msg = "Failed retrieving expiration for " + key + ", will reset now";
            rateLimiterErrorHandler.handleError(msg, e);
        }
        rate.setReset(SECONDS.toMillis(expire == null ? 0L : expire));
    }
}
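What we care about most is how calcRemainingLimit and calcRemainingQuota are implemented. Both of them call handleExpiration, whose job is to check the key's expiration time via redisTemplate.getExpire(key) and set one if none is present. This check runs on every rate-limited request, which means every request first queries the key's TTL in Redis and only then performs the increment. Could the expiration be set only on the first request instead? With that in mind, the code was changed as follows: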
protected void calcRemainingLimit(Long limit, Long refreshInterval,
                                  Long requestTime, String key, Rate rate) {
    if (limit != null) {
        // handleExpiration(key, refreshInterval, rate);
        long usage = requestTime == null ? 1L : 0L;
        Long current = 0L;
        try {
            current = redisTemplate.opsForValue().increment(key, usage);
            // current = this.redisTemplate.boundValueOps(key).increment(usage);
            if (1 == current) {
                // Set the expiration only on the first rate-limit check for this key
                this.redisTemplate.expire(key, refreshInterval, SECONDS);
            }
        } catch (RuntimeException e) {
            String msg = "Failed retrieving rate for " + key + ", will return limit";
            rateLimiterErrorHandler.handleError(msg, e);
        }
        rate.setRemaining(Math.max(-1, limit - current));
    }
}
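To make the saving concrete, here is a minimal sketch that counts Redis commands for the two flows. FakeRedis, RoundTripSketch and all of their methods are hypothetical stand-ins written for this post; they are not part of spring-cloud-zuul-ratelimit or Spring Data Redis. The fake never counts a TTL down or expires a key, and it assumes Redis reports -2 as the TTL of a missing key and -1 for a key without an expiry.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for Redis; every call counts as one network round trip. */
class FakeRedis {
    private final Map<String, Long> values = new HashMap<>();
    private final Map<String, Long> ttlSeconds = new HashMap<>();
    int commands = 0;

    /** Mimics TTL: -2 when the key is missing, -1 when it has no expiry. */
    long getExpire(String key) {
        commands++;
        if (!values.containsKey(key)) {
            return -2L;
        }
        return ttlSeconds.getOrDefault(key, -1L);
    }

    /** Mimics EXPIRE: only takes effect when the key exists. */
    boolean expire(String key, long seconds) {
        commands++;
        if (!values.containsKey(key)) {
            return false;
        }
        ttlSeconds.put(key, seconds);
        return true;
    }

    /** Mimics INCRBY: a missing key starts from 0. */
    long increment(String key, long delta) {
        commands++;
        return values.merge(key, delta, Long::sum);
    }
}

class RoundTripSketch {
    /** Original flow: query the TTL on every request, then increment. */
    static long consumeOriginal(FakeRedis redis, String key, long refreshInterval) {
        long expire = redis.getExpire(key);
        if (expire == -1L) {
            redis.expire(key, refreshInterval);
        }
        long current = redis.increment(key, 1L);
        if (current == 1L) {
            redis.expire(key, refreshInterval);
        }
        return current;
    }

    /** Optimized flow: increment, and set the TTL only when the counter was just created. */
    static long consumeOptimized(FakeRedis redis, String key, long refreshInterval) {
        long current = redis.increment(key, 1L);
        if (current == 1L) {
            redis.expire(key, refreshInterval);
        }
        return current;
    }

    /** Runs the given number of requests against one key and returns the command count. */
    static int commandsFor(boolean optimized, int requests) {
        FakeRedis redis = new FakeRedis();
        for (int i = 0; i < requests; i++) {
            if (optimized) {
                consumeOptimized(redis, "k", 10L);
            } else {
                consumeOriginal(redis, "k", 10L);
            }
        }
        return redis.commands;
    }

    public static void main(String[] args) {
        System.out.println("original:  " + commandsFor(false, 100));
        System.out.println("optimized: " + commandsFor(true, 100));
    }
}
```

Under these assumptions, 100 requests against one key cost 201 commands in the original flow and 101 in the optimized one, so the steady state drops from two round trips per request to one. Two trade-offs are worth keeping in mind: handleExpiration was also the place where rate.setReset(...) was populated, so the optimized method no longer sets the reset time; and if the expire call fails after the increment has succeeded, the key is left without a TTL until something else sets one.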
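Load tests were run before and after the optimization. The first group of results below is from before the change.
Test environment: macOS, 4 cores / 8 GB RAM
Redis: an Alibaba Cloud Redis instance (so network overhead is included)
Load-testing tool: siege, which is convenient for this kind of test; the results are for reference only
Configuration: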
default-policy:
  limit: 100
  refresh-interval: 10
siege -c 200 -r 1 http://localhost:8088/user-service/user/say
Results ===>
Transactions: 200 hits
Availability: 100.00 %
Elapsed time: 8.27 secs
Data transferred: 0.01 MB
Response time: 4.52 secs
Transaction rate: 24.18 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 109.33
Successful transactions: 101
Failed transactions: 0
Longest transaction: 8.16
Shortest transaction: 1.19
siege -c 500 -r 1 http://localhost:8088/user-service/user/say
Results ===>
Transactions: 255 hits
Availability: 100.00 %
Elapsed time: 8.55 secs
Data transferred: 0.02 MB
Response time: 4.26 secs
Transaction rate: 29.82 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 127.17
Successful transactions: 100
Failed transactions: 0
Longest transaction: 8.34
Shortest transaction: 0.03
siege -c 200 -r 10 http://localhost:8088/user-service/user/say
Results ===>
Transactions: 2000 hits
Availability: 100.00 %
Elapsed time: 84.28 secs
Data transferred: 0.16 MB
Response time: 8.01 secs
Transaction rate: 23.73 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 190.12
Successful transactions: 876
Failed transactions: 0
Longest transaction: 21.21
Shortest transaction: 0.05
Average before the optimization: 25.91 req/s
After the optimization:
siege -c 200 -r 1 http://localhost:8088/user-service/user/say
Results ===>
Transactions: 200 hits
Availability: 100.00 %
Elapsed time: 3.56 secs
Data transferred: 0.02 MB
Response time: 1.69 secs
Transaction rate: 56.18 trans/sec
Throughput: 0.01 MB/sec
Concurrency: 94.92
Successful transactions: 72
Failed transactions: 0
Longest transaction: 3.46
Shortest transaction: 0.03
siege -c 500 -r 1 http://localhost:8088/user-service/user/say
Results ===>
Transactions: 255 hits
Availability: 100.00 %
Elapsed time: 3.70 secs
Data transferred: 0.02 MB
Response time: 1.72 secs
Transaction rate: 62.92 trans/sec
Throughput: 0.01 MB/sec
Concurrency: 118.21
Successful transactions: 100
Failed transactions: 0
Longest transaction: 3.45
Shortest transaction: 0.01
siege -c 200 -r 10 http://localhost:8088/user-service/user/say
Results ===>
Transactions: 2000 hits
Availability: 100.00 %
Elapsed time: 30.45 secs
Data transferred: 0.23 MB
Response time: 2.82 secs
Transaction rate: 65.68 trans/sec
Throughput: 0.01 MB/sec
Concurrency: 185.42
Successful transactions: 324
Failed transactions: 0
Longest transaction: 6.43
Shortest transaction: 0.02
Average after the optimization: 61.59 req/s
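The two averages are the arithmetic means of the three "Transaction rate" values reported by siege in each group. As a quick check of that arithmetic, here is a small sketch; SpeedupSketch and mean are names made up for this post, and the numbers are copied from the siege output above.

```java
class SpeedupSketch {
    /** Arithmetic mean of the given values. */
    static double mean(double... xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    public static void main(String[] args) {
        // "Transaction rate" values from the three runs in each group.
        double before = mean(24.18, 29.82, 23.73);
        double after = mean(56.18, 62.92, 65.68);
        System.out.printf("before: %.2f req/s%n", before);
        System.out.printf("after:  %.2f req/s%n", after);
        System.out.printf("ratio:  %.2f%n", after / before);
    }
}
```

The ratio of the two means comes out at roughly 2.4, which is in line with dropping from two Redis round trips per request to one, with some run-to-run variation on top.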
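Average throughput went from 25.91 to 61.59 req/s, roughly 2.4 times the original (between about 2.1 and 2.8 times in the individual runs), which basically meets our performance requirements. I am mainly recording the problem here; if anyone else has run into something similar, feel free to discuss the solution. I have also opened an issue on GitHub and am waiting for the author's reply: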
https://github.com/marcosbarbero/spring-cloud-zuul-ratelimit/issues/103