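One hour into a major e-commerce sale, gateway QPS spiked to 500,000 and the downstream order service was overwhelmed. The postmortem found the rate limiting strategy in disarray: the global RateLimiter was a single-point bottleneck, the Redis counter let boundary spikes through, and the local Sentinel instances were out of sync. This article lays out an engineering approach to multi-dimensional rate limiting at the gateway layer, covering Sliding Window, Token Bucket, Leaky Bucket, and Sentinel, with a hands-on Spring Cloud Gateway setup.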
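Incident review
12:00:00 Sale opens
12:00:01 Gateway QPS hits 500,000 (expected 100,000, so 5x the forecast)
12:00:01 Gateway rate limit: a Redis counter configured at 200,000 per second
12:00:01 Single Redis node: INCR + EXPIRE at 1,000,000 ops/s saturates the CPU
12:00:02 Rate limiting fails; every request passes through to downstream services
12:00:05 Order service connection pool is exhausted
12:00:10 The entire transaction chain collapses
Root causes: wrong rate limiting algorithm + single-point design + no degradation path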
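Rate limiting algorithms compared
Algorithm        Burst support   Complexity   Typical use
=====================================================================================
Fixed window     Weak            Low          Internal limits, API signature checks
Sliding window   Medium          Medium       General API gateway traffic
Leaky bucket     None            Medium       Strictly constant rate (e.g. calling third-party APIs)
Token bucket     Strong          Medium       Flash sales, short bursts allowed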
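The leaky bucket is the one algorithm in the table that the rest of this article does not show in code. A minimal single-JVM sketch follows, assuming the "bucket as a meter" variant: each admitted request adds one unit of water, water drains at a constant rate, and a request is rejected when the bucket would overflow. The class name `LeakyBucket` and the explicit `nowMs` clock parameter are illustrative choices, not part of any library.

```java
/** Leaky bucket (meter variant), single JVM, with an explicit clock for testing. */
class LeakyBucket {
    private final double leakPerMs;   // drain rate in units per millisecond
    private final double capacity;    // maximum water the bucket can hold
    private double water = 0.0;       // current level
    private long lastMs;              // time of the last level update

    LeakyBucket(double ratePerSecond, double capacity, long startMs) {
        this.leakPerMs = ratePerSecond / 1000.0;
        this.capacity = capacity;
        this.lastMs = startMs;
    }

    /** Returns true if a request arriving at nowMs fits into the bucket. */
    synchronized boolean tryAcquire(long nowMs) {
        long elapsed = Math.max(0, nowMs - lastMs);
        // Drain what leaked since the last call; the level never goes below zero
        water = Math.max(0.0, water - elapsed * leakPerMs);
        lastMs = Math.max(lastMs, nowMs);
        if (water + 1 > capacity) {
            return false;             // would overflow: reject
        }
        water += 1;
        return true;
    }
}
```

With a capacity of 1 the bucket admits at most one request per drain interval, which is the "no burst" behaviour listed in the table; a larger capacity trades strictness for a small tolerance.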
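The boundary spike problem of fixed windows
Configuration: at most 100 requests per second
Time       T-0.5s .. T    T .. T+0.5s
Requests   100            100
                     ↑ window boundary at T
Each window sees only 100 requests, yet the 1 second from T-0.5s to T+0.5s carries 200.
A fixed window can admit up to twice the limit around a window boundary.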
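// Fixed window on Redis (lets boundary spikes through)
public boolean tryAcquireFixed(String key, int limit, int windowSec) {
    long now = System.currentTimeMillis() / 1000;
    String windowKey = key + ":" + (now / windowSec);
    Long count = redis.opsForValue().increment(windowKey);
    if (count == null) {
        return false; // no value returned (e.g. inside a pipeline or transaction)
    }
    if (count == 1) {
        // Not atomic with INCR: if this call is lost, the key never expires
        redis.expire(windowKey, windowSec, TimeUnit.SECONDS);
    }
    return count <= limit;
}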
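To make the boundary spike concrete without a Redis instance, here is a small in-memory sketch of the same fixed-window bucketing with an explicit clock. `FixedWindowCounter` and its `burst` helper are hypothetical names used only for this demonstration.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory fixed-window counter with an explicit clock, for demonstration only. */
class FixedWindowCounter {
    private final int limit;
    private final long windowMs;
    private final Map<Long, Integer> counts = new HashMap<>();

    FixedWindowCounter(int limit, long windowMs) {
        this.limit = limit;
        this.windowMs = windowMs;
    }

    /** Returns true if a request arriving at nowMs is admitted. */
    boolean tryAcquire(long nowMs) {
        long window = nowMs / windowMs;                 // same bucketing as the Redis key suffix
        int count = counts.merge(window, 1, Integer::sum);
        return count <= limit;
    }

    /** Sends n requests at the same instant and returns how many were admitted. */
    int burst(long nowMs, int n) {
        int admitted = 0;
        for (int i = 0; i < n; i++) {
            if (tryAcquire(nowMs)) {
                admitted++;
            }
        }
        return admitted;
    }
}
```

With a limit of 100 per 1000 ms, a burst of 100 at t = 900 ms and another 100 at t = 1100 ms are all admitted, because they fall into different windows even though they are only 200 ms apart.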
Sliding window implementation
// Sliding window as a Lua script (atomic)
public class SlidingWindowLimiter {
    private static final String SCRIPT =
        "local key = KEYS[1]\n" +
        "local now = tonumber(ARGV[1])\n" +
        "local window = tonumber(ARGV[2])\n" +
        "local limit = tonumber(ARGV[3])\n" +
        "\n" +
        "-- Remove entries that have left the window\n" +
        "redis.call('ZREMRANGEBYSCORE', key, 0, now - window)\n" +
        "\n" +
        "-- Requests currently inside the window\n" +
        "local count = redis.call('ZCARD', key)\n" +
        "if count >= limit then\n" +
        "  return 0\n" +
        "end\n" +
        "\n" +
        "-- Record this request. The member must be unique: using the timestamp\n" +
        "-- as member would merge all requests from the same millisecond into one\n" +
        "redis.call('ZADD', key, now, ARGV[4])\n" +
        "redis.call('EXPIRE', key, math.ceil(window / 1000))\n" +
        "return 1";
    private static final RedisScript<Long> REDIS_SCRIPT = new DefaultRedisScript<>(SCRIPT, Long.class);
    @Autowired private StringRedisTemplate redis;
    public boolean tryAcquire(String key, int limit, int windowMs) {
        long now = System.currentTimeMillis();
        String member = now + "-" + UUID.randomUUID(); // unique per request
        Long result = redis.execute(REDIS_SCRIPT,
            Collections.singletonList(key),
            String.valueOf(now), String.valueOf(windowMs), String.valueOf(limit), member);
        return result != null && result == 1;
    }
}
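Drawback: the sorted set stores one member per admitted request, so memory is O(limit QPS × window length). A 1,000,000 QPS limit with a 1-second window means 1,000,000 members in memory for a single key, which is not sustainable.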
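Sliding window optimization: bucketed counters
// Split a 1-second window into ten 100 ms sub-buckets
// Memory is O(number of buckets), independent of QPS
public class BucketedSlidingWindow {
    private static final int BUCKETS = 10;
    private static final String SCRIPT =
        "local current = tonumber(ARGV[1])\n" +
        "local limit = tonumber(ARGV[2])\n" +
        "local buckets = tonumber(ARGV[3])\n" +
        "local total = 0\n" +
        "local fields = redis.call('HGETALL', KEYS[1])\n" +
        "for i = 1, #fields, 2 do\n" +
        "  if tonumber(fields[i]) <= current - buckets then\n" +
        "    -- Bucket has left the window: delete it so the hash cannot grow without bound\n" +
        "    redis.call('HDEL', KEYS[1], fields[i])\n" +
        "  else\n" +
        "    total = total + tonumber(fields[i + 1])\n" +
        "  end\n" +
        "end\n" +
        "if total >= limit then return 0 end\n" +
        "redis.call('HINCRBY', KEYS[1], ARGV[1], 1)\n" +
        "redis.call('PEXPIRE', KEYS[1], tonumber(ARGV[4]))\n" +
        "return 1";
    private static final RedisScript<Long> REDIS_SCRIPT = new DefaultRedisScript<>(SCRIPT, Long.class);
    @Autowired private StringRedisTemplate redis;
    public boolean tryAcquire(String key, int limit, int windowMs) {
        long bucketSizeMs = windowMs / BUCKETS;
        long currentBucket = System.currentTimeMillis() / bucketSizeMs;
        // The script text is constant so Redis can cache it; parameters travel as ARGV
        Long ok = redis.execute(REDIS_SCRIPT,
            Collections.singletonList(key),
            String.valueOf(currentBucket), String.valueOf(limit),
            String.valueOf(BUCKETS), String.valueOf(windowMs + 1000));
        return ok != null && ok == 1;
    }
}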
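The bucketed counter can also be exercised locally to check that it closes the fixed-window boundary gap. The sketch below keeps the buckets in a `TreeMap` instead of a Redis hash and drops buckets once they leave the window; `LocalBucketedWindow` and the explicit `nowMs` clock are assumptions made for the demonstration, not the production code.

```java
import java.util.TreeMap;

/** In-memory bucketed sliding window with an explicit clock, for demonstration only. */
class LocalBucketedWindow {
    private final int limit;
    private final int buckets;
    private final long bucketSizeMs;
    private final TreeMap<Long, Integer> counts = new TreeMap<>();

    LocalBucketedWindow(int limit, long windowMs, int buckets) {
        this.limit = limit;
        this.buckets = buckets;
        this.bucketSizeMs = windowMs / buckets;
    }

    synchronized boolean tryAcquire(long nowMs) {
        long current = nowMs / bucketSizeMs;
        // Drop every bucket with index <= current - buckets (outside the window)
        counts.headMap(current - buckets + 1).clear();
        int total = 0;
        for (int v : counts.values()) {
            total += v;
        }
        if (total >= limit) {
            return false;
        }
        counts.merge(current, 1, Integer::sum);
        return true;
    }

    /** Number of buckets currently held; never more than `buckets`. */
    synchronized int trackedBuckets() {
        return counts.size();
    }
}
```

With a limit of 100 per second and ten buckets, 100 requests at t = 900 ms fill the window, so a request at t = 1100 ms is rejected, unlike in the fixed-window case. The price is granularity: the window edge moves in 100 ms steps, so the count is approximate to within one bucket.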
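Token bucket
Tokens are added to the bucket at a fixed rate and each request takes one. Tokens that arrive while the bucket is full are discarded; requests that arrive while the bucket is empty are rejected. The tokens saved up in the bucket are what allow short bursts.
// Guava RateLimiter (single JVM)
RateLimiter limiter = RateLimiter.create(100); // 100 permits per second
if (limiter.tryAcquire()) {
    process();
} else {
    return tooBusy();
}
// RateLimiter.create(rate) is the default smooth-bursty mode and has no warmup.
// The three-argument overload creates a warming-up limiter instead:
RateLimiter warmup = RateLimiter.create(100, 2, TimeUnit.SECONDS);
// During the 2-second warmup the effective rate ramps up from a lower
// cold-start rate (about one third of the stable rate in Guava) to 100 per second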
// Distributed token bucket: Redis Lua implementation
public class RedisTokenBucket {
    private static final String SCRIPT =
        "local key = KEYS[1]\n" +
        "local now = tonumber(ARGV[1])\n" +
        "local rate = tonumber(ARGV[2])\n" +
        "local capacity = tonumber(ARGV[3])\n" +
        "local requested = tonumber(ARGV[4])\n" +
        "\n" +
        "local data = redis.call('HMGET', key, 'tokens', 'last_refill')\n" +
        "local tokens = tonumber(data[1]) or capacity\n" +
        "local last_refill = tonumber(data[2]) or now\n" +
        "\n" +
        "-- Refill. `now` comes from the calling gateway node, so clamp elapsed at 0:\n" +
        "-- clock skew between nodes must never take tokens away\n" +
        "local elapsed = math.max(0, now - last_refill)\n" +
        "tokens = math.min(capacity, tokens + elapsed * rate / 1000)\n" +
        "\n" +
        "local allowed = 0\n" +
        "if tokens >= requested then\n" +
        "  tokens = tokens - requested\n" +
        "  allowed = 1\n" +
        "end\n" +
        "\n" +
        "redis.call('HMSET', key, 'tokens', tokens, 'last_refill', math.max(now, last_refill))\n" +
        "-- An expired key starts full again, so 60 s is only safe while capacity / rate < 60 s\n" +
        "redis.call('EXPIRE', key, 60)\n" +
        "return allowed";
    private static final RedisScript<Long> REDIS_SCRIPT = new DefaultRedisScript<>(SCRIPT, Long.class);
    @Autowired private StringRedisTemplate redis;
    public boolean tryAcquire(String key, double rate, double capacity, int requested) {
        long now = System.currentTimeMillis();
        Long result = redis.execute(REDIS_SCRIPT,
            Collections.singletonList("rate_limiter:" + key),
            String.valueOf(now), String.valueOf(rate),
            String.valueOf(capacity), String.valueOf(requested));
        return result != null && result == 1;
    }
}
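The refill arithmetic of the Lua script is easy to get subtly wrong, so it helps to have a local mirror that can be unit-tested with a fake clock. The sketch below is such a mirror, with elapsed time clamped at zero; `LocalTokenBucket` is a hypothetical name, and the class is also a candidate for a per-node fallback limiter, though that use is an assumption rather than something the incident setup included.

```java
/** Single-JVM token bucket that mirrors the Lua refill arithmetic; explicit clock for testing. */
class LocalTokenBucket {
    private final double ratePerSecond;
    private final double capacity;
    private double tokens;
    private long lastRefillMs;

    LocalTokenBucket(double ratePerSecond, double capacity, long startMs) {
        this.ratePerSecond = ratePerSecond;
        this.capacity = capacity;
        this.tokens = capacity;       // a new bucket starts full, like a missing Redis key
        this.lastRefillMs = startMs;
    }

    synchronized boolean tryAcquire(long nowMs, int requested) {
        // Refill proportionally to elapsed time, capped at capacity
        long elapsed = Math.max(0, nowMs - lastRefillMs);
        tokens = Math.min(capacity, tokens + elapsed * ratePerSecond / 1000.0);
        lastRefillMs = Math.max(lastRefillMs, nowMs);
        if (tokens < requested) {
            return false;             // bucket empty: reject
        }
        tokens -= requested;
        return true;
    }
}
```

With a rate of 100 per second and a capacity of 200, a cold bucket absorbs a burst of 200 requests at once, then admits 50 more after 500 ms of refill. That is the burst tolerance the comparison table attributes to the token bucket.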
Rate limiting in Spring Cloud Gateway
spring:
  cloud:
    gateway:
      routes:
        - id: order_route
          uri: lb://order-service
          predicates:
            - Path=/api/order/**
          filters:
            - name: RequestRateLimiter
              args:
                key-resolver: "#{@userKeyResolver}"
                redis-rate-limiter.replenishRate: 100   # refill 100 tokens per second
                redis-rate-limiter.burstCapacity: 200   # bucket capacity 200, allows bursts
                redis-rate-limiter.requestedTokens: 1   # each request costs 1 token
@Configuration
public class GatewayLimitConfig {
    // Limit by user ID. Marked @Primary because the gateway auto-configuration
    // injects a single KeyResolver and three candidates are defined here.
    @Bean
    @Primary
    public KeyResolver userKeyResolver() {
        return exchange -> {
            String userId = exchange.getRequest().getHeaders().getFirst("X-User-Id");
            if (userId == null) userId = "anonymous_" + getClientIp(exchange);
            return Mono.just("user:" + userId);
        };
    }
    // Limit by IP
    @Bean
    public KeyResolver ipKeyResolver() {
        return exchange -> Mono.just("ip:" + getClientIp(exchange));
    }
    // Limit by API path
    @Bean
    public KeyResolver apiKeyResolver() {
        return exchange -> Mono.just("api:" + exchange.getRequest().getPath().value());
    }
    private String getClientIp(ServerWebExchange exchange) {
        // X-Forwarded-For is client-controlled unless a trusted proxy overwrites it
        String xff = exchange.getRequest().getHeaders().getFirst("X-Forwarded-For");
        if (xff != null && !xff.isEmpty()) return xff.split(",")[0].trim();
        InetSocketAddress remote = exchange.getRequest().getRemoteAddress();
        return remote != null ? remote.getAddress().getHostAddress() : "unknown";
    }
}
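Stacking limits across dimensions
// Each request passes three layers of rate limiting: IP / user / API
// (getClientIp, getUserId and getApiLimit are helpers omitted here)
@Component
public class MultiDimensionFilter implements GlobalFilter {
    @Autowired private RedisTokenBucket limiter;
    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        String ip = getClientIp(exchange);
        String userId = getUserId(exchange);
        String api = exchange.getRequest().getPath().value();
        // RedisTokenBucket blocks on Redis I/O, so run the checks off the event loop
        return Mono.fromCallable(() -> firstRejection(ip, userId, api))
            .subscribeOn(Schedulers.boundedElastic())
            .flatMap(reason -> reason.isEmpty() ? chain.filter(exchange) : reject(exchange, reason));
    }
    // Returns the rejection reason, or "" when every dimension admits the request
    private String firstRejection(String ip, String userId, String api) {
        // IP dimension: DDoS protection, coarse-grained
        if (!limiter.tryAcquire("ip:" + ip, 200, 400, 1)) {
            return "rate_limit_ip";
        }
        // User dimension: anti-abuse, medium-grained
        if (userId != null && !limiter.tryAcquire("user:" + userId, 50, 100, 1)) {
            return "rate_limit_user";
        }
        // API dimension: protects core endpoints, fine-grained
        int apiLimit = getApiLimit(api); // read dynamically from the config center
        if (!limiter.tryAcquire("api:" + api, apiLimit, apiLimit * 2, 1)) {
            return "rate_limit_api";
        }
        return "";
    }
    private Mono<Void> reject(ServerWebExchange exchange, String reason) {
        exchange.getResponse().setStatusCode(HttpStatus.TOO_MANY_REQUESTS);
        exchange.getResponse().getHeaders().add("X-RateLimit-Reason", reason);
        return exchange.getResponse().setComplete();
    }
}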
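The ordering logic of the filter (check dimensions from coarse to fine, stop at the first rejection, skip dimensions that do not apply) can be separated from Spring and Redis so it is testable on its own. The following is a sketch under that assumption; `DimensionChain`, the map-based request representation, and the `Predicate<String>` limiter interface are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

/** Ordered chain of rate limit dimensions; returns the first rejection reason, if any. */
class DimensionChain {
    private static final class Dimension {
        final String reason;
        final Function<Map<String, String>, String> keyOf;
        final Predicate<String> limiter;

        Dimension(String reason, Function<Map<String, String>, String> keyOf, Predicate<String> limiter) {
            this.reason = reason;
            this.keyOf = keyOf;
            this.limiter = limiter;
        }
    }

    private final List<Dimension> dimensions = new ArrayList<>();

    /** Adds a dimension; dimensions are evaluated in insertion order. */
    DimensionChain add(String reason, Function<Map<String, String>, String> keyOf, Predicate<String> limiter) {
        dimensions.add(new Dimension(reason, keyOf, limiter));
        return this;
    }

    /** Empty result means every applicable dimension admitted the request. */
    Optional<String> firstRejection(Map<String, String> request) {
        for (Dimension d : dimensions) {
            String key = d.keyOf.apply(request);
            if (key == null) {
                continue;             // dimension not applicable, e.g. no user ID
            }
            if (!d.limiter.test(key)) {
                return Optional.of(d.reason);
            }
        }
        return Optional.empty();
    }
}
```

One design consequence worth knowing: a request rejected at a later dimension has already consumed tokens at the earlier ones. That errs on the strict side, which is usually acceptable for a gateway under attack.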
Sentinel: a more specialized solution
<dependency>
<groupId>com.alibaba.cloud</groupId>
<artifactId>spring-cloud-starter-alibaba-sentinel</artifactId>
</dependency>
@RestController
public class OrderController {
    @SentinelResource(value = "createOrder",
        blockHandler = "createOrderBlocked",
        fallback = "createOrderFallback")
    @PostMapping("/order")
    public OrderDto createOrder(@RequestBody CreateOrderReq req) {
        return orderService.create(req);
    }
    // Called when Sentinel blocks the call (flow control or circuit breaking)
    public OrderDto createOrderBlocked(CreateOrderReq req, BlockException ex) {
        log.warn("blocked: {}", ex.getClass().getSimpleName());
        throw new BusinessException("too_busy");
    }
    // Fallback when the business logic throws
    public OrderDto createOrderFallback(CreateOrderReq req, Throwable t) {
        return OrderDto.placeholder("retry_later");
    }
}
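What Sentinel provides:
- Several flow control modes: QPS / thread count / related resource / call chain
- Degradation: slow call ratio / exception ratio / exception count
- System protection: CPU / load / inbound QPS / thread count
- Hot parameter flow control: a separate limit for a specific value of a parameter (keeps a single product from being hammered)
- Dynamic configuration from the dashboard, no restart needed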
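Hot parameter flow control (flash sale scenario)
// Sentinel receives the method arguments from @SentinelResource; the parameter
// to limit on is selected by index in the rule, not by an annotation on the argument
@SentinelResource(value = "secKill", blockHandler = "blocked")
public OrderDto secKill(Long itemId, Long userId) {
    return orderService.secKill(itemId, userId);
}
// Sentinel dashboard configuration:
//   Resource: secKill
//   Parameter index: 0 (itemId)
//   Threshold: reject above 100 QPS per itemId value
//   Per-value exceptions:
//     itemId = 1001 (bestseller) QPS limit = 5000
//     itemId = 1002 QPS limit = 1000
// Effect: every product gets its own limit, and a bestseller can run at a higher threshold without loosening the default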
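The dashboard rule above boils down to "a default threshold per parameter value, plus overrides for specific values". The sketch below illustrates that lookup with a plain per-second counter. It is not how Sentinel implements hot parameter limiting internally; `HotParamLimiter` and its methods are hypothetical names chosen for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Per-parameter-value QPS limit with overrides; simple one-second counters, for illustration. */
class HotParamLimiter {
    private final int defaultQps;
    private final Map<Long, Integer> overrides = new HashMap<>();
    private final Map<Long, Integer> counts = new HashMap<>();
    private long currentSecond = -1;

    HotParamLimiter(int defaultQps) {
        this.defaultQps = defaultQps;
    }

    /** Registers a dedicated threshold for one parameter value. */
    HotParamLimiter override(long paramValue, int qps) {
        overrides.put(paramValue, qps);
        return this;
    }

    synchronized boolean tryAcquire(long paramValue, long nowMs) {
        long second = nowMs / 1000;
        if (second != currentSecond) {
            counts.clear();           // new second: reset all per-value counters
            currentSecond = second;
        }
        int limit = overrides.getOrDefault(paramValue, defaultQps);
        int used = counts.getOrDefault(paramValue, 0);
        if (used >= limit) {
            return false;
        }
        counts.put(paramValue, used + 1);
        return true;
    }
}
```

Configured as `new HotParamLimiter(100).override(1001L, 5000).override(1002L, 1000)`, item 1001 is admitted up to 5000 times per second while any unlisted item stops at 100.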
Degradation strategy + circuit breaking
// Degrade on slow call ratio
DegradeRule slowCallRule = new DegradeRule("queryItem")
    .setGrade(RuleConstant.DEGRADE_GRADE_RT)
    .setCount(200)               // RT above 200 ms counts as slow
    .setSlowRatioThreshold(0.5)  // trips when more than 50% of calls are slow
    .setTimeWindow(10)           // circuit stays open for 10 seconds
    .setStatIntervalMs(1000)
    .setMinRequestAmount(20);
// Degrade on exception ratio
DegradeRule errorRatioRule = new DegradeRule("createOrder")
    .setGrade(RuleConstant.DEGRADE_GRADE_EXCEPTION_RATIO)
    .setCount(0.3)               // trips when the exception ratio exceeds 30%
    .setTimeWindow(10)
    .setMinRequestAmount(20);
DegradeRuleManager.loadRules(Arrays.asList(slowCallRule, errorRatioRule));
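To see how the five parameters of the slow-call rule interact, here is a small state machine sketch: closed while healthy, open for a fixed period once the slow ratio trips, then half-open with a single probe. It is a simplified model written for this article, not Sentinel's implementation; `SlowCallBreaker` and its method names are hypothetical.

```java
/** Simplified slow-call-ratio circuit breaker with an explicit clock, for illustration. */
class SlowCallBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final long slowRtMs;        // response time above this counts as slow
    private final double slowRatio;     // trip when slow / total exceeds this
    private final int minRequests;      // minimum samples before the ratio is evaluated
    private final long statIntervalMs;  // length of one statistics window
    private final long openMs;          // how long the circuit stays open

    private State state = State.CLOSED;
    private long windowStart = 0;
    private long openedAt = 0;
    private int total = 0;
    private int slow = 0;

    SlowCallBreaker(long slowRtMs, double slowRatio, int minRequests, long statIntervalMs, long openMs) {
        this.slowRtMs = slowRtMs;
        this.slowRatio = slowRatio;
        this.minRequests = minRequests;
        this.statIntervalMs = statIntervalMs;
        this.openMs = openMs;
    }

    /** Returns true if a call may proceed at nowMs. */
    synchronized boolean allow(long nowMs) {
        if (state == State.OPEN) {
            if (nowMs - openedAt < openMs) {
                return false;
            }
            state = State.HALF_OPEN;    // open period over: let one probe through
            return true;
        }
        return state == State.CLOSED;   // HALF_OPEN: probe already in flight
    }

    /** Records the response time of a completed call. */
    synchronized void record(long nowMs, long rtMs) {
        boolean isSlow = rtMs > slowRtMs;
        if (state == State.HALF_OPEN) {
            if (isSlow) trip(nowMs); else reset(nowMs);
            return;
        }
        if (state == State.OPEN) {
            return;                     // late completions do not affect an open circuit
        }
        if (nowMs - windowStart >= statIntervalMs) {
            reset(nowMs);               // start a fresh statistics window
        }
        total++;
        if (isSlow) slow++;
        if (total >= minRequests && (double) slow / total > slowRatio) {
            trip(nowMs);
        }
    }

    private void trip(long nowMs) {
        state = State.OPEN;
        openedAt = nowMs;
    }

    private void reset(long nowMs) {
        state = State.CLOSED;
        windowStart = nowMs;
        total = 0;
        slow = 0;
    }
}
```

`new SlowCallBreaker(200, 0.5, 20, 1000, 10_000)` corresponds to the `queryItem` rule: 20 slow calls within one second open the circuit for 10 seconds, after which one fast probe closes it again.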
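Load test comparison
Before (single Redis INCR counter):
- 500,000 QPS inbound → 1,000,000 Redis ops/s → CPU at 100%
- Rate limiting fails, downstream is overwhelmed
- Full collapse within 5 minutes
After (multi-dimension + Token Bucket Lua + Sentinel system protection):
- 500,000 QPS inbound
- IP dimension filters out abnormal IP traffic → 300,000 remain
- User dimension blocks abuse → 200,000 remain
- API dimension applies fine-grained control → 80,000 remain (close to target)
- Each Redis Lua call takes 0.3 ms; 300,000 ops/s in total, within capacity
- System protection: excess traffic is rejected once CPU exceeds 80%
- About 80,000 QPS actually reaches the business logic, close to design capacity
- Stable overall, no cascading failure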
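Checklist
- Pick the right algorithm: token bucket for flash sales, leaky bucket for strictly constant rates, sliding window for ordinary APIs
- Stack multiple dimensions: IP / user / API / tenant
- Use atomic Lua scripts to avoid the INCR + EXPIRE race
- Give every rate limit key a TTL so Redis memory does not balloon
- Emit metrics whenever a limit fires (per dimension and per reason)
- Configure Sentinel slow-call and exception-ratio degradation on core endpoints
- System protection as the last line of defense: CPU / load / inbound QPS
- Review the rate limiting strategy and validate it with load tests before each major sale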
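Gateway rate limiting is not just "set a number". During a major sale every kind of failure shows up: Redis bottlenecks, single points of failure, boundary spikes, hot keys. Before each sale we run 4 to 5 fault injection drills, and every one of them uncovers a new weak spot. This layered defense has run through 3 sale cycles, and system availability rose from 99.5% to 99.99%.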
—— 别看了 · 2026