Keeping a Lua WAF under 5 ms at 100k rps
How ScaleShield's OpenResty WAF keeps per-request overhead inside a single-digit-ms budget using a three-tier Lua cache and a 200 ms Redis timeout.
2026-03-28 · 8 min read
ScaleShield is the WAF and bot-defence layer in front of every site on G7Cloud and plenty that sit elsewhere. It runs as OpenResty (nginx + Lua), binds :80/:443, and every request passes through a Lua pipeline before it gets proxied to origin. The whole pipeline has to cost less than a few milliseconds — anything more and the edge stops being an asset and becomes a tax.
This post is about the one thing that makes the whole budget possible: how WAF state is read at request time.
The request-path budget
In access_by_lua a ScaleShield request walks through, roughly:
- backend resolution (host → origin)
- protection-level bypass checks
- HEAD passthrough and challenge-cookie issuance
- global WAF enabled flag, HTTPS redirect
- imgproxy and infrastructure bypass
- IP ban, ASN allow, IP allow
- WAF rules, path ACL, honeypot
- bot score / UA rules / challenge
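The steps above can be sketched as an ordered chain of checks in access_by_lua. This is an illustrative skeleton, not ScaleShield's actual code — the module names, the check() contract, and the action-table shape are all assumptions:

```lua
-- Hypothetical access-phase pipeline: each check returns nil to continue,
-- or an action table to short-circuit the request.
local checks = {
  "waf.backend",     -- host → origin resolution
  "waf.bypass",      -- protection-level bypass checks
  "waf.challenge",   -- HEAD passthrough, challenge cookies
  "waf.ip_rules",    -- IP ban, ASN allow, IP allow
  "waf.rules",       -- WAF rules, path ACL, honeypot
  "waf.bot",         -- bot score, UA rules, challenge
}

for _, name in ipairs(checks) do
  local action = require(name).check(ngx.ctx)
  if action then
    if action.deny then
      return ngx.exit(ngx.HTTP_FORBIDDEN)
    elseif action.redirect then
      return ngx.redirect(action.redirect, 301)
    end
  end
end
-- fall through: request is proxied to origin
```

The key property is that every check is a plain Lua function call; the expensive part is not the checks themselves but where their configuration comes from.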
Every one of those steps needs configuration: rule lists, allow/deny sets, per-hostname settings, bot-score thresholds, active challenges. If we went to Redis for every check, a single request would do a dozen round-trips.
The three-tier Lua cache
The answer is a three-tier cache inside nginx itself. Every config lookup goes through the same helper, which checks each layer in order:
- Tier 1 — lua_shared_dict fresh cache, 5 s TTL. In-process shared memory: no syscalls, no network, sub-microsecond lookups.
- Tier 2 — lua_shared_dict stale cache, 7-day TTL. Fallback used only if Redis is unavailable; keeps the edge running through Redis blips.
- Tier 3 — persistent on-disk cache. Survives container restarts, so a fresh container doesn't start with an empty WAF.
Only when all three tiers miss does a worker go to local Redis — and even that has a hard 200 ms timeout. Fail fast, serve stale, let the sync loop catch up.
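A minimal sketch of that lookup helper, assuming two shared dicts named waf_fresh and waf_stale and Redis on the loopback (the names are illustrative; the on-disk tier is elided for brevity):

```lua
local redis = require "resty.redis"

local fresh = ngx.shared.waf_fresh   -- tier 1: 5 s TTL
local stale = ngx.shared.waf_stale   -- tier 2: 7-day TTL

-- Fetch from local Redis with a hard 200 ms budget on every phase.
local function from_redis(key)
  local red = redis:new()
  red:set_timeouts(200, 200, 200)    -- connect/send/read, ms
  local ok, err = red:connect("127.0.0.1", 6379)
  if not ok then return nil, err end
  local val, gerr = red:get(key)
  red:set_keepalive(10000, 100)
  if val == ngx.null then val = nil end
  return val, gerr
end

local function get_config(key)
  -- Tier 1: fresh cache — the common case, sub-microsecond
  local val = fresh:get(key)
  if val then return val end

  -- Fresh miss: try Redis, fail fast at 200 ms
  local rval = from_redis(key)
  if rval then
    fresh:set(key, rval, 5)              -- repopulate tier 1
    stale:set(key, rval, 7 * 24 * 3600)  -- refresh tier 2
    return rval
  end

  -- Redis unavailable: serve stale rather than fail the request
  return stale:get(key)
end
```

Note that the stale tier is only consulted when Redis itself is down; in steady state the worst case is one 200 ms Redis trip every 5 seconds per key, per worker.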
Why 5 seconds of staleness is fine
If you've worked on WAFs you've probably had this argument: "shouldn't rule changes be instant?" In practice, no. A 5-second window between dashboard click and edge enforcement is negligible next to the cost of doing Redis round-trips on every request. If a change truly needs to be immediate (e.g. an emergency block), there's a separate key that skips the fresh-cache TTL.
The important property is that no nginx reload is needed for rule changes. TTL expiry handles propagation automatically. That's what makes the system operable.
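The emergency fast path might look like the sketch below — the key prefix and dict name are assumptions, but the idea is simply that panic-button keys skip the fresh cache and hit Redis on every request:

```lua
local redis = require "resty.redis"

local function redis_get(key)
  local red = redis:new()
  red:set_timeouts(200, 200, 200)
  if not red:connect("127.0.0.1", 6379) then return nil end
  local val = red:get(key)
  red:set_keepalive(10000, 100)
  if val == ngx.null then return nil end
  return val
end

local function is_blocked(ip)
  -- Emergency blocks bypass the fresh-cache TTL entirely
  if redis_get("emergency:block:" .. ip) then
    return true
  end
  -- Everything else tolerates up to 5 s of staleness
  return ngx.shared.waf_fresh:get("block:" .. ip) ~= nil
end
```

The cost of the fast path is one Redis round-trip per request for those keys, which is exactly why it's reserved for emergencies.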
The sync loop
A local sync worker polls the central API every 30 seconds and writes everything the WAF needs into local Redis: global config, rule sets, backends, per-domain settings, IP allow/deny, ASN lists. Local Redis runs on the loopback interface with network_mode: host, so there's no docker networking hop.
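One way to run such a worker is a recurring timer started from init_worker_by_lua; the endpoint, key layout, and payload shape below are assumptions, not ScaleShield's real API:

```lua
local http  = require "resty.http"
local redis = require "resty.redis"
local cjson = require "cjson.safe"

local function sync(premature)
  if premature then return end

  -- Pull the full config snapshot from the central API
  local httpc = http.new()
  local res = httpc:request_uri("https://api.example.com/v1/waf-config")
  if not res or res.status ~= 200 then return end  -- edge keeps serving stale

  local cfg = cjson.decode(res.body)
  if not cfg then return end

  -- Write everything into local Redis on the loopback
  local red = redis:new()
  if not red:connect("127.0.0.1", 6379) then return end
  for key, val in pairs(cfg) do
    red:set("waf:" .. key, cjson.encode(val))
  end
  red:set_keepalive(10000, 10)
end

-- init_worker_by_lua: poll every 30 s from a single worker
if ngx.worker.id() == 0 then
  ngx.timer.every(30, sync)
end
```

Any failure path simply returns and waits for the next tick — the edge never depends on a sync cycle succeeding.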
This design deliberately decouples the dashboard from the edge. The dashboard can be down, the central API can be down — the WAF keeps serving off its local Redis + tiered cache.
What we traded away
This isn't free. The tradeoffs:
- Instant enforcement is gone by default. The system has a "fast path" for panic-button changes, but 99% of config updates take up to the TTL to land.
- More moving parts. You now have a central DB, a local Redis, a sync worker, and a tiered in-process cache to reason about. The dual-Redis design (central API Redis vs. nginx-local Redis) has its own category of bugs if you're not careful.
- Stale-serving under API failure is a feature, not a bug. That's an opinion worth being explicit about.
What it buys you
Per-request overhead stays comfortably inside the budget at production load. Redis can be restarted, the API can be redeployed, the dashboard can be down — the edge keeps enforcing exactly what it was enforcing a few seconds ago. That's the property you want from a WAF.