Keeping a Lua WAF under 5 ms at 100k rps
How ScaleShield's OpenResty WAF keeps per-request overhead inside a single-digit-ms budget using a three-tier Lua cache and a 200 ms Redis timeout.
2026-03-28 · 8 min read
ScaleShield is the WAF and bot-defence layer in front of every site on G7Cloud and plenty that sit elsewhere. It runs as OpenResty (nginx + Lua), binds :80/:443, and every request passes through a Lua pipeline before it gets proxied to origin. The whole pipeline has to cost less than a few milliseconds — anything more and the edge stops being an asset and becomes a tax.
This post is about the one thing that makes the whole budget possible: how WAF state is read at request time.
The request-path budget
In access_by_lua a ScaleShield request walks through, roughly:
- backend resolution (host → origin)
- protection-level bypass checks
- HEAD passthrough and challenge-cookie issuance
- global WAF enabled flag, HTTPS redirect
- imgproxy and infrastructure bypass
- IP ban, ASN allow, IP allow
- WAF rules, path ACL, honeypot
- bot score / UA rules / challenge
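The steps above can be sketched as an ordered chain of checks in access_by_lua. This is an illustrative skeleton, not ScaleShield's actual code — the module names, the check() contract, and the action-table shape are all assumptions:

```lua
-- Hypothetical access-phase pipeline: each check returns nil to continue,
-- or an action table to short-circuit the request.
local checks = {
  "waf.backend",     -- host → origin resolution
  "waf.bypass",      -- protection-level bypass checks
  "waf.challenge",   -- HEAD passthrough, challenge cookies
  "waf.ip_rules",    -- IP ban, ASN allow, IP allow
  "waf.rules",       -- WAF rules, path ACL, honeypot
  "waf.bot",         -- bot score, UA rules, challenge
}

for _, name in ipairs(checks) do
  local action = require(name).check(ngx.ctx)
  if action then
    if action.deny then
      return ngx.exit(ngx.HTTP_FORBIDDEN)
    elseif action.redirect then
      return ngx.redirect(action.redirect, 301)
    end
  end
end
-- fall through: request is proxied to origin
```

The key property is that every check is a plain Lua function call; the expensive part is not the checks themselves but where their configuration comes from.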
Every one of those steps needs configuration: rule lists, allow/deny sets, per-hostname settings, bot-score thresholds, active challenges. If we went to Redis for every check, a single request would do a dozen round-trips.
The three-tier Lua cache
The answer is a three-tier cache inside nginx itself. Every config lookup goes through the same helper, which checks each layer in order:
- Tier 1 — lua_shared_dict fresh cache, 5 s TTL. In-process shared memory: no syscalls, no network, sub-microsecond lookups.
- Tier 2 — lua_shared_dict stale cache, 7-day TTL. Fallback used only if Redis is unavailable; keeps the edge running through Redis blips.
- Tier 3 — persistent on-disk cache. Survives container restarts, so a fresh container doesn't start with an empty WAF.
Only when all three tiers miss does a worker go to local Redis — and even that has a hard 200 ms timeout. Fail fast, serve stale, let the sync loop catch up.
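A minimal sketch of that lookup helper, assuming two shared dicts named waf_fresh and waf_stale and Redis on the loopback (the names are illustrative; the on-disk tier is elided for brevity):

```lua
local redis = require "resty.redis"

local fresh = ngx.shared.waf_fresh   -- tier 1: 5 s TTL
local stale = ngx.shared.waf_stale   -- tier 2: 7-day TTL

-- Fetch from local Redis with a hard 200 ms budget on every phase.
local function from_redis(key)
  local red = redis:new()
  red:set_timeouts(200, 200, 200)    -- connect/send/read, ms
  local ok, err = red:connect("127.0.0.1", 6379)
  if not ok then return nil, err end
  local val, gerr = red:get(key)
  red:set_keepalive(10000, 100)
  if val == ngx.null then val = nil end
  return val, gerr
end

local function get_config(key)
  -- Tier 1: fresh cache — the common case, sub-microsecond
  local val = fresh:get(key)
  if val then return val end

  -- Fresh miss: try Redis, fail fast at 200 ms
  local rval = from_redis(key)
  if rval then
    fresh:set(key, rval, 5)              -- repopulate tier 1
    stale:set(key, rval, 7 * 24 * 3600)  -- refresh tier 2
    return rval
  end

  -- Redis unavailable: serve stale rather than fail the request
  return stale:get(key)
end
```

Note that the stale tier is only consulted when Redis itself is down; in steady state the worst case is one 200 ms Redis trip every 5 seconds per key, per worker.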
Why 5 seconds of staleness is fine
If you've worked on WAFs you've probably had this argument: "shouldn't rule changes be instant?" In practice, no. A 5-second window between dashboard click and edge enforcement is negligible next to the cost of doing Redis round-trips on every request. If a change truly needs to be immediate (e.g. an emergency block), there's a separate key that skips the fresh-cache TTL.
The important property is that no nginx reload is needed for rule changes. TTL expiry handles propagation automatically. That's what makes the system operable.
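The emergency fast path might look like the sketch below — the key prefix and dict name are assumptions, but the idea is simply that panic-button keys skip the fresh cache and hit Redis on every request:

```lua
local redis = require "resty.redis"

local function redis_get(key)
  local red = redis:new()
  red:set_timeouts(200, 200, 200)
  if not red:connect("127.0.0.1", 6379) then return nil end
  local val = red:get(key)
  red:set_keepalive(10000, 100)
  if val == ngx.null then return nil end
  return val
end

local function is_blocked(ip)
  -- Emergency blocks bypass the fresh-cache TTL entirely
  if redis_get("emergency:block:" .. ip) then
    return true
  end
  -- Everything else tolerates up to 5 s of staleness
  return ngx.shared.waf_fresh:get("block:" .. ip) ~= nil
end
```

The cost of the fast path is one Redis round-trip per request for those keys, which is exactly why it's reserved for emergencies.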
The sync loop
A local sync worker polls the central API every 30 seconds and writes everything the WAF needs into local Redis: global config, rule sets, backends, per-domain settings, IP allow/deny, ASN lists. Local Redis runs on the loopback interface with network_mode: host, so there's no docker networking hop.
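One way to run such a worker is a recurring timer started from init_worker_by_lua; the endpoint, key layout, and payload shape below are assumptions, not ScaleShield's real API:

```lua
local http  = require "resty.http"
local redis = require "resty.redis"
local cjson = require "cjson.safe"

local function sync(premature)
  if premature then return end

  -- Pull the full config snapshot from the central API
  local httpc = http.new()
  local res = httpc:request_uri("https://api.example.com/v1/waf-config")
  if not res or res.status ~= 200 then return end  -- edge keeps serving stale

  local cfg = cjson.decode(res.body)
  if not cfg then return end

  -- Write everything into local Redis on the loopback
  local red = redis:new()
  if not red:connect("127.0.0.1", 6379) then return end
  for key, val in pairs(cfg) do
    red:set("waf:" .. key, cjson.encode(val))
  end
  red:set_keepalive(10000, 10)
end

-- init_worker_by_lua: poll every 30 s from a single worker
if ngx.worker.id() == 0 then
  ngx.timer.every(30, sync)
end
```

Any failure path simply returns and waits for the next tick — the edge never depends on a sync cycle succeeding.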
This design deliberately decouples the dashboard from the edge. The dashboard can be down, the central API can be down — the WAF keeps serving off its local Redis + tiered cache.
What we traded away
This isn't free. The tradeoffs:
- Instant enforcement is gone by default. The system has a "fast path" for panic-button changes, but 99% of config updates take up to the TTL to land.
- More moving parts. You now have a central DB, a local Redis, a sync worker, and a tiered in-process cache to reason about. The dual-Redis design (central API Redis vs. nginx-local Redis) has its own category of bugs if you're not careful.
- Stale-serving under API failure is a feature, not a bug. That's an opinion worth being explicit about.
What it buys you
Per-request overhead stays comfortably inside the budget at production load. Redis can be restarted, the API can be redeployed, the dashboard can be down — the edge keeps enforcing exactly what it was enforcing a few seconds ago. That's the property you want from a WAF.