How Suppression Works

This page is the operational view of suppressed mode: what the suppression factor and hard limits mean, and which tuning knobs matter most.

Core idea

Suppressed mode does not wait for a hard boundary and then reject everything. Instead, it computes a suppression factor from current pressure and uses that factor to decide, probabilistically, whether each request is admitted.

The higher the pressure, the higher the chance that a request is denied.
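
A minimal sketch of that idea in Python, not the actual implementation. The function names and the pressure formula (1 - target / observed) are assumptions chosen for illustration; only the "deny with a probability that grows with pressure" step comes from the description above.

```python
import random


def suppression_factor(observed_rate: float, target_rate: float) -> float:
    """Hypothetical mapping from pressure to a factor in [0.0, 1.0).

    Assumption: the factor is 0.0 at or below the target rate and
    approaches 1.0 as the observed rate grows past it.
    """
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    if observed_rate <= target_rate:
        return 0.0
    return 1.0 - target_rate / observed_rate


def admit(factor: float, rng: random.Random) -> bool:
    """Admit a request with probability (1 - factor)."""
    # rng.random() is in [0.0, 1.0), so factor 0.0 always admits
    # and factor 1.0 never does.
    return rng.random() >= factor


rng = random.Random(42)
factor = suppression_factor(observed_rate=200.0, target_rate=100.0)
decision = admit(factor, rng)
```

Under these assumptions a key running at twice its target rate gets a factor of 0.5, so roughly half of its requests are denied.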

The two knobs that matter most

hard_limit_factor

This caps how far past the target rate a key can go before suppression effectively becomes total.

  • lower values make shedding severe sooner
  • higher values allow more burst headroom before the strategy clamps down fully
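
To make the effect of hard_limit_factor concrete, here is a sketch under two loudly stated assumptions: that the hard limit is target_rate * hard_limit_factor, and that the factor ramps linearly between the target and that limit. The real curve may differ; the function name is hypothetical.

```python
def factor_with_hard_limit(
    observed_rate: float, target_rate: float, hard_limit_factor: float
) -> float:
    """Illustrative only: linear ramp from 0.0 at the target rate
    to 1.0 at the assumed hard limit (target_rate * hard_limit_factor)."""
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    if hard_limit_factor < 1.0:
        raise ValueError("hard_limit_factor must be >= 1.0")

    hard_limit = target_rate * hard_limit_factor
    if observed_rate <= target_rate:
        return 0.0  # at or under target: nothing is suppressed
    if observed_rate >= hard_limit:
        return 1.0  # at or past the hard limit: suppression is total
    return (observed_rate - target_rate) / (hard_limit - target_rate)


# Same load, two settings: the lower value clamps down fully,
# the higher value still leaves headroom.
tight = factor_with_hard_limit(150.0, 100.0, hard_limit_factor=1.5)
loose = factor_with_hard_limit(150.0, 100.0, hard_limit_factor=2.0)
```

With a target of 100 and an observed rate of 150, the tighter setting has already reached total suppression while the looser one sits halfway up the ramp.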

suppression_factor_cache_ms

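The computed suppression factor can be cached briefly. This setting controls how long, in milliseconds, a cached value is reused.

  • lower values track pressure changes more closely
  • higher values reduce recomputation overhead, at the cost of reacting to pressure changes more slowly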
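
The trade-off can be sketched as a small time-based cache. This is an illustration, not the real implementation: the CachedFactor class, its compute callback, and the injectable clock are all hypothetical names.

```python
import time
from typing import Callable, Optional


class CachedFactor:
    """Reuse a computed suppression factor for cache_ms milliseconds."""

    def __init__(
        self,
        compute: Callable[[], float],
        cache_ms: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._compute = compute
        self._ttl_s = cache_ms / 1000.0
        self._clock = clock  # injectable so tests can control time
        self._value = 0.0
        self._expires_at: Optional[float] = None

    def get(self) -> float:
        now = self._clock()
        # Recompute on first use or once the cached value has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._compute()
            self._expires_at = now + self._ttl_s
        return self._value


# Usage: within a 100 ms window the factor is computed once,
# no matter how many requests ask for it.
cached = CachedFactor(compute=lambda: 0.3, cache_ms=100)
first = cached.get()
second = cached.get()
```

A monotonic clock is used so that wall-clock adjustments cannot extend or shorten the cache window.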

How to reason about it

Use suppressed mode when the question is not only "are we over the line?" but also "how aggressively should we shed right now?"

That makes it useful for systems where:

  • some degraded throughput is better than a full stop
  • load can spike suddenly
  • downstream systems benefit from a smoother decline in admitted traffic

A practical mental model

Read the suppression factor as a value between 0.0 and 1.0:

  • 0.0 means the key is healthy
  • mid-range values mean the key is hot and admission is becoming selective
  • values near 1.0 mean the key is heavily overloaded and most traffic should be shed
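
One way to build intuition for these ranges is a quick simulation. It assumes, for illustration only, that the probability of denying a request equals the suppression factor; the function name is hypothetical.

```python
import random


def admitted_fraction(factor: float, requests: int, seed: int = 0) -> float:
    """Simulate `requests` admission decisions at a fixed factor and
    return the share that was admitted."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    admitted = sum(1 for _ in range(requests) if rng.random() >= factor)
    return admitted / requests


for factor in (0.0, 0.25, 0.5, 0.9):
    share = admitted_fraction(factor, requests=10_000)
    print(f"factor={factor:.2f} admitted={share:.1%}")
```

Under that assumption the admitted share tracks 1 - factor, which is why mid-range values feel selective and values near 1.0 shed most traffic.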

If your application cannot tolerate probabilistic outcomes, suppressed mode is the wrong tool. Use absolute limiting instead.