Strategies

Suppressed

Probabilistic suppression near capacity.

Suppressed is designed for graceful degradation under load.

This strategy is inspired by Ably's post on distributed rate limiting at scale: https://ably.com/blog/distributed-rate-limiting-scale-your-platform

It tracks two series per key: an observed series (all calls) and an accepted series (only admitted calls).

When accepted usage reaches capacity, the limiter starts suppressing a fraction of requests.

While suppression is active, Suppressed returns RateLimitDecision::Suppressed, which can still admit the call. Always gate your request on is_allowed.

Decision contract

Suppressed returns one of these decisions:

  • Allowed: request admitted
  • Suppressed { suppression_factor, is_allowed }: suppression logic is active; is_allowed is the admission signal for this call
  • Rejected { window_size_seconds, retry_after_ms, remaining_after_waiting }: denied (hard cutoff), with backoff hints

Do not treat "variant == Suppressed" as denial. Treat is_allowed as the boolean gate.
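
As a minimal sketch of that rule, the helper below collapses a decision into a single boolean gate. The Decision enum is a local stand-in that mirrors the variants listed above so the snippet compiles on its own; its field types (f64, bool, u64) and the is_admitted name are assumptions, not part of the crate. In real code you would match on trypema::RateLimitDecision instead.

```rust
// Local stand-in for the documented variants. Field types are assumed for
// illustration; use trypema::RateLimitDecision in real code.
#[derive(Debug, Clone, PartialEq)]
pub enum Decision {
    Allowed,
    Suppressed {
        suppression_factor: f64,
        is_allowed: bool,
    },
    Rejected {
        window_size_seconds: u64,
        retry_after_ms: u64,
        remaining_after_waiting: u64,
    },
}

/// Hypothetical helper: returns true only when the caller may proceed.
pub fn is_admitted(decision: &Decision) -> bool {
    match decision {
        Decision::Allowed => true,
        // The variant alone says nothing about admission; is_allowed does.
        Decision::Suppressed { is_allowed, .. } => *is_allowed,
        // Hard cutoff: always denied.
        Decision::Rejected { .. } => false,
    }
}
```

With a helper like this, a Suppressed decision with is_allowed: true goes down the same path as Allowed, and only the remaining two cases are treated as denials.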

Operating regimes

Suppressed has three regimes for a given key:

1) Below capacity

When accepted usage is below window capacity, it returns Allowed and no suppression is applied.

2) At/near capacity

When accepted usage is at or above capacity, it returns Suppressed { suppression_factor, is_allowed }. The suppression_factor indicates how aggressively the strategy is suppressing (0.0 to 1.0), and is_allowed tells you whether this call is admitted.

3) Above hard limit

Once the key is far above the target, the strategy denies most or all calls.

Two outcomes are possible:

  • Many calls return Suppressed { is_allowed: false, ... } (full suppression).
  • If a call is probabilistically admitted, the accepted limiter still enforces a hard cutoff and may return Rejected { ... }.

hard_limit_factor controls where that cutoff is.

Two series per key

Suppressed maintains two sliding-window series:

  • Observed: every call increments this series. It represents load the system is seeing.
  • Accepted: only admitted calls increment this series. It represents load the system is choosing to serve.

Implementation detail:

  • Both local and Redis implementations record observed usage using RateLimit::max() so the observed series never rejects.
  • Accepted usage is recorded using a hard-limit rate (rate_limit * hard_limit_factor) so it can produce Rejected when the cutoff is exceeded.

Algorithm (step by step)

For each call inc(key, rate_limit, count):

  1. Record observed usage (always).
  2. Compute a suppression factor for the key (cached).
  3. Clamp the suppression factor to a maximum of 1.0.
  4. If the suppression factor is <= 0.0, bypass suppression and attempt to increment the accepted limiter.
  5. Otherwise, allow with probability 1.0 - suppression_factor.
  6. If denied by the probability check, return Suppressed { is_allowed: false, suppression_factor } and do not increment accepted usage.
  7. If allowed by the probability check, increment accepted usage with the hard-limit rate.
  8. If the accepted increment is allowed, return Suppressed { is_allowed: true, suppression_factor }.
  9. If the accepted increment is rejected, return Rejected { ... }.
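
The steps above can be sketched as a single function. This is an illustration under simplifying assumptions, not the crate's implementation: plain counters stand in for the sliding-window series, the suppression factor is passed in already computed (step 2), the random draw is supplied by the caller so the flow is deterministic, and hard_limit is an accepted-series capacity assumed to be already scaled by hard_limit_factor. The names Outcome, KeyState, inc, and try_accept are hypothetical, and Rejected omits its backoff fields. Returning Allowed on the bypass path is an assumption that follows the "below capacity" regime.

```rust
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Allowed,
    Suppressed { suppression_factor: f64, is_allowed: bool },
    Rejected,
}

/// Simplified per-key state: plain totals instead of sliding windows.
pub struct KeyState {
    pub observed: u64,
    pub accepted: u64,
}

/// Increments the accepted total unless that would exceed the hard limit.
fn try_accept(state: &mut KeyState, count: u64, hard_limit: u64) -> bool {
    if state.accepted + count > hard_limit {
        return false;
    }
    state.accepted += count;
    true
}

/// One call of the flow. `draw` must be a uniform random number in [0, 1).
pub fn inc(state: &mut KeyState, count: u64, factor: f64, draw: f64, hard_limit: u64) -> Outcome {
    // Step 1: observed usage is always recorded.
    state.observed += count;
    // Step 3: clamp the factor to at most 1.0.
    let factor = factor.min(1.0);
    // Step 4: suppression is bypassed; only the hard cutoff applies.
    if factor <= 0.0 {
        return if try_accept(state, count, hard_limit) {
            Outcome::Allowed
        } else {
            Outcome::Rejected
        };
    }
    // Steps 5-6: admit with probability 1.0 - factor.
    if draw >= 1.0 - factor {
        return Outcome::Suppressed { suppression_factor: factor, is_allowed: false };
    }
    // Steps 7-9: probabilistically admitted, but the hard cutoff still applies.
    if try_accept(state, count, hard_limit) {
        Outcome::Suppressed { suppression_factor: factor, is_allowed: true }
    } else {
        Outcome::Rejected
    }
}
```

Note that a call denied by the probability check still counts toward the observed total but never touches the accepted total, which is what keeps the two series apart.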

Suppression factor intuition

At a high level, suppression increases as perceived rate rises above the target.

Perceived rate is the maximum of:

  • the average rate across the whole window, and
  • the rate observed in the last 1000ms.

This makes the strategy react faster to short spikes.

Suppression factor (exact)

The suppression factor is derived from the observed series:

average_rate_in_window = observed_total_in_window / window_size_seconds
rate_in_last_1s = observed_total_in_last_1000ms
perceived_rate = max(average_rate_in_window, rate_in_last_1s)

suppression_factor = 1 - (rate_limit / perceived_rate)

Notes:

  • If perceived_rate <= rate_limit, the suppression factor is <= 0.0 and suppression is bypassed.
  • The factor is clamped to at most 1.0.
  • Higher perceived rate -> higher suppression factor -> lower probability of admission.
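
To make the formula concrete, here is a small sketch that computes the factor from raw observed totals. The function name and parameter types are assumptions chosen for illustration, and it simplifies one detail: when the perceived rate does not exceed the limit it returns 0.0 directly instead of a negative value, since suppression is bypassed in that case anyway. It assumes window_size_seconds is greater than zero.

```rust
/// Hypothetical helper following the formula above.
/// Returns a value in [0.0, 1.0]; 0.0 means suppression is bypassed.
pub fn suppression_factor(
    observed_total_in_window: u64,
    observed_total_in_last_1000ms: u64,
    window_size_seconds: u64,
    rate_limit: f64,
) -> f64 {
    let average_rate_in_window = observed_total_in_window as f64 / window_size_seconds as f64;
    // Over a 1-second span the total is already a per-second rate.
    let rate_in_last_1s = observed_total_in_last_1000ms as f64;
    let perceived_rate = average_rate_in_window.max(rate_in_last_1s);

    // At or below the target: no suppression (also avoids dividing by zero).
    if perceived_rate <= rate_limit {
        return 0.0;
    }
    (1.0 - rate_limit / perceived_rate).min(1.0)
}
```

For example, with a 60-second window, rate_limit = 10.0, 600 observed calls in the window (an average of 10 per second), and 20 calls in the last second, the perceived rate is 20 and the factor is 1 - 10/20 = 0.5, so roughly half of the calls are admitted.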

Hard limit factor

Hard rejection is controlled by hard_limit_factor:

hard_limit = rate_limit * hard_limit_factor

Recommended starting point: hard_limit_factor = 1.5 (50% burst headroom).

This factor is applied by using the hard-limit rate when incrementing the accepted series.

Caching

Suppression factor is cached per key to avoid recomputing it on every call:

  • Local: cached in-memory and recomputed at most once per suppression_factor_cache_ms.
  • Redis: cached in Redis with a short TTL so multiple calls reuse the same factor.
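
The local caching behaviour can be pictured with a sketch like the one below. It is not the crate's code: FactorCache and get_or_compute are hypothetical names, the ttl field plays the role of suppression_factor_cache_ms, and the current time is passed in explicitly so the behaviour is easy to reason about.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical in-memory cache: one (factor, computed_at) entry per key.
pub struct FactorCache {
    ttl: Duration,
    entries: HashMap<String, (f64, Instant)>,
}

impl FactorCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Returns the cached factor while it is younger than `ttl`; otherwise
    /// calls `compute`, stores the fresh value, and returns it.
    pub fn get_or_compute(
        &mut self,
        key: &str,
        now: Instant,
        compute: impl FnOnce() -> f64,
    ) -> f64 {
        if let Some((factor, computed_at)) = self.entries.get(key) {
            if now.duration_since(*computed_at) < self.ttl {
                return *factor;
            }
        }
        let factor = compute();
        self.entries.insert(key.to_string(), (factor, now));
        factor
    }
}
```

The trade-off is the usual one for a TTL cache: within the cache interval every call for a key reuses the same factor, so the limiter reacts to a change in load only once the cached value expires.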

Get the current suppression factor

Both providers expose a method to fetch the current suppression factor for a key (useful for metrics and debugging).

Local:

let suppression_factor = rl.local().suppressed().get_suppression_factor("user_123");
let _ = suppression_factor;

Redis:

use trypema::redis::RedisKey;

let key = RedisKey::try_from("user_123".to_string()).unwrap();
let suppression_factor = rl.redis().suppressed().get_suppression_factor(&key).await.unwrap();
let _ = suppression_factor;

Usage (Local)

Local suppressed is synchronous:

use std::sync::Arc;

use trypema::{HardLimitFactor, RateGroupSizeMs, RateLimit, RateLimitDecision, RateLimiter, RateLimiterOptions, SuppressionFactorCacheMs, WindowSizeSeconds};
use trypema::local::LocalRateLimiterOptions;

let rl = Arc::new(RateLimiter::new(RateLimiterOptions {
    local: LocalRateLimiterOptions {
        window_size_seconds: WindowSizeSeconds::try_from(60).unwrap(),
        rate_group_size_ms: RateGroupSizeMs::try_from(10).unwrap(),
        hard_limit_factor: HardLimitFactor::try_from(1.5).unwrap(),
        suppression_factor_cache_ms: SuppressionFactorCacheMs::default(),
    },
}));

rl.run_cleanup_loop();

let rate = RateLimit::try_from(10.0).unwrap();
let decision = rl.local().suppressed().inc("user_123", &rate, 1);

match decision {
    RateLimitDecision::Allowed => {
        // proceed normally
    }
    RateLimitDecision::Suppressed {
        suppression_factor,
        is_allowed: true,
    } => {
        let _ = suppression_factor;
        // admitted while suppression is active (often degrade priority)
    }
    RateLimitDecision::Suppressed {
        suppression_factor,
        is_allowed: false,
    } => {
        let _ = suppression_factor;
        // denied by suppression (treat like 429)
    }
    RateLimitDecision::Rejected {
        window_size_seconds,
        retry_after_ms,
        remaining_after_waiting,
    } => {
        // denied by the hard cutoff; retry_after_ms is the backoff hint
        let _ = window_size_seconds;
        let _ = remaining_after_waiting;
        let _ = retry_after_ms;
    }
}

Usage (Redis)

Redis suppressed is asynchronous and returns Result<RateLimitDecision, TrypemaError>:

use std::sync::Arc;

use trypema::{HardLimitFactor, RateGroupSizeMs, RateLimit, RateLimitDecision, RateLimiter, RateLimiterOptions, SuppressionFactorCacheMs, WindowSizeSeconds};
use trypema::local::LocalRateLimiterOptions;
use trypema::redis::{RedisKey, RedisRateLimiterOptions};

// Create Redis connection manager
let client = redis::Client::open("redis://127.0.0.1:6379/").unwrap();
let connection_manager = client.get_connection_manager().await.unwrap();

let rl = Arc::new(RateLimiter::new(RateLimiterOptions {
    local: LocalRateLimiterOptions {
        window_size_seconds: WindowSizeSeconds::try_from(60).unwrap(),
        rate_group_size_ms: RateGroupSizeMs::try_from(10).unwrap(),
        hard_limit_factor: HardLimitFactor::try_from(1.5).unwrap(),
        suppression_factor_cache_ms: SuppressionFactorCacheMs::default(),
    },
    redis: RedisRateLimiterOptions {
        connection_manager,
        prefix: None,
        window_size_seconds: WindowSizeSeconds::try_from(60).unwrap(),
        rate_group_size_ms: RateGroupSizeMs::try_from(10).unwrap(),
        hard_limit_factor: HardLimitFactor::try_from(1.5).unwrap(),
        suppression_factor_cache_ms: SuppressionFactorCacheMs::default(),
    },
}));

rl.run_cleanup_loop();

let key = RedisKey::try_from("user_123".to_string()).unwrap();
let rate = RateLimit::try_from(10.0).unwrap();
let decision = rl.redis().suppressed().inc(&key, &rate, 1).await.unwrap();

match decision {
    RateLimitDecision::Allowed => {
        // proceed normally
    }
    RateLimitDecision::Suppressed {
        suppression_factor,
        is_allowed: true,
    } => {
        let _ = suppression_factor;
        // admitted while suppression is active
    }
    RateLimitDecision::Suppressed {
        suppression_factor,
        is_allowed: false,
    } => {
        let _ = suppression_factor;
        // denied by suppression
    }
    RateLimitDecision::Rejected {
        window_size_seconds,
        retry_after_ms,
        remaining_after_waiting,
    } => {
        // denied by the hard cutoff; retry_after_ms is the backoff hint
        let _ = window_size_seconds;
        let _ = remaining_after_waiting;
        let _ = retry_after_ms;
    }
}

Concurrency notes

  • A local suppressed inc(...) call is not a single atomic operation across threads; under contention, concurrent callers can temporarily push usage above the target.
  • Redis operations run as atomic Lua scripts, but a suppressed inc(...) call is composed of multiple script executions (observed increment, suppression-factor computation, accepted increment), so the call as a whole is not atomic.