Concepts

Sliding Windows & Coalescing

How usage is tracked over time.

Trypema uses a sliding window: admission decisions consider activity in the last window_size_seconds.

To reduce overhead, increments are coalesced into buckets using rate_group_size_ms. Larger values reduce the number of buckets, which lowers CPU and memory cost, but make timing metadata coarser.

Both options are validated newtypes:

use trypema::{RateGroupSizeMs, WindowSizeSeconds};

let window_size_seconds = WindowSizeSeconds::try_from(60).unwrap();
let rate_group_size_ms = RateGroupSizeMs::try_from(10).unwrap();

// Validation
assert!(WindowSizeSeconds::try_from(0).is_err()); // must be >= 1
assert!(RateGroupSizeMs::try_from(0).is_err());   // must be >= 1

Sliding window (what it means)

Unlike fixed windows (which reset at boundaries), a sliding window continuously looks back over the last N seconds. This avoids boundary artifacts such as "double-dipping", where a client spends a full quota just before a minute boundary and another full quota just after it.

At any moment, the limiter effectively asks:

"How many requests have we recorded for this key in the last window_size_seconds?"

Bucket coalescing

Requests are not stored individually. Instead, they are recorded into time buckets.

rate_group_size_ms controls how aggressively nearby increments are merged. Small values (1-10 ms) produce more buckets and more accurate timing hints; large values (50-100 ms) reduce overhead and memory at the cost of coarser timing.

In short, coalescing trades precision for performance and memory.
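
As a rough illustration of the idea, the sketch below merges an increment into the newest bucket when it arrives within the group size of that bucket's start, and opens a new bucket otherwise. This is one plausible scheme, not Trypema's actual internals; `CoalescedBuckets` and `coalesce` are hypothetical names, and timestamps are assumed to be non-decreasing.

```rust
use std::collections::VecDeque;

/// Hypothetical bucket store, for illustration only.
struct CoalescedBuckets {
    group_ms: u64,
    /// (bucket start timestamp in ms, number of increments merged into it)
    buckets: VecDeque<(u64, u64)>,
}

impl CoalescedBuckets {
    fn new(group_ms: u64) -> Self {
        Self { group_ms, buckets: VecDeque::new() }
    }

    /// Record one increment at `now_ms`.
    fn record(&mut self, now_ms: u64) {
        let group_ms = self.group_ms;
        if let Some((start, count)) = self.buckets.back_mut() {
            // Close enough to the newest bucket: merge instead of allocating.
            if now_ms.saturating_sub(*start) < group_ms {
                *count += 1;
                return;
            }
        }
        self.buckets.push_back((now_ms, 1));
    }
}

/// Feed a series of timestamps and return (bucket count, total increments).
fn coalesce(timestamps_ms: &[u64], group_ms: u64) -> (usize, u64) {
    let mut store = CoalescedBuckets::new(group_ms);
    for &t in timestamps_ms {
        store.record(t);
    }
    let total = store.buckets.iter().map(|&(_, count)| count).sum();
    (store.buckets.len(), total)
}
```

For increments at 0, 3, 9, 10, and 25 ms, a group size of 1 ms keeps five buckets, 10 ms keeps three, and 100 ms keeps one. The total count is five in every case; only the timing detail is lost.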

How this affects `retry_after_ms`

When a request is rejected, Trypema returns backoff metadata, including retry_after_ms. With aggressive coalescing, a single bucket stands in for every increment merged into it, so the "oldest bucket" timing becomes coarser and retry_after_ms is less precise.
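
To make the precision loss concrete, here is a minimal sketch that assumes the hint is the time until the oldest bucket leaves the window, and that a bucket's start timestamp represents every increment merged into it. Both are assumptions for illustration; the formula and the function names are hypothetical and are not taken from Trypema's implementation.

```rust
/// Assumed hint: time in ms until the oldest bucket falls out of the window.
/// Returns 0 if the bucket has already expired.
fn retry_after_ms(oldest_bucket_start_ms: u64, window_ms: u64, now_ms: u64) -> u64 {
    (oldest_bucket_start_ms + window_ms).saturating_sub(now_ms)
}

/// Under the assumption that an increment is merged only if it arrives less
/// than `group_ms` after the bucket start, the newest increment in a bucket
/// can be at most `group_ms - 1` ms younger than the bucket's timestamp.
fn max_hint_error_ms(group_ms: u64) -> u64 {
    group_ms.saturating_sub(1)
}
```

For example, with a 10 000 ms window and a bucket that starts at 0 ms, the hint at 9 900 ms is 100 ms. If an increment at 150 ms was merged into that bucket, that increment does not actually age out until 10 150 ms, which is 150 ms later than the hint suggests. With 200 ms coalescing the gap stays below 200 ms; with 10 ms coalescing it stays below 10 ms.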

Worked example

A 10-second window, a limit of 0.5 req/s, and 200 ms coalescing:

use trypema::{RateGroupSizeMs, RateLimit, WindowSizeSeconds};

let window_size_seconds = WindowSizeSeconds::try_from(10).unwrap();
let rate_limit = RateLimit::try_from(0.5).unwrap();
let rate_group_size_ms = RateGroupSizeMs::try_from(200).unwrap();

// Capacity over the sliding window is window_size_seconds * rate_limit:
// 10 s * 0.5 req/s = 5 requests in the window.
// The tuple below only silences unused-variable warnings in this snippet.
let _ = (window_size_seconds, rate_limit, rate_group_size_ms);
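
The arithmetic behind this example can be checked with plain numbers. The sketch below is independent of Trypema; `window_capacity` and `max_buckets_in_window` are hypothetical helpers, and the bucket bound assumes at most one bucket is opened per rate_group_size_ms interval.

```rust
/// Requests allowed across the whole window: seconds * requests per second.
fn window_capacity(window_size_seconds: u64, rate_limit: f64) -> f64 {
    window_size_seconds as f64 * rate_limit
}

/// Upper bound on live buckets per key, assuming bucket starts are at least
/// `rate_group_size_ms` apart (rounded up; `rate_group_size_ms` must be >= 1).
fn max_buckets_in_window(window_size_seconds: u64, rate_group_size_ms: u64) -> u64 {
    let window_ms = window_size_seconds * 1000;
    (window_ms + rate_group_size_ms - 1) / rate_group_size_ms
}
```

For the values above, the capacity is 10 * 0.5 = 5 requests and the bound is 10 000 / 200 = 50 buckets. Under the same assumption, a 60-second window with 10 ms coalescing could hold up to 6 000 buckets per key, which shows why larger group sizes reduce memory.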

Recommended starting point

Start with window_size_seconds = 60 and rate_group_size_ms = 10, then tune: lower rate_group_size_ms if you need more precise timing hints, or raise it if you need less overhead.

Rate Limits

Requests per second, stored as f64.

Decisions