Reliability Controls

This guide provides command-first setup for request protection and failure handling: concurrency limits, retries, and circuit breakers.

Before you begin

  • Complete the Getting Started guide
  • Run two or more workers (recommended) so that retries can fail over to another worker

1. Concurrency and Queue Limits

Start with bounded concurrency and queueing:

smg \
  --worker-urls http://w1:8000 http://w2:8000 \
  --max-concurrent-requests 100 \
  --queue-size 200 \
  --queue-timeout-secs 30

Optionally, set a token refill rate with --rate-limit-tokens-per-second:

smg \
  --worker-urls http://w1:8000 http://w2:8000 \
  --max-concurrent-requests 100 \
  --rate-limit-tokens-per-second 100 \
  --queue-size 200 \
  --queue-timeout-secs 30

2. Retries

Enable retries with explicit backoff settings:

smg \
  --worker-urls http://w1:8000 http://w2:8000 \
  --retry-max-retries 5 \
  --retry-initial-backoff-ms 50 \
  --retry-max-backoff-ms 30000 \
  --retry-backoff-multiplier 1.5 \
  --retry-jitter-factor 0.2
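
To get a feel for what these values mean in practice, the sketch below prints the delay before each retry, ignoring jitter. It assumes the conventional exponential backoff formula, delay = min(initial × multiplier^n, max); this guide does not state the exact formula smg applies, and backoff_schedule is a hypothetical helper, not an smg command.

```shell
#!/bin/sh
# Hypothetical helper (not part of smg): print the pre-jitter delay before each retry.
# Assumed formula: delay_n = min(initial * multiplier^n, max), with n starting at 0.
backoff_schedule() {
  # $1 = max retries, $2 = initial backoff (ms), $3 = max backoff (ms), $4 = multiplier
  awk -v n="$1" -v init="$2" -v cap="$3" -v mult="$4" 'BEGIN {
    delay = init
    for (i = 1; i <= n; i++) {
      if (delay > cap) delay = cap        # clamp to the configured maximum
      printf "retry %d: %.3f ms\n", i, delay
      delay = delay * mult                # grow the delay for the next attempt
    }
  }'
}

# Values taken from the retry command above.
backoff_schedule 5 50 30000 1.5
```

Under that assumption, the settings above give delays of 50, 75, 112.5, 168.75, and 253.125 ms, so the 30000 ms cap is never reached within five retries. Calling the helper with the Production Baseline values (3, 50, 5000, 2.0) gives 50, 100, and 200 ms.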

Disable retries when the client already handles them:

smg \
  --worker-urls http://w1:8000 http://w2:8000 \
  --disable-retries

3. Circuit Breakers

Protect traffic from repeatedly failing workers:

smg \
  --worker-urls http://w1:8000 http://w2:8000 \
  --cb-failure-threshold 10 \
  --cb-success-threshold 3 \
  --cb-timeout-duration-secs 60 \
  --cb-window-duration-secs 120
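
The four flags map naturally onto the conventional three-state circuit breaker (closed, open, half-open). The sketch below illustrates that general pattern only; it assumes failures are counted within the window, the breaker stays open for the timeout, and a run of successes closes it again. The guide does not confirm that smg follows these exact transitions, and cb_next_state is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical model of a conventional circuit breaker, NOT smg's implementation.
# Thresholds mirror the flags in the command above.
FAILURE_THRESHOLD=10   # --cb-failure-threshold
SUCCESS_THRESHOLD=3    # --cb-success-threshold
TIMEOUT_SECS=60        # --cb-timeout-duration-secs

cb_next_state() {
  # $1 = current state (closed | open | half-open)
  # $2 = failures counted in the current state (for closed: within the window)
  # $3 = successes counted in the current state
  # $4 = seconds spent in the current state
  state="$1"; failures="$2"; successes="$3"; secs="$4"
  case "$state" in
    closed)
      # Trip once enough failures accumulate.
      if [ "$failures" -ge "$FAILURE_THRESHOLD" ]; then echo open; else echo closed; fi ;;
    open)
      # Reject traffic until the timeout expires, then allow trial requests.
      if [ "$secs" -ge "$TIMEOUT_SECS" ]; then echo half-open; else echo open; fi ;;
    half-open)
      # Any failure reopens; enough successes close the breaker.
      if [ "$failures" -gt 0 ]; then echo open
      elif [ "$successes" -ge "$SUCCESS_THRESHOLD" ]; then echo closed
      else echo half-open; fi ;;
    *)
      echo "unknown state: $state" >&2; return 1 ;;
  esac
}
```

For example, `cb_next_state closed 10 0 0` prints `open`, and `cb_next_state open 0 0 60` prints `half-open`.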

Disable the circuit breaker only for controlled testing:

smg \
  --worker-urls http://w1:8000 http://w2:8000 \
  --disable-circuit-breaker

Production Baseline

A practical starting profile:

smg \
  --worker-urls http://w1:8000 http://w2:8000 http://w3:8000 \
  --max-concurrent-requests 150 \
  --queue-size 300 \
  --queue-timeout-secs 30 \
  --retry-max-retries 3 \
  --retry-initial-backoff-ms 50 \
  --retry-max-backoff-ms 5000 \
  --retry-backoff-multiplier 2.0 \
  --retry-jitter-factor 0.2 \
  --cb-failure-threshold 10 \
  --cb-success-threshold 3 \
  --cb-timeout-duration-secs 60 \
  --cb-window-duration-secs 120

Verify

curl http://localhost:30000/health
curl http://localhost:30000/workers

If metrics are enabled, inspect the reliability metrics at the /metrics endpoint.
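
For scripted checks, the two curl commands can be wrapped so that an HTTP error status produces a non-zero exit code. This is a minimal sketch: check_endpoint and BASE_URL are hypothetical names, and the address is the one used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: report whether an endpoint responds without an HTTP error.
# BASE_URL defaults to the address used in the Verify commands; override it as needed.
BASE_URL="${BASE_URL:-http://localhost:30000}"

check_endpoint() {
  # -f: exit non-zero on HTTP status >= 400
  # -sS: suppress progress output but still print errors
  # --max-time 5: give up after five seconds
  if curl -fsS --max-time 5 -o /dev/null "$BASE_URL$1"; then
    echo "ok   $1"
  else
    echo "FAIL $1"
    return 1
  fi
}
```

A typical call is `check_endpoint /health && check_endpoint /workers`, which stops at the first failing endpoint.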


Next Steps