High Availability

SMG supports high-availability cluster deployments using mesh networking for fault tolerance, scalability, and zero-downtime updates.


Overview

Fault Tolerance

Continue serving requests when individual router nodes fail. Automatic failover with zero manual intervention.

Scalability

Distribute load across multiple router instances. Add nodes without downtime.

State Synchronization

Share worker states, policy configurations, and rate limits across the cluster in real-time.

Zero Downtime Updates

Perform rolling updates without service interruption. Graceful shutdown with request draining.


Mesh Architecture

[Figure: SMG Mesh Architecture]

Gossip Protocol

SWIM-based protocol for membership and failure detection.

  • 1-second heartbeat interval
  • Automatic peer discovery
  • Failure detection in seconds

Cluster Coordination

Node coordination for cluster operations.

  • Membership tracking
  • Node status management
  • Graceful shutdown coordination

CRDT Stores

Conflict-free Replicated Data Types for eventual consistency.

  • No coordination locks
  • Partition tolerant
  • Automatic conflict resolution

State Replication

Real-time synchronization of all cluster state.

  • Worker registry
  • Rate limit counters
  • Cache-aware routing trees

Configuration

Command Line Options

Flag                 Default   Description
--enable-mesh        false     Enable mesh networking for HA deployments
--mesh-server-name   (auto)    Unique identifier for this node in the cluster
--mesh-host          0.0.0.0   Host address for mesh communication
--mesh-port          39527     Port for mesh gRPC communication
--mesh-peer-urls     (none)    Initial peer URLs for cluster bootstrap

Basic Configuration

Node 1 (Bootstrap)

smg --enable-mesh \
    --mesh-server-name node1 \
    --mesh-host 0.0.0.0 \
    --mesh-port 39527 \
    --host 0.0.0.0 \
    --port 8000

Node 2 (Join)

smg --enable-mesh \
    --mesh-server-name node2 \
    --mesh-port 39527 \
    --mesh-peer-urls "node1:39527" \
    --host 0.0.0.0 \
    --port 8000

Node 3 (Join)

smg --enable-mesh \
    --mesh-server-name node3 \
    --mesh-port 39527 \
    --mesh-peer-urls "node1:39527,node2:39527" \
    --host 0.0.0.0 \
    --port 8000

Environment Variables

export SMG_ENABLE_MESH=true
export SMG_MESH_SERVER_NAME=node1
export SMG_MESH_HOST=0.0.0.0
export SMG_MESH_PORT=39527
export SMG_MESH_PEER_URLS="node1:39527,node2:39527"
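
As a minimal sketch of how a deployment or validation script might consume these variables, the snippet below reads them from a mapping and splits the peer list into (host, port) pairs. The helper names `parse_peer_urls` and `load_mesh_config` are hypothetical, and the fallback values are assumptions that mirror the flag defaults listed under Command Line Options. SMG parses these variables itself; this is only a way to sanity-check a configuration before starting a node.

```python
import os


def parse_peer_urls(raw):
    """Split "host:port,host:port" into a list of (host, port) tuples."""
    peers = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries, e.g. an unset variable
        host, _, port = item.rpartition(":")
        peers.append((host, int(port)))
    return peers


def load_mesh_config(env=None):
    """Build a config dict from SMG_* variables (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        "enabled": env.get("SMG_ENABLE_MESH", "false").lower() == "true",
        "server_name": env.get("SMG_MESH_SERVER_NAME"),  # None: left to SMG
        "host": env.get("SMG_MESH_HOST", "0.0.0.0"),
        "port": int(env.get("SMG_MESH_PORT", "39527")),
        "peers": parse_peer_urls(env.get("SMG_MESH_PEER_URLS", "")),
    }
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise in tests without touching the real environment.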

Gossip Protocol

State Synchronization

SMG uses a SWIM-based gossip protocol for cluster membership and state propagation:

  1. Ping/Ping-Req: Each node periodically pings a random peer; if the direct ping fails, it asks other peers to probe the target on its behalf (ping-req)
  2. State Sync: Healthy nodes exchange state information during pings
  3. Failure Detection: Unreachable nodes are marked as suspected, then down
  4. Broadcast: Status changes are broadcast to all cluster members
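
The probing steps above can be sketched as a single probe round. This is an illustrative model of generic SWIM-style probing, not SMG's implementation: the function `probe_round`, the callbacks `ping` and `ping_req`, and the helper count `k` are all assumptions, and real implementations usually add a suspicion timeout before declaring a node down.

```python
import random

ALIVE, SUSPECTED, DOWN = "ALIVE", "SUSPECTED", "DOWN"


def probe_round(peers, target, ping, ping_req, k=2, rng=random):
    """Probe `target` once and return the status transitions observed.

    ping(target) -> bool            direct probe
    ping_req(helper, target) -> bool  ask `helper` to probe `target`
    """
    if ping(target):
        return [ALIVE]

    # Direct ping failed: the target is suspected, not yet declared down.
    transitions = [SUSPECTED]

    # Ask up to k other peers to probe the target indirectly.
    others = [p for p in peers if p != target]
    helpers = rng.sample(others, min(k, len(others)))
    if any(ping_req(h, target) for h in helpers):
        transitions.append(ALIVE)  # an indirect probe got through
    else:
        transitions.append(DOWN)   # ping-req failed as well
    return transitions
```

The indirect step is what keeps a single flaky link between two nodes from marking a healthy peer as down.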

Node Status States

Status      Description
INIT        Node is starting up
ALIVE       Node is healthy and reachable
SUSPECTED   Node may be unreachable (failed ping)
DOWN        Node confirmed unreachable (failed ping-req)
LEAVING     Node is gracefully shutting down

Failure Detection Timing

Phase   Duration             Action
Ping    1s interval          Direct probe to peer
Down    After missed pings   Remove from active cluster

State Synchronization

Synchronized State Types

Worker Registry

All nodes share worker discovery and health status.

  • Worker URLs and metadata
  • Health check results
  • Circuit breaker states

Rate Limits

Cluster-wide rate limiting coordination.

  • Token bucket state
  • Request counters
  • Quota synchronization

Routing Trees

Cache-aware routing state shared across nodes.

  • Radix tree operations
  • Prefix match data
  • LRU eviction coordination

Policy State

Routing policy configuration and state.

  • Policy parameters
  • Load balancing weights
  • Session affinity mappings

CRDT Implementation

SMG uses several CRDT types for conflict-free synchronization:

CRDT Type      Used For         Merge Strategy
G-Counter      Request counts   Sum of all increments
PN-Counter     Token buckets    Sum of increments minus sum of decrements
LWW-Register   Worker state     Last-writer-wins by timestamp
OR-Set         Worker sets      Union with tombstones
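
To make the merge strategies concrete, here is a minimal sketch of three of these types in their textbook form. The class names and the tie-break by node name are assumptions for illustration; SMG's actual data structures may differ. Note that in the textbook G-Counter the value is read as the sum of per-node counts, while two replicas are merged by taking the per-node maximum, which is what makes merging safe to repeat.

```python
class GCounter:
    """Grow-only counter: one slot per node, merged by per-node maximum."""

    def __init__(self):
        self.counts = {}

    def increment(self, node, amount=1):
        self.counts[node] = self.counts.get(node, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        for node, n in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), n)


class PNCounter:
    """Counter with decrements: two G-Counters, value = increments - decrements."""

    def __init__(self):
        self.p, self.n = GCounter(), GCounter()

    def increment(self, node, amount=1):
        self.p.increment(node, amount)

    def decrement(self, node, amount=1):
        self.n.increment(node, amount)

    def value(self):
        return self.p.value() - self.n.value()

    def merge(self, other):
        self.p.merge(other.p)
        self.n.merge(other.n)


class LWWRegister:
    """Last-writer-wins register; ties are broken by node name (assumption)."""

    def __init__(self):
        self.value, self.stamp = None, (0, "")

    def set(self, value, timestamp, node):
        if (timestamp, node) > self.stamp:
            self.value, self.stamp = value, (timestamp, node)

    def merge(self, other):
        if other.stamp > self.stamp:
            self.value, self.stamp = other.value, other.stamp
```

Because every merge here is commutative and idempotent, replicas converge regardless of the order or number of gossip exchanges. The last-writer-wins register depends on comparable timestamps, which is why clock skew can surface as state inconsistency.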

Deployment Patterns

Three-Node Cluster (Minimum HA)

Characteristics

  • Tolerates 1 node failure
  • Quorum of 2 for leader election
  • Recommended for most deployments

Configuration

# All nodes
smg --enable-mesh \
    --mesh-peer-urls "node1:39527,node2:39527,node3:39527" \
    --worker-urls http://worker1:8000,http://worker2:8000

Five-Node Cluster (Higher Availability)

Characteristics

  • Tolerates 2 node failures
  • Quorum of 3 for leader election
  • Suitable for critical workloads

Configuration

# All nodes
smg --enable-mesh \
    --mesh-peer-urls "node1:39527,node2:39527,node3:39527,node4:39527,node5:39527" \
    --worker-urls http://worker1:8000,http://worker2:8000
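
The quorum sizes and failure tolerances quoted for the three- and five-node clusters follow from simple majority arithmetic. The sketch below assumes a strict-majority quorum; the function names are illustrative only.

```python
def quorum(n):
    """Smallest strict majority of an n-node cluster."""
    return n // 2 + 1


def tolerated_failures(n):
    """Nodes that can fail while a quorum is still reachable."""
    return n - quorum(n)


for n in (3, 4, 5):
    print(n, quorum(n), tolerated_failures(n))
```

A four-node cluster needs a quorum of 3 and still tolerates only one failure, so the extra node adds cost without adding tolerance.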

Kubernetes Deployment

StatefulSet Configuration

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: smg
spec:
  serviceName: smg-mesh
  replicas: 3
  selector:
    matchLabels:
      app: smg
  template:
    metadata:
      labels:
        app: smg
    spec:
      containers:
      - name: smg
        image: ghcr.io/lightseekorg/smg:latest
        args:
        - --enable-mesh
        - --mesh-server-name=$(POD_NAME)
        - --mesh-host=0.0.0.0
        - --mesh-port=39527
        - --mesh-peer-urls=smg-0.smg-mesh:39527,smg-1.smg-mesh:39527,smg-2.smg-mesh:39527
        - --worker-urls=$(WORKER_URLS)
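        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        # $(WORKER_URLS) in args is only expanded if the variable is defined here.
        # Example value; replace with your worker endpoints.
        - name: WORKER_URLS
          value: "http://worker1:8000,http://worker2:8000"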
        ports:
        - containerPort: 8000
          name: http
        - containerPort: 39527
          name: mesh

Headless Service

apiVersion: v1
kind: Service
metadata:
  name: smg-mesh
spec:
  clusterIP: None
  selector:
    app: smg
  ports:
  - port: 39527
    name: mesh

HA Management API

Health Endpoints

Endpoint       Method   Description
/ha/health     GET      Node health status
/ha/status     GET      Cluster status information
/ha/workers    GET      Worker states across cluster
/ha/policies   GET      Policy states across cluster
/ha/shutdown   POST     Graceful shutdown trigger

Cluster Status Response

{
  "node_name": "node1",
  "node_count": 3,
  "nodes": [
    {"name": "node1", "status": "ALIVE", "address": "node1:39527"},
    {"name": "node2", "status": "ALIVE", "address": "node2:39527"},
    {"name": "node3", "status": "ALIVE", "address": "node3:39527"}
  ],
  "stores": {
    "workers": {"entry_count": 5, "last_sync": "2024-01-15T10:30:00Z"},
    "policies": {"entry_count": 2, "last_sync": "2024-01-15T10:30:00Z"}
  }
}
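
A health check script can act on this response directly. The sketch below assumes the response has the shape shown above (`node_count`, and a `nodes` list with `name` and `status` fields); the function names are hypothetical, and the sample payload is trimmed from the example response.

```python
import json


def unhealthy_nodes(status):
    """Names of nodes whose status is anything other than ALIVE."""
    return [n["name"] for n in status["nodes"] if n["status"] != "ALIVE"]


def is_healthy(status, expected_nodes):
    """True when the cluster has the expected size and every node is ALIVE."""
    return status["node_count"] == expected_nodes and not unhealthy_nodes(status)


sample = json.loads("""
{
  "node_name": "node1",
  "node_count": 3,
  "nodes": [
    {"name": "node1", "status": "ALIVE", "address": "node1:39527"},
    {"name": "node2", "status": "ALIVE", "address": "node2:39527"},
    {"name": "node3", "status": "SUSPECTED", "address": "node3:39527"}
  ]
}
""")

print(unhealthy_nodes(sample), is_healthy(sample, expected_nodes=3))
```

Treating SUSPECTED as unhealthy is a conservative choice; a less strict check might only flag DOWN.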

Monitoring

Mesh Metrics

Metric                            Description
smg_mesh_peers_total              Number of connected peers
smg_mesh_peer_status              Status of each peer (1=alive, 0=down)
smg_mesh_sync_operations_total    State sync operations by type
smg_mesh_sync_latency_seconds     State sync latency histogram
smg_mesh_leader_elections_total   Leader election events
smg_mesh_gossip_messages_total    Gossip messages sent/received

Alerting Rules

groups:
- name: smg-mesh
  rules:
  - alert: SMGClusterDegraded
    expr: smg_mesh_peers_total < 2
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "SMG cluster has fewer than 3 nodes"

  - alert: SMGNodeDown
    expr: smg_mesh_peer_status == 0
    for: 30s
    labels:
      severity: critical
    annotations:
      summary: "SMG mesh node {{ $labels.peer }} is down"

Best Practices

Odd Node Counts

Use 3, 5, or 7 nodes so that any two-way network partition leaves exactly one side with a majority, which avoids split-brain.

Availability Zones

Distribute nodes across availability zones for resilience against zone failures.

Network Latency

Keep mesh nodes in the same region (< 10ms RTT) for optimal state sync performance.

Monitoring

Monitor smg_mesh_peers_total and alert when cluster size drops below threshold.


Troubleshooting

Common Issues

Symptom                     Cause                 Solution
Node stuck in INIT          Cannot reach peers    Check firewall rules for mesh port
Frequent leader elections   Network instability   Increase gossip timeouts
State inconsistency         Clock skew            Synchronize NTP across nodes
High sync latency           Large state           Increase sync interval

Debug Logging

RUST_LOG=smg::mesh=debug smg --enable-mesh ...

Verify Cluster Health

# Check cluster status (-s hides curl's progress meter when piping to jq)
curl -s http://node1:8000/ha/status | jq .

# Check individual node health
curl -s http://node1:8000/ha/health | jq .

# Check worker states
curl -s http://node1:8000/ha/workers | jq .

# Check policy states
curl -s http://node1:8000/ha/policies | jq .

What's Next?

Graceful Shutdown

Allow in-flight requests to complete during shutdown.

Circuit Breakers

Isolate failing workers to prevent cascade failures.

Metrics Reference

Complete list of mesh networking metrics.
