A production-grade, distributed rate limiting service built with Spring Boot and Redis. Supports three algorithms, dynamic per-client configuration, atomic Redis Lua scripts, and real-time observability via Prometheus and Grafana.
Live Demo: https://rate-limiter-service-y5d7.onrender.com/swagger-ui/index.html
Docker Hub: https://hub.docker.com/repository/docker/bhaveshlohana/rate-limiter-service
Rate limiting is a critical component of any production API — it protects services from abuse, ensures fair usage across clients, and prevents cascading failures under high load. This project implements rate limiting as a standalone service that any backend application can integrate with via REST API or as a Spring Boot Starter dependency.
Key features:
- Three rate limiting algorithms — Fixed Window, Sliding Window Log, Token Bucket
- Atomic Redis Lua scripts — eliminates race conditions under concurrent load
- Dynamic configuration — change limits per client type without restarting
- Default config fallback — unknown client types fall back to a DEFAULT policy
- Admin API — manage configs and inspect client state at runtime
- Real-time observability — Prometheus metrics + Grafana dashboards
- Plug and play — use as a REST service, or embed via the `@RateLimit` annotation with the Spring Boot Starter
rate-limiter/
├── rate-limiter-core/ ← shared algorithms, models, Redis logic
├── rate-limiter-service/ ← standalone REST service
└── rate-limiter-spring-boot-starter/ ← embeddable Spring Boot library
rate-limiter-core is a shared library consumed by both the service and the starter — no logic duplication.
┌─────────────────────────────────────────────────────────┐
│ Your Application │
│ │
│ Mode 1: POST /api/rate-limiter/check │
│ Mode 2: @RateLimit(clientType = "PREMIUM") │
│ via Spring Boot Starter │
└─────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Rate Limiter Service │
│ │
│ RateLimiterController │
│ │ │
│ ▼ │
│ RateLimiterFactory ──── ClientConfigService │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Algorithm │ │ Config │ │
│ │ Selection │ │ Lookup │ │
│ └─────────────┘ └─────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────┐ │
│ │ Redis Lua Script │ │
│ │ (atomic read-check-write) │ │
│ └──────────────────────────────────┘ │
└─────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Redis │
│ │
│ ratelimit:config:PREMIUM ← client configs │
│ ratelimit:fixed:user123 ← algorithm state │
│ ratelimit:token:user456 ← algorithm state │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Observability Stack │
│ │
│ /actuator/prometheus ──► Prometheus ──► Grafana │
└─────────────────────────────────────────────────────────┘
| Algorithm | Memory | Accuracy | Burst Handling | Best For |
|---|---|---|---|---|
| Fixed Window | Low | Low (boundary burst) | Allows boundary burst | Simple APIs, low traffic |
| Sliding Window Log | High | High | Smooth, no bursts | Strict rate limiting |
| Token Bucket | Low | High | Controlled burst | Most production use cases |
Divides time into fixed buckets. Counts requests per bucket. Resets when the window expires.
Redis structure: STRING — integer counter with TTL
Known limitation: Boundary burst — a client can make 2x requests at window boundaries
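As a sketch of the logic (an illustrative in-memory analogue; the class name is hypothetical, and in the actual service the counter lives in Redis as a STRING whose TTL performs the window reset):

```java
import java.util.HashMap;
import java.util.Map;

// In-memory sketch of the Fixed Window check. Not thread-safe; the real
// service gets atomicity from a Redis Lua script instead.
public class FixedWindowSketch {
    private final int limit;
    private final long windowMillis;
    private final Map<String, Integer> counters = new HashMap<>();

    public FixedWindowSketch(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Returns true if the request at nowMillis is allowed.
    public boolean allow(String clientId, long nowMillis) {
        // The bucket key changes every windowMillis; in Redis the same effect
        // comes from letting the counter expire with the window's TTL.
        String key = clientId + ":" + (nowMillis / windowMillis);
        int count = counters.merge(key, 1, Integer::sum);
        return count <= limit;
    }

    public static void main(String[] args) {
        FixedWindowSketch limiter = new FixedWindowSketch(2, 1000);
        // Boundary burst in action: 4 requests allowed within ~100ms even
        // though the limit is 2 per second, because the bucket resets at t=1000.
        System.out.println(limiter.allow("user123", 900));  // true (bucket 0: 1/2)
        System.out.println(limiter.allow("user123", 999));  // true (bucket 0: 2/2)
        System.out.println(limiter.allow("user123", 1000)); // true (bucket 1: 1/2)
        System.out.println(limiter.allow("user123", 1001)); // true (bucket 1: 2/2)
    }
}
```

The demo in `main` shows exactly the boundary-burst limitation: back-to-back requests straddling the window edge all pass.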
Stores a timestamp log of every request in a Sorted Set. On each request, evicts entries older than the window and counts what remains.
Redis structure: ZSET — members are UUIDs, scores are timestamps
Known limitation: Memory heavy for high-traffic clients
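A sketch of the same idea in plain Java (illustrative only; the class name is hypothetical, and the real service keeps the log in a Redis ZSET where the eviction step corresponds to `ZREMRANGEBYSCORE`):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// In-memory sketch of the Sliding Window Log check for one client.
// Memory grows with traffic: one log entry per request in the window.
public class SlidingWindowLogSketch {
    private final int limit;
    private final long windowMillis;
    private final Deque<Long> log = new ArrayDeque<>(); // request timestamps

    public SlidingWindowLogSketch(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public boolean allow(long nowMillis) {
        // Evict entries older than the window (ZREMRANGEBYSCORE in Redis).
        while (!log.isEmpty() && log.peekFirst() <= nowMillis - windowMillis) {
            log.pollFirst();
        }
        if (log.size() >= limit) {
            return false; // window is full
        }
        log.addLast(nowMillis);
        return true;
    }

    public static void main(String[] args) {
        SlidingWindowLogSketch limiter = new SlidingWindowLogSketch(2, 1000);
        System.out.println(limiter.allow(900));  // true
        System.out.println(limiter.allow(999));  // true
        System.out.println(limiter.allow(1000)); // false (no boundary burst)
        System.out.println(limiter.allow(1901)); // true (the 900 entry aged out)
    }
}
```

Note the contrast with Fixed Window: the same burst straddling t=1000 is rejected, because the window slides with the request instead of resetting on a fixed boundary.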
Each client has a bucket that refills at a fixed rate. Each request consumes one token. Allows bursts up to bucket capacity while enforcing an average rate.
Redis structure: HASH — tokens and lastRefillTime
Best for: Most real-world rate limiting scenarios
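The refill arithmetic can be sketched in plain Java (an illustrative in-memory analogue with a hypothetical class name; in the actual service, `tokens` and `lastRefillTime` live in a Redis HASH and equivalent logic runs inside a Lua script):

```java
// In-memory sketch of the Token Bucket check for one client.
// O(1) state per client: just a token count and a last-refill timestamp.
public class TokenBucketSketch {
    private final double capacity;            // burst limit
    private final double refillRatePerSecond; // sustained rate
    private double tokens;
    private long lastRefillMillis;

    public TokenBucketSketch(double capacity, double refillRatePerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillRatePerSecond = refillRatePerSecond;
        this.tokens = capacity; // bucket starts full
        this.lastRefillMillis = nowMillis;
    }

    public boolean allow(long nowMillis) {
        // Lazy refill: credit tokens for elapsed time, capped at capacity.
        double elapsedSeconds = (nowMillis - lastRefillMillis) / 1000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillRatePerSecond);
        lastRefillMillis = nowMillis;
        if (tokens >= 1.0) {
            tokens -= 1.0; // each request consumes one token
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // capacity 2, refill 1 token/second
        TokenBucketSketch limiter = new TokenBucketSketch(2, 1.0, 0);
        System.out.println(limiter.allow(0));    // true  (burst: 2 -> 1 tokens)
        System.out.println(limiter.allow(0));    // true  (1 -> 0 tokens)
        System.out.println(limiter.allow(0));    // false (bucket empty)
        System.out.println(limiter.allow(1500)); // true  (1.5 tokens refilled)
    }
}
```

This is why the bucket allows bursts up to `capacity` while the long-run average stays at `refillRatePerSecond`.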
All three algorithms use Redis Lua scripts for atomic execution. The read-check-write cycle executes as a single Redis operation, eliminating race conditions under concurrent load. Both naive (non-atomic) and atomic implementations are available for comparison.
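To make the race concrete, here is a deterministic, single-threaded simulation (illustrative only; the names are hypothetical) of the interleaving that a naive implementation permits and an atomic Lua script rules out:

```java
// Sketch of the lost-update race in a naive (non-atomic) rate limit check.
// Interleaving simulated: A reads, B reads, A checks+writes, B checks+writes.
public class RaceSketch {
    // Returns true if BOTH simulated requests were allowed.
    static boolean bothAllowed(int limit, int counter) {
        int readByA = counter;              // A: read the counter
        int readByB = counter;              // B: read the same (soon stale) value
        boolean allowedA = readByA < limit; // A: check passes
        boolean allowedB = readByB < limit; // B: check also passes on the stale read
        counter = readByA + 1;              // A: write back
        counter = readByB + 1;              // B: write back (A's update is lost)
        return allowedA && allowedB;
    }

    public static void main(String[] args) {
        // One slot left (9 of 10 used), yet both concurrent requests pass.
        System.out.println(bothAllowed(10, 9)); // prints "true"
    }
}
```

With the Lua script, the read, check, and write run as a single Redis command, so no other client can interleave between the read and the write and this outcome cannot occur.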
- Docker and Docker Compose
- Java 21 (for local development)
```bash
git clone https://github.com/bhaveshlohana/rate-limiter-service
cd rate-limiter-service
docker compose up -d
```

This starts:
- Rate Limiter Service on http://localhost:8080
- Redis on localhost:6379
- Prometheus on http://localhost:9090
- Grafana on http://localhost:3000 (admin/admin)
```bash
docker run -p 8080:8080 \
  -e SPRING_DATA_REDIS_HOST=host.docker.internal \
  -e SPRING_DATA_REDIS_PORT=6379 \
  bhaveshlohana/rate-limiter-service:latest
```

Or run locally with Maven (requires Redis running on localhost:6379):

```bash
./mvnw spring-boot:run
```
`POST /api/rate-limiter/check`

```json
{
  "clientId": "user123",
  "clientType": "PREMIUM"
}
```

Responses:
- `200 OK` — request allowed
- `429 Too Many Requests` — rate limit exceeded

```json
{
  "allowed": true,
  "reason": "Request allowed",
  "remainingRequests": 47
}
```

`POST /api/admin/config`

```json
{
  "clientType": "PREMIUM",
  "algorithm": "TOKEN_BUCKET",
  "capacity": 500,
  "refillRatePerSecond": 10.0
}
```

- `GET /api/admin/config/{clientType}`
- `GET /api/admin/config`
- `DELETE /api/admin/config/{clientType}`
- `GET /api/admin/status?clientId=user123&clientType=PREMIUM`

```json
{
  "clientId": "user123",
  "clientType": "PREMIUM",
  "algorithm": "TOKEN_BUCKET",
  "currentTokens": 487.5,
  "remainingRequests": 487
}
```

Client configurations are stored dynamically in Redis. No restart is required to update limits.
| Field | Type | Required For | Description |
|---|---|---|---|
| `clientType` | String | All | Identifier for the client type |
| `algorithm` | Enum | All | `FIXED_WINDOW`, `SLIDING_WINDOW`, `TOKEN_BUCKET` |
| `limit` | Integer | Fixed/Sliding Window | Max requests per window |
| `windowSizeSeconds` | Integer | Fixed/Sliding Window | Window duration in seconds |
| `capacity` | Integer | Token Bucket | Max bucket size (burst limit) |
| `refillRatePerSecond` | Double | Token Bucket | Tokens added per second |
A DEFAULT config is seeded on startup and applies to any unknown client type:
```json
{
  "clientType": "DEFAULT",
  "algorithm": "FIXED_WINDOW",
  "limit": 10,
  "windowSizeSeconds": 60
}
```

```bash
# Anonymous users — strict
curl -X POST http://localhost:8080/api/admin/config \
  -H "Content-Type: application/json" \
  -d '{
    "clientType": "ANONYMOUS",
    "algorithm": "FIXED_WINDOW",
    "limit": 10,
    "windowSizeSeconds": 60
  }'

# Registered users — moderate
curl -X POST http://localhost:8080/api/admin/config \
  -H "Content-Type: application/json" \
  -d '{
    "clientType": "REGISTERED",
    "algorithm": "SLIDING_WINDOW",
    "limit": 100,
    "windowSizeSeconds": 60
  }'

# Premium users — generous burst
curl -X POST http://localhost:8080/api/admin/config \
  -H "Content-Type: application/json" \
  -d '{
    "clientType": "PREMIUM",
    "algorithm": "TOKEN_BUCKET",
    "capacity": 500,
    "refillRatePerSecond": 10.0
  }'
```

Any service can integrate by calling the /check endpoint before processing a request:
```java
RestTemplate restTemplate = new RestTemplate();
RateLimitRequest request = new RateLimitRequest("user123", "PREMIUM");

ResponseEntity<RateLimitResponse> response = restTemplate.postForEntity(
    "http://rate-limiter-service/api/rate-limiter/check",
    request,
    RateLimitResponse.class
);

// Note: RestTemplate's default error handler throws HttpClientErrorException
// on 4xx responses, so this check only fires if you configure an error
// handler that returns the response (or catch TooManyRequests instead).
if (response.getStatusCode() == HttpStatus.TOO_MANY_REQUESTS) {
    throw new RateLimitExceededException();
}
```

Add the dependency to your Spring Boot project:

```xml
<dependency>
    <groupId>com.bhavesh.learn</groupId>
    <artifactId>rate-limiter-spring-boot-starter</artifactId>
    <version>1.0.0</version>
</dependency>
```

Configure client types in application.yml:
```yaml
rate-limiter:
  configs:
    - clientType: DEFAULT
      algorithm: FIXED_WINDOW
      limit: 60
      windowSizeSeconds: 60
    - clientType: PREMIUM
      algorithm: TOKEN_BUCKET
      capacity: 100
      refillRatePerSecond: 10.0
```

Annotate your endpoints:

```java
@RateLimit(clientType = "PREMIUM")
@GetMapping("/api/data")
public ResponseEntity<?> getData() {
    return ResponseEntity.ok(data);
}
```

When the rate limit is exceeded, the starter automatically returns 429 Too Many Requests — no additional configuration needed.
Metrics are exposed at /actuator/prometheus and scraped by Prometheus every 5 seconds.
| Metric | Labels | Description |
|---|---|---|
| `ratelimit_request_total` | `clientType`, `algorithm`, `result` | Total requests checked |
Key queries:

```promql
# Request rate per second
rate(ratelimit_request_total[1m])

# Rejection rate by client type
rate(ratelimit_request_total{result="rejected"}[1m])

# Allowed vs rejected
ratelimit_request_total
```
Import the dashboard from grafana/dashboard.json or connect Grafana to your Prometheus instance.
Load tested with k6 — 10 concurrent users per algorithm, 30 seconds each. The test was run twice to check consistency.
| Metric | Run 1 | Run 2 |
|---|---|---|
| Total requests | 8,550 | 8,620 |
| Throughput | ~85 req/sec | ~86 req/sec |
| Rejection rate | 90.17% | 90.25% |
| Avg response time | 4.29ms | 3.49ms |
| p(95) response time | 7.8ms | 5.26ms |
| Max response time | 27.99ms | 26.95ms |
| All checks passed | ✅ 100% | ✅ 100% |
Results were consistent across both runs. All responses returned within 200ms under load with no errors or timeouts across all three algorithms.
Why Redis for config storage?
Configs and rate limit state share the same Redis instance — no extra infrastructure. Config changes are reflected immediately without restarts.
Why Lua scripts for atomicity?
Redis executes Lua scripts atomically — the entire read-check-write cycle runs as a single operation. This eliminates the race condition where two concurrent requests both read the same counter value and both get allowed when only one slot remains. Both naive and atomic implementations are provided for comparison.
Why fail closed on missing config?
If a client type has no config and no DEFAULT exists, requests are rejected. A rate limiter is a security boundary — unknown clients should not get unlimited access by default.
Why Token Bucket for most use cases?
Fixed Window allows boundary bursts. Sliding Window is memory-heavy at scale. Token Bucket provides accurate rate limiting with controlled burst support at O(1) memory per client.
Why a Spring Boot Starter?
The starter allows any Spring Boot application to add rate limiting with a single dependency and annotation — no REST calls, no manual wiring. It auto-configures all beans and seeds client configs from application.yml on startup.
- `KEYS *` used in `getAllConfigs()` — blocks Redis on large keyspaces. Production replacement: use `SCAN` for incremental iteration.
- No authentication on admin endpoints — add Spring Security before production use.
- Render free tier cold starts — app spins down after 15 minutes of inactivity, causing ~30s delay on first request.
- Single Redis instance — no Redis Cluster support. For high availability, configure Redis Sentinel or Cluster.
- Java 21 + Spring Boot 3.4
- Redis — rate limit state and config storage
- Lua Scripts — atomic Redis operations
- Prometheus + Grafana — observability
- Docker + Docker Compose — containerization
- GitHub Actions — CI/CD
- Render — cloud deployment
```bash
# run all tests across all modules
./mvnw test

# run tests for a specific module
./mvnw test -pl rate-limiter-service
./mvnw test -pl rate-limiter-core
```

Tests use embedded Redis — no external dependencies required.