System Design Deep Dive: API Gateway Patterns at Scale
Introduction
Once more than one service sits behind your infrastructure, every client request has to reach the right one. An API gateway is the single entry point that handles routing, authentication, rate limiting, caching, logging, and protocol translation, so your individual services don't have to.
This guide walks through the most important API gateway patterns with concrete implementation examples.
What an API Gateway Does
Client → API Gateway → Service A
                     → Service B
                     → Service C
Responsibilities:
- Request routing — route /users to user-service, /orders to order-service
- Authentication & authorization — validate JWT before the request hits any service
- Rate limiting — protect services from abuse
- Load balancing — distribute requests across service instances
- Circuit breaking — stop cascading failures
- Request/response transformation — aggregate multiple service responses into one
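The routing responsibility can be sketched as a prefix lookup. This is a minimal illustration, not a full router: the `ROUTES` table, the upstream URLs, and the `resolveUpstream` name are all assumptions made for the example.

```typescript
// Hypothetical route table: prefixes and upstream URLs are illustrative only.
interface Route {
  prefix: string;   // path prefix the gateway matches on
  upstream: string; // base URL of the service that owns the prefix
}

const ROUTES: Route[] = [
  { prefix: '/users', upstream: 'http://user-service:3001' },
  { prefix: '/orders', upstream: 'http://order-service:3002' },
];

// Returns the upstream for a pathname (no query string), or undefined if
// no route matches. The longest prefix wins, and a match must end on a
// segment boundary so that '/users-export' does not resolve to '/users'.
function resolveUpstream(path: string, routes: Route[] = ROUTES): string | undefined {
  const match = routes
    .filter(r => path === r.prefix || path.startsWith(r.prefix + '/'))
    .sort((a, b) => b.prefix.length - a.prefix.length)[0];
  return match?.upstream;
}
```

A gateway would typically call something like this once per request and answer 404 when the result is `undefined`.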
Pattern 1: JWT Validation at the Gateway
Instead of each service validating tokens independently, centralise auth at the gateway:
// gateway/middleware/auth.ts (Express + jsonwebtoken)
import jwt from 'jsonwebtoken';
import type { Request, Response, NextFunction } from 'express';

const PUBLIC_ROUTES = ['/auth/login', '/auth/register', '/health'];

export function authMiddleware(req: Request, res: Response, next: NextFunction) {
  // Never trust identity headers sent by the client; only the gateway sets them
  delete req.headers['x-user-id'];
  delete req.headers['x-user-role'];

  if (PUBLIC_ROUTES.includes(req.path)) return next();

  // Expect "Authorization: Bearer <token>"
  const [scheme, token] = req.headers.authorization?.split(' ') ?? [];
  if (scheme !== 'Bearer' || !token) {
    return res.status(401).json({ error: 'No token provided' });
  }

  try {
    // Pin the algorithm so the token cannot choose how it is verified
    // (HS256 is assumed here because JWT_SECRET is a shared secret)
    const payload = jwt.verify(token, process.env.JWT_SECRET!, { algorithms: ['HS256'] }) as {
      userId: string;
      role: string;
    };
    // Inject user context as headers for downstream services
    req.headers['x-user-id'] = payload.userId;
    req.headers['x-user-role'] = payload.role;
    next();
  } catch {
    return res.status(401).json({ error: 'Invalid token' });
  }
}
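Downstream services can now rely on the injected headers instead of verifying the JWT themselves, provided they are reachable only through the gateway and the gateway discards any x-user-* headers supplied by the client.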
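Trusting x-user-id and x-user-role is only safe if a client cannot supply those headers itself. Below is a minimal sketch of a standalone helper that removes them from an incoming header map before the gateway adds its own values. The `stripIdentityHeaders` name and the `HeaderMap` type are hypothetical and not part of Express.

```typescript
// Identity headers follow the x-user-* convention used by the middleware above.
const IDENTITY_HEADERS = ['x-user-id', 'x-user-role'];

// Hypothetical header map type, loosely modelled on Node's incoming headers.
type HeaderMap = Record<string, string | string[] | undefined>;

// Returns a copy of the headers without any client-supplied identity headers.
// Header names are compared case-insensitively.
function stripIdentityHeaders(headers: HeaderMap): HeaderMap {
  const clean: HeaderMap = {};
  for (const [headerName, value] of Object.entries(headers)) {
    if (!IDENTITY_HEADERS.includes(headerName.toLowerCase())) {
      clean[headerName] = value;
    }
  }
  return clean;
}
```

Running this before token verification means a forged `X-User-Id: admin` on a public route never reaches a downstream service.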
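Pattern 2: Fixed-Window Rate Limiting with Redis
// gateway/middleware/rateLimit.ts
import Redis from 'ioredis';
import type { Request, Response, NextFunction } from 'express';

const redis = new Redis(process.env.REDIS_URL!);
const WINDOW_SECONDS = 60;
const MAX_REQUESTS = 100;

export async function rateLimitMiddleware(req: Request, res: Response, next: NextFunction) {
  // x-user-id is only set if authMiddleware ran first; otherwise fall back to the client IP
  const key = `ratelimit:${req.headers['x-user-id'] ?? req.ip}`;

  const current = await redis.incr(key);
  if (current === 1) {
    // First request in this window: start the expiry clock
    await redis.expire(key, WINDOW_SECONDS);
  }

  res.setHeader('X-RateLimit-Limit', MAX_REQUESTS);
  res.setHeader('X-RateLimit-Remaining', Math.max(0, MAX_REQUESTS - current));

  if (current > MAX_REQUESTS) {
    // TTL can be negative if the key has no expiry, so clamp to at least 1 second
    const retryAfter = Math.max(1, await redis.ttl(key));
    res.setHeader('Retry-After', retryAfter);
    return res.status(429).json({ error: 'Too Many Requests', retryAfter });
  }
  next();
}
This is a fixed-window counter, not a token bucket or a sliding window: every request in the same 60-second window increments one key, so a client can send up to twice the limit in a short burst that straddles a window boundary. INCR and EXPIRE are also two separate commands here; if the gateway crashes between them the key never expires, so production code typically runs both in a Lua script or a MULTI transaction. For more precise rate limiting, keep a sliding-window log in a sorted set (ZADD each request timestamp, ZREMRANGEBYSCORE to drop expired entries, ZCARD to count what remains).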
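The sorted-set approach mentioned above can be sketched in memory to show the logic without a Redis dependency. This is an illustration under assumptions: the `SlidingWindowLimiter` class is hypothetical, the comments only indicate which Redis command each step would roughly correspond to, and a real gateway with several instances would still need shared storage.

```typescript
// In-memory sliding-window log. The default limit and window are illustrative.
class SlidingWindowLimiter {
  // Per-key list of accepted request timestamps in ms, oldest first.
  private readonly log = new Map<string, number[]>();

  constructor(
    private readonly maxRequests = 100,
    private readonly windowMs = 60_000,
  ) {}

  // Reports whether a request at time `now` is within the limit and,
  // if so, records it. Rejected requests are not recorded.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have left the window (roughly ZREMRANGEBYSCORE).
    const recent = (this.log.get(key) ?? []).filter(t => t > cutoff);
    if (recent.length >= this.maxRequests) { // roughly ZCARD
      this.log.set(key, recent);
      return false;
    }
    recent.push(now); // roughly ZADD
    this.log.set(key, recent);
    return true;
  }
}
```

Because the window is measured back from each request rather than from a fixed boundary, the burst at a window edge that a fixed-window counter allows cannot happen here. The cost is memory proportional to the number of requests per key.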
Pattern 3: Circuit Breaker
Prevent one failing service from taking down the entire system:
// gateway/lib/circuitBreaker.ts
type State = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class CircuitBreaker {
  private state: State = 'CLOSED';
  private failureCount = 0;
  private lastFailureTime = 0;

  constructor(
    private readonly threshold = 5,           // failures before opening
    private readonly recoveryTimeout = 30_000 // ms before trying again
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime > this.recoveryTimeout) {
        // Recovery timeout elapsed: let a trial request through
        this.state = 'HALF_OPEN';
      } else {
        // Fail fast without touching the unhealthy service
        throw new Error('Circuit is OPEN: service unavailable');
      }
    }
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      throw err;
    }
  }

  private onSuccess() {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  private onFailure() {
    this.failureCount++;
    this.lastFailureTime = Date.now();
    // A failed trial in HALF_OPEN re-opens at once: the count is still at the threshold
    if (this.failureCount >= this.threshold) {
      this.state = 'OPEN';
      console.warn('Circuit OPENED: too many failures');
    }
  }
}

export const orderServiceBreaker = new CircuitBreaker(5, 30_000);
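The state changes in the class above can also be read as a small state machine. The sketch below restates them as a pure function so each transition is easy to inspect. The `BreakerSnapshot` type, the `BreakerEvent` names, and the `transition` function are hypothetical and written only for this illustration.

```typescript
type BreakerState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';
type BreakerEvent = 'success' | 'failure' | 'timeout_elapsed';

interface BreakerSnapshot {
  state: BreakerState;
  failures: number;
}

// Pure transition function: returns the next snapshot for a given event.
// `threshold` mirrors the constructor default used above.
function transition(b: BreakerSnapshot, event: BreakerEvent, threshold = 5): BreakerSnapshot {
  switch (event) {
    case 'success':
      // Any successful call closes the circuit and clears the failure count.
      return { state: 'CLOSED', failures: 0 };
    case 'failure': {
      const failures = b.failures + 1;
      // Open once the count reaches the threshold; otherwise keep the state.
      return { state: failures >= threshold ? 'OPEN' : b.state, failures };
    }
    case 'timeout_elapsed':
      // Only an OPEN circuit moves to HALF_OPEN; other states are unchanged.
      return b.state === 'OPEN' ? { ...b, state: 'HALF_OPEN' } : b;
  }
}
```

Five consecutive failures take a CLOSED breaker to OPEN, the recovery timeout moves it to HALF_OPEN, and the next call decides between CLOSED (success) and OPEN again (failure).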
Pattern 4: Backend for Frontend (BFF)
Instead of one generic gateway, create specialised gateways per client type:
Mobile App → Mobile BFF → aggregates lightweight responses
Web App → Web BFF → fetches full page data in one call
Admin Panel → Admin BFF → includes sensitive fields
// bff/web/routes/dashboard.ts
router.get('/dashboard', authMiddleware, async (req, res, next) => {
  const userId = req.headers['x-user-id'] as string;
  try {
    // Fetch from all three services in parallel
    const [user, orders, notifications] = await Promise.all([
      userService.getById(userId),
      orderService.getRecent(userId, 5),
      notificationService.getUnread(userId),
    ]);
    // Shape response for the web client
    res.json({
      user: { name: user.name, avatar: user.avatar },
      recentOrders: orders.map(o => ({ id: o.id, total: o.total, status: o.status })),
      unreadCount: notifications.length,
    });
  } catch (err) {
    // Promise.all rejects as soon as any call fails; hand the error to Express
    next(err);
  }
});
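The difference between the three BFFs comes down to which fields each one exposes. The sketch below shows that idea on one record. In practice each BFF would be its own service with its own shaping code, so the single `shapeUser` function, the `UserRecord` fields, and the `ClientType` union are assumptions made purely to put the three shapes side by side.

```typescript
// Hypothetical user record and client types, for illustration only.
interface UserRecord {
  id: string;
  name: string;
  avatar: string;
  email: string;
  lastLoginIp: string;
}

type ClientType = 'mobile' | 'web' | 'admin';

// Each BFF returns only the fields its client needs.
function shapeUser(user: UserRecord, client: ClientType): Partial<UserRecord> {
  switch (client) {
    case 'mobile':
      // Lightweight payload for constrained networks
      return { id: user.id, name: user.name };
    case 'web':
      // Adds presentation fields the page renders
      return { id: user.id, name: user.name, avatar: user.avatar };
    case 'admin':
      // Includes sensitive fields; a copy so callers cannot mutate the source
      return { ...user };
  }
}
```

Keeping sensitive fields out of the mobile and web shapes at the BFF layer means a client bug cannot leak data the response never contained.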
Choosing the Right Tool
| Tool | Best For |
|---|---|
| Nginx | Simple reverse proxy, static rate limiting |
| Kong | Plugin ecosystem, enterprise features |
| AWS API Gateway | Serverless backends, fully managed (no servers to operate) |
| Custom Node/Go | Full control, BFF patterns |
| Traefik | Kubernetes-native, automatic service discovery |
Conclusion
A well-designed API gateway is invisible to your users but critical to your architecture's resilience. Start with routing and auth, add rate limiting and circuit breaking as you scale, and consider BFF patterns when clients have divergent needs.
Related: See the post on microservices event-driven architecture for how to handle async communication between services.