System Design Deep Dive: API Gateway Patterns at Scale

February 18, 2026 · 14 min read

An API gateway is the front door of your microservices architecture. Learn rate limiting, authentication offloading, request routing, and circuit breaking — with real implementation patterns.

Introduction

Once you run more than one service, every client request has to reach the right one. An API gateway is the single entry point that handles routing, authentication, rate limiting, caching, logging, and protocol translation, so your individual services don't have to.

This guide walks through the most important API gateway patterns with concrete implementation examples.

What an API Gateway Does

Client → API Gateway → Service A
                     → Service B
                     → Service C

Responsibilities:

Request routing — route /users to user-service, /orders to order-service
Authentication & authorization — validate JWT before the request hits any service
Rate limiting — protect services from abuse
Load balancing — distribute requests across service instances
Circuit breaking — stop cascading failures
Request/response transformation — aggregate multiple service responses into one
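
As an illustration of the routing responsibility, here is a minimal sketch of a prefix-based route table. The `Route` type, the `resolveUpstream` function, and the upstream hostnames and ports are assumptions made for this example; a real gateway would usually load them from configuration or service discovery.

```typescript
// Hypothetical route table: path prefix -> upstream base URL.
// Hostnames and ports are placeholders, not values from a real deployment.
type Route = { prefix: string; upstream: string };

const ROUTES: Route[] = [
  { prefix: '/users', upstream: 'http://user-service:3001' },
  { prefix: '/orders', upstream: 'http://order-service:3002' },
];

// Returns the upstream for a request path, or undefined if nothing matches.
// A prefix only matches on a segment boundary ('/users' matches '/users/42'
// but not '/users-export'), and the longest matching prefix wins.
function resolveUpstream(path: string, routes: Route[] = ROUTES): string | undefined {
  const match = routes
    .filter(r => path === r.prefix || path.startsWith(r.prefix + '/'))
    .sort((a, b) => b.prefix.length - a.prefix.length)[0];
  return match?.upstream;
}
```

Matching on segment boundaries avoids sending `/users-export` to the user service by accident, which a plain `startsWith` check would do.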

Pattern 1: JWT Validation at the Gateway

Instead of each service validating tokens independently, centralise auth at the gateway:

// gateway/middleware/auth.ts (Express + jsonwebtoken)
import jwt from 'jsonwebtoken';
import type { Request, Response, NextFunction } from 'express';

const PUBLIC_ROUTES = ['/auth/login', '/auth/register', '/health'];

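export function authMiddleware(req: Request, res: Response, next: NextFunction) {
  // Drop any client-supplied identity headers so they cannot be spoofed
  delete req.headers['x-user-id'];
  delete req.headers['x-user-role'];

  if (PUBLIC_ROUTES.includes(req.path)) return next();

  // Expect "Authorization: Bearer <token>"
  const [scheme, token] = req.headers.authorization?.split(' ') ?? [];
  if (scheme !== 'Bearer' || !token) return res.status(401).json({ error: 'No token provided' });

  try {
    // Pin the accepted algorithm (HS256 is assumed here because a shared secret is used)
    const payload = jwt.verify(token, process.env.JWT_SECRET!, { algorithms: ['HS256'] }) as { userId: string; role: string };
    // Inject user context as headers for downstream services
    req.headers['x-user-id'] = payload.userId;
    req.headers['x-user-role'] = payload.role;
    next();
  } catch {
    return res.status(401).json({ error: 'Invalid token' });
  }
}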

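Downstream services can now rely on the injected headers instead of verifying the JWT themselves, provided they are reachable only through the gateway and the gateway discards any client-supplied x-user-id and x-user-role headers.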
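
On the service side, reading that context could look roughly like the sketch below. `UserContext`, `HeaderBag`, and `userContextFromHeaders` are hypothetical names introduced for this example; the only thing taken from the gateway code is the pair of header names.

```typescript
type UserContext = { userId: string; role: string };
// Same shape as Node's incoming HTTP headers: a value may be missing or repeated.
type HeaderBag = Record<string, string | string[] | undefined>;

// Reads the identity headers injected by the gateway. Returns null when either
// header is missing, empty, or repeated, so the caller can respond with 401.
function userContextFromHeaders(headers: HeaderBag): UserContext | null {
  const userId = headers['x-user-id'];
  const role = headers['x-user-role'];
  if (typeof userId !== 'string' || typeof role !== 'string') return null;
  if (userId === '' || role === '') return null;
  return { userId, role };
}
```

Treating a missing header as an error rather than as an anonymous user makes a misrouted request fail loudly instead of running without an identity.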

Pattern 2: Fixed-Window Rate Limiting with Redis

// gateway/middleware/rateLimit.ts
import Redis from 'ioredis';
import type { Request, Response, NextFunction } from 'express';

const redis = new Redis(process.env.REDIS_URL!);

const WINDOW_SECONDS = 60;
const MAX_REQUESTS = 100;

export async function rateLimitMiddleware(req: Request, res: Response, next: NextFunction) {
  const key = `ratelimit:${req.headers['x-user-id'] ?? req.ip}`;

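  // Count this request; INCR creates the key with value 1 if it does not exist
  const current = await redis.incr(key);
  if (current === 1) {
    // First hit in this window: start the expiry clock.
    // INCR and EXPIRE are two separate commands, so this pair is not atomic.
    await redis.expire(key, WINDOW_SECONDS);
  }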

  res.setHeader('X-RateLimit-Limit', MAX_REQUESTS);
  res.setHeader('X-RateLimit-Remaining', Math.max(0, MAX_REQUESTS - current));

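  if (current > MAX_REQUESTS) {
    // Seconds until the window resets (at least 1, in case the key has no TTL)
    const retryAfter = Math.max(1, await redis.ttl(key));
    res.setHeader('Retry-After', retryAfter);
    return res.status(429).json({
      error: 'Too Many Requests',
      retryAfter,
    });
  }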

  next();
}

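This is a fixed-window counter: every key gets one counter that resets when its TTL expires, so a client can send up to twice the limit in a short burst that straddles a window boundary. For more precise rate limiting, use a sliding-window log backed by a sorted set (ZADD to record each request, ZREMRANGEBYSCORE to evict old entries, ZCARD to count what remains).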
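
A sliding-window log keeps one timestamp per accepted request and counts only those that fall inside the most recent window. The sketch below keeps the log in memory so the logic is easy to follow and test; `SlidingWindowLimiter` and its parameters are illustrative names, not part of the gateway code above. With Redis, the commented steps would roughly correspond to ZREMRANGEBYSCORE, ZCARD, and ZADD on one sorted set per key.

```typescript
// In-memory sliding-window log. Each key maps to the timestamps (in ms) of its
// accepted requests, the same data a Redis sorted set would hold.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if it fits within the limit.
  // `now` is injectable so the behaviour can be tested deterministically.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Evict timestamps that have left the window (ZREMRANGEBYSCORE in Redis)
    const recent = (this.hits.get(key) ?? []).filter(t => t > cutoff);
    // Count what is left (ZCARD) and record the request if allowed (ZADD)
    const allowed = recent.length < this.maxRequests;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

Because rejected requests are not recorded, a client that keeps retrying while limited does not extend its own lockout. An in-memory map only works for a single gateway instance; shared state such as Redis is needed once there are several.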

Pattern 3: Circuit Breaker

Prevent one failing service from taking down the entire system:

// gateway/lib/circuitBreaker.ts
type State = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class CircuitBreaker {
  private state: State = 'CLOSED';
  private failureCount = 0;
  private lastFailureTime = 0;

  constructor(
    private readonly threshold = 5,       // failures before opening
    private readonly recoveryTimeout = 30000 // ms before trying again
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime > this.recoveryTimeout) {
        this.state = 'HALF_OPEN';
      } else {
        throw new Error('Circuit is OPEN — service unavailable');
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      throw err;
    }
  }

  private onSuccess() {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  private onFailure() {
    this.failureCount++;
    this.lastFailureTime = Date.now();
    if (this.failureCount >= this.threshold) {
      this.state = 'OPEN';
      console.warn('Circuit OPENED — too many failures');
    }
  }
}

export const orderServiceBreaker = new CircuitBreaker(5, 30_000);
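
When the circuit is open, `call` throws immediately, so the caller has to decide what to return instead. One option is a fallback value, sketched below. `BreakerLike` and `callWithFallback` are hypothetical helpers written for this example; they assume only that the breaker exposes a `call` method with the signature shown above.

```typescript
// Minimal shape of a breaker: anything with a compatible `call` method works.
interface BreakerLike {
  call<T>(fn: () => Promise<T>): Promise<T>;
}

// Runs `fn` through the breaker and resolves to `fallback` if the call fails
// or if the breaker rejects it because the circuit is open.
async function callWithFallback<T>(
  breaker: BreakerLike,
  fn: () => Promise<T>,
  fallback: T,
): Promise<T> {
  try {
    return await breaker.call(fn);
  } catch {
    return fallback;
  }
}
```

A route handler could then call something like `callWithFallback(orderServiceBreaker, () => orderService.getRecent(userId, 5), [])` and render an empty order list while the order service recovers. Swallowing the error is a design choice: it suits optional data, but not writes or anything the client must know has failed.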

Pattern 4: Backend for Frontend (BFF)

Instead of one generic gateway, create specialised gateways per client type:

Mobile App  → Mobile BFF  → aggregates lightweight responses
Web App     → Web BFF     → fetches full page data in one call
Admin Panel → Admin BFF   → includes sensitive fields

// bff/web/routes/dashboard.ts
router.get('/dashboard', authMiddleware, async (req, res) => {
  const userId = req.headers['x-user-id'] as string;

  // Parallel fetch from multiple services
  const [user, orders, notifications] = await Promise.all([
    userService.getById(userId),
    orderService.getRecent(userId, 5),
    notificationService.getUnread(userId),
  ]);

  // Shape response for the web client
  res.json({
    user: { name: user.name, avatar: user.avatar },
    recentOrders: orders.map(o => ({ id: o.id, total: o.total, status: o.status })),
    unreadCount: notifications.length,
  });
});
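
Note that `Promise.all` rejects as soon as any one of the calls rejects, so in the route above a failing notification service would fail the whole dashboard. A minimal sketch of one way to degrade instead is shown below; `orDefault` and `loadDashboard` are hypothetical helpers, and the fetchers are passed in as parameters only to keep the example self-contained.

```typescript
// Resolves to the promise's value, or to `fallback` if the promise rejects.
function orDefault<T>(promise: Promise<T>, fallback: T): Promise<T> {
  return promise.catch(() => fallback);
}

type Order = { id: string; total: number; status: string };

// Fetches both pieces of data in parallel. A failure in either one is replaced
// by an empty list instead of rejecting the whole dashboard.
async function loadDashboard(
  fetchOrders: () => Promise<Order[]>,
  fetchUnread: () => Promise<unknown[]>,
) {
  const [orders, notifications] = await Promise.all([
    orDefault(fetchOrders(), []),
    orDefault(fetchUnread(), []),
  ]);
  return { recentOrders: orders, unreadCount: notifications.length };
}
```

Which calls get a default is a product decision: an empty notification count is usually acceptable, while a missing user record probably should still fail the request.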

Choosing the Right Tool

Tool	Best For
Nginx	Simple reverse proxy, static rate limiting
Kong	Plugin ecosystem, enterprise features
AWS API Gateway	Serverless backends, fully managed
Custom Node/Go	Full control, BFF patterns
Traefik	Kubernetes-native, automatic service discovery

Conclusion

A well-designed API gateway is invisible to your users but critical to your architecture's resilience. Start with routing and auth, add rate limiting and circuit breaking as you scale, and consider BFF patterns when clients have divergent needs.

Related: See the post on microservices event-driven architecture for how to handle async communication between services.