Why Rate Limiting Matters
Without rate limiting, your API is vulnerable to:
- Brute force attacks: Millions of password guesses
- Credential stuffing: Testing stolen credentials
- Data scraping: Extracting your entire database
- Resource exhaustion: Expensive queries overloading servers
- Cost exploitation: Running up your API/compute bills
Rate Limiting Strategies
1. Fixed Window
```
Limit: 100 requests per minute
Window: 00:00–00:59, 01:00–01:59, ...

00:00–00:30: 90 requests ← OK
00:30–00:59: 10 requests ← hits the limit
00:59: request blocked
01:00: counter resets, requests allowed again
```

Pros: Simple, memory-efficient. Cons: Bursts at window boundaries (a client can send up to 2× the limit straddling a reset).
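The counter logic above fits in a few lines. A minimal in-memory sketch (names like `fixedWindowAllow` are mine, not from any library; `now` is injectable for testing):

```ts
// Minimal fixed-window counter: one counter per (key, window) pair.
const counters = new Map<string, number>()

function fixedWindowAllow(key: string, limit: number, windowMs: number, now = Date.now()): boolean {
  // All requests in the same window share one bucket id.
  const windowId = Math.floor(now / windowMs)
  const bucketKey = `${key}:${windowId}`
  const count = (counters.get(bucketKey) ?? 0) + 1
  counters.set(bucketKey, count)
  return count <= limit
}

// 101 requests in one window: exactly 100 get through.
let allowed = 0
for (let i = 0; i < 101; i++) {
  if (fixedWindowAllow('ip:1.2.3.4', 100, 60_000, i)) allowed++
}
// A new window resets the counter.
const afterReset = fixedWindowAllow('ip:1.2.3.4', 100, 60_000, 60_000)
```

Note the boundary problem is visible here: nothing stops 100 requests at 00:59 followed by 100 more at 01:00.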
2. Sliding Window
```
Limit: 100 requests per minute
Check: count requests in the 60 seconds before the current time

00:30: 50 requests in the last 60 s → allowed
00:45: 95 requests in the last 60 s → allowed
00:50: 100 requests in the last 60 s → blocked
```

Pros: Smoother limiting, no boundary bursts. Cons: More memory and computation (a timestamp per request, or an approximation).
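An exact sliding window stores a timestamp per request. A cheaper approximation, used by libraries such as Upstash's `slidingWindow`, keeps only two fixed-window counters and weights the previous one by how much of it still overlaps the sliding window (function and variable names here are illustrative):

```ts
// Sliding-window counter approximation: previous + current fixed-window
// counts, with the previous window weighted by its remaining overlap.
function slidingEstimate(
  prevCount: number,  // requests in the previous fixed window
  currCount: number,  // requests so far in the current window
  elapsedMs: number,  // time elapsed in the current window
  windowMs: number
): number {
  const overlap = (windowMs - elapsedMs) / windowMs
  return prevCount * overlap + currCount
}

// 15 s into a 60 s window: 80 old requests still count at 75 % weight.
const estimate = slidingEstimate(80, 30, 15_000, 60_000) // 80 * 0.75 + 30 = 90
const blocked = estimate >= 100
```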
3. Token Bucket
```
Bucket: 10 tokens
Refill: 1 token per second

Request 1: 9 tokens remaining
Request 2: 8 tokens remaining
...
Request 10: 0 tokens, wait for refill
(1 second passes): 1 token available
```

Pros: Allows short bursts while enforcing a smooth average rate. Cons: More complex to implement.
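The refill logic above can be sketched directly. A minimal in-memory version (the `Bucket` shape and function names are assumptions, not any library's API):

```ts
// Minimal token bucket: refills continuously, allows bursts up to capacity.
interface Bucket { tokens: number; lastRefillMs: number }

function tokenBucketAllow(
  bucket: Bucket,
  capacity: number,
  refillPerSec: number,
  now = Date.now()
): boolean {
  // Refill proportionally to elapsed time, capped at capacity.
  const elapsedSec = (now - bucket.lastRefillMs) / 1000
  bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSec * refillPerSec)
  bucket.lastRefillMs = now
  if (bucket.tokens < 1) return false
  bucket.tokens -= 1
  return true
}

const bucket: Bucket = { tokens: 10, lastRefillMs: 0 }
let burst = 0
for (let i = 0; i < 12; i++) {
  if (tokenBucketAllow(bucket, 10, 1, 0)) burst++ // burst of 10, then empty
}
const afterOneSecond = tokenBucketAllow(bucket, 10, 1, 1000) // 1 token refilled
```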
Implementing Rate Limiting
Next.js with Upstash
```bash
npm install @upstash/ratelimit @upstash/redis
```

```ts
// lib/ratelimit.ts
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'

export const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, '10 s'), // 10 requests per 10 seconds
  analytics: true,
})
```
```ts
// middleware.ts
import { ratelimit } from '@/lib/ratelimit'
import { NextResponse, type NextRequest } from 'next/server'

export async function middleware(request: NextRequest) {
  const ip = request.ip ?? '127.0.0.1'
  const { success, limit, reset, remaining } = await ratelimit.limit(ip)

  if (!success) {
    return new NextResponse('Too Many Requests', {
      status: 429,
      headers: {
        'X-RateLimit-Limit': limit.toString(),
        'X-RateLimit-Remaining': remaining.toString(),
        'X-RateLimit-Reset': reset.toString(),
      },
    })
  }

  return NextResponse.next()
}

export const config = {
  matcher: '/api/:path*',
}
```
Express with express-rate-limit
```ts
import rateLimit from 'express-rate-limit'

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // Limit each IP to 100 requests per window
  standardHeaders: true, // Send RateLimit-* headers
  legacyHeaders: false, // Disable legacy X-RateLimit-* headers
  message: { error: 'Too many requests, please try again later.' },
})

app.use('/api/', limiter)

// Stricter limit for auth endpoints
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5, // Only 5 login attempts per 15 minutes
  skipSuccessfulRequests: true, // Don't count successful logins
})

app.use('/api/auth/', authLimiter)
```
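express-rate-limit keys requests by IP by default; its `keyGenerator` option lets authenticated routes limit per user instead. A sketch of the key function only (assuming your auth middleware attaches `req.user`; the shape is hypothetical):

```ts
// Key selection for express-rate-limit's keyGenerator option:
// per-user when authenticated, per-IP otherwise.
function limiterKey(req: { user?: { id: string }; ip?: string }): string {
  return req.user?.id ?? req.ip ?? 'unknown'
}

// Wire it up as: rateLimit({ windowMs: 60_000, max: 100, keyGenerator: limiterKey })
const authedKey = limiterKey({ user: { id: 'u_42' }, ip: '1.2.3.4' })
const anonKey = limiterKey({ ip: '1.2.3.4' })
```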
In-Memory (Development/Simple Cases)
```ts
// Simple in-memory rate limiter (single process only; state is lost on restart)
const requests = new Map()

function rateLimit(key, limit, windowMs) {
  const now = Date.now()
  const windowStart = now - windowMs

  // Get or initialize the request log for this key
  let log = requests.get(key) || []

  // Drop entries that fell out of the window
  log = log.filter(timestamp => timestamp > windowStart)

  // Check the limit
  if (log.length >= limit) {
    return { allowed: false, remaining: 0 }
  }

  // Record the current request
  log.push(now)
  requests.set(key, log)

  return { allowed: true, remaining: limit - log.length }
}

// Usage
export async function POST(req) {
  const ip = req.headers.get('x-forwarded-for') || 'unknown'
  const { allowed, remaining } = rateLimit(ip, 10, 60000) // 10 requests per minute

  if (!allowed) {
    return Response.json({ error: 'Rate limit exceeded' }, { status: 429 })
  }

  // Process request
}
```
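One caveat with the map above: keys for idle clients are never deleted, so memory grows with every distinct IP ever seen. A periodic sweep keeps it bounded (a sketch; `sweep` is my name, and the demo map is illustrative):

```ts
// Periodic sweep: drop keys whose entries are all older than the window.
function sweep(requests: Map<string, number[]>, windowMs: number, now = Date.now()): number {
  let removed = 0
  for (const [key, log] of requests) {
    const recent = log.filter(ts => ts > now - windowMs)
    if (recent.length === 0) {
      requests.delete(key)
      removed++
    } else {
      requests.set(key, recent)
    }
  }
  return removed
}

// In production you might run: setInterval(() => sweep(requests, 60_000), 60_000)
const demo = new Map([['stale', [0]], ['fresh', [99_000]]])
const removed = sweep(demo, 60_000, 100_000) // 'stale' is dropped, 'fresh' kept
```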
Endpoint-Specific Limits
Different endpoints need different limits:
```ts
const limits = {
  // Authentication - very strict
  '/api/auth/login': { requests: 5, window: '15m' },
  '/api/auth/register': { requests: 3, window: '1h' },
  '/api/auth/reset-password': { requests: 3, window: '1h' },

  // Standard API - moderate
  '/api/users': { requests: 100, window: '1m' },
  '/api/posts': { requests: 100, window: '1m' },

  // Expensive operations - strict
  '/api/export': { requests: 5, window: '1h' },
  '/api/search': { requests: 30, window: '1m' },
  '/api/ai/generate': { requests: 10, window: '1m' },

  // Public endpoints - lenient
  '/api/health': { requests: 1000, window: '1m' },
}
```
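A table like this needs a lookup with a default for unlisted paths; longest-prefix matching keeps a specific rule like `/api/auth/login` from being shadowed by a broader one. A sketch (the `defaultLimit` and the trimmed-down table are assumptions for the example):

```ts
type Limit = { requests: number; window: string }

const limits: Record<string, Limit> = {
  '/api/auth/login': { requests: 5, window: '15m' },
  '/api/users': { requests: 100, window: '1m' },
}

const defaultLimit: Limit = { requests: 60, window: '1m' }

// Longest matching prefix wins; unknown paths fall back to the default.
function limitFor(path: string): Limit {
  const match = Object.keys(limits)
    .filter(prefix => path === prefix || path.startsWith(prefix + '/'))
    .sort((a, b) => b.length - a.length)[0]
  return match ? limits[match] : defaultLimit
}

const login = limitFor('/api/auth/login') // strict: 5 per 15m
const unknown = limitFor('/api/metrics')  // falls back to the default
```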
Rate Limiting by User vs IP
IP-Based (Default)
```ts
const identifier = request.ip
```

Good for: unauthenticated endpoints, login pages. Problem: shared IPs (offices, schools) get rate limited together.
User-Based
```ts
const session = await getServerSession()
const identifier = session?.user.id || request.ip
```

Good for: authenticated endpoints, per-user quotas. Problem: attackers can create multiple accounts.
Combined
```ts
// Apply separate limits for the IP and the user
const ipLimit = await ratelimit.limit(`ip:${ip}`)
const userLimit = session
  ? await ratelimit.limit(`user:${session.user.id}`)
  : { success: true }

if (!ipLimit.success || !userLimit.success) {
  return new Response('Rate limited', { status: 429 })
}
```
Response Headers
Always include rate limit headers:
```ts
return new Response(data, {
  headers: {
    'X-RateLimit-Limit': '100',
    'X-RateLimit-Remaining': '95',
    'X-RateLimit-Reset': '1640000000', // Unix timestamp when the window resets
    'Retry-After': '60', // Seconds until reset
  },
})
```

Handling Rate Limit Errors
Client-Side
```ts
async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url, options)

    if (response.status === 429) {
      // Respect the server's Retry-After header, defaulting to 60 seconds
      const retryAfter = Number(response.headers.get('Retry-After') ?? 60)
      await new Promise(resolve => setTimeout(resolve, retryAfter * 1000))
      continue
    }

    return response
  }
  throw new Error('Rate limit exceeded after retries')
}
```
Server-Side Error Response
```ts
if (!rateLimitResult.success) {
  const retryAfter = Math.ceil((rateLimitResult.reset - Date.now()) / 1000)

  return Response.json(
    {
      error: 'Rate limit exceeded',
      message: 'Too many requests. Please try again later.',
      retryAfter,
    },
    {
      status: 429,
      headers: {
        'Retry-After': retryAfter.toString(),
      },
    }
  )
}
```

Rate Limiting Checklist
CRITICAL ENDPOINTS
==================
[ ] Login: 5 attempts per 15 minutes
[ ] Registration: 3 per hour
[ ] Password reset: 3 per hour
[ ] 2FA verification: 5 per 15 minutes

STANDARD API
============
[ ] Default limit on all endpoints
[ ] Per-endpoint customization where needed
[ ] Both IP and user-based limits
EXPENSIVE OPERATIONS
====================
[ ] Export/download: Strict limits
[ ] Search: Moderate limits
[ ] AI/LLM calls: Per-user quotas
IMPLEMENTATION
==============
[ ] Rate limit headers in responses
[ ] Proper 429 status code
[ ] Retry-After header
[ ] Client-side retry logic
The Bottom Line
Rate limiting is insurance against abuse. Every API needs it; AI-generated code almost never adds it unprompted.
No rate limiting = unlimited attack attempts. Add limits on day one.