API Gateway Architecture: Complete Implementation Guide for Production Systems
Master API gateway architecture with this comprehensive production guide. Learn routing patterns, authentication strategies, rate limiting, caching, load balancing, and security best practices. Includes implementation examples for Kong, AWS API Gateway, Azure API Management, and NGINX with real-world production configurations, monitoring strategies, and performance optimization techniques.
Introduction
Your microservices architecture is growing. You now have 15 services, each implementing its own authentication, rate limiting, and logging. A security audit reveals three services using outdated JWT libraries. Performance testing shows duplicate calls because caching isn't standardized. Your ops team is drowning in configuration drift across services.
This is exactly why API gateways exist.
An API gateway acts as the single entry point for all client requests, centralizing cross-cutting concerns like authentication, rate limiting, caching, and routing. Instead of each service reimplementing these responsibilities, the gateway handles them once—consistently, efficiently, and securely.
The impact is substantial: teams adopting gateway patterns commonly report significantly less duplicated backend code, faster incident response (one choke point to inspect instead of fifteen services), and shorter time-to-market for new APIs.
This comprehensive guide covers API gateway architecture from fundamentals to production deployment, with real-world examples using Kong, AWS API Gateway, Azure API Management, and NGINX.
What is an API Gateway?
Core Responsibilities
An API gateway sits between clients and backend services, handling:
1. Request Routing
Client → API Gateway → Service Discovery → Backend Service
2. Authentication & Authorization
- JWT validation
- OAuth 2.0 flows
- API key management
- mTLS verification
3. Rate Limiting & Throttling
- Per-client limits
- Global quotas
- Burst handling
4. Response Transformation
- Protocol translation (REST ↔ gRPC)
- Response aggregation
- Data masking
5. Caching
- Response caching
- Cache invalidation
- Edge caching
6. Monitoring & Logging
- Request/response logging
- Metrics collection
- Distributed tracing
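These responsibilities compose into a per-request pipeline: authenticate, rate-limit, then route. A minimal sketch in Python (all names here are illustrative, not from any gateway product):

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    path: str
    headers: dict = field(default_factory=dict)

def check_auth(req):
    # 2. Authentication: reject requests without a bearer token
    return req.headers.get("Authorization", "").startswith("Bearer ")

def check_rate_limit(req, counts, limit=100):
    # 3. Rate limiting: per-client request counter
    client = req.headers.get("X-Client-Id", "anonymous")
    counts[client] = counts.get(client, 0) + 1
    return counts[client] <= limit

def route(req, routes):
    # 1. Routing: longest-prefix match against registered services
    matches = [p for p in routes if req.path.startswith(p)]
    return routes[max(matches, key=len)] if matches else None

def handle(req, routes, counts):
    if not check_auth(req):
        return 401, None
    if not check_rate_limit(req, counts):
        return 429, None
    upstream = route(req, routes)
    if upstream is None:
        return 404, None
    return 200, upstream  # 4-6. transform/cache/log would wrap this call

routes = {"/users": "user-service", "/orders": "order-service"}
counts = {}
ok = Request("/users/42", {"Authorization": "Bearer abc"})
print(handle(ok, routes, counts))                    # (200, 'user-service')
print(handle(Request("/users/42"), routes, counts))  # (401, None)
```

Real gateways implement the same chain as plugin phases; the point is that every concern runs once, in one place, for every service behind it.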
Gateway vs Reverse Proxy
| Feature | Reverse Proxy (NGINX) | API Gateway |
|---|---|---|
| Routing | ✅ Basic | ✅ Advanced (path, header, query) |
| Load Balancing | ✅ | ✅ |
| SSL Termination | ✅ | ✅ |
| Authentication | ⚠️ Limited | ✅ Full (JWT, OAuth, API keys) |
| Rate Limiting | ⚠️ Basic | ✅ Advanced (per-user, per-endpoint) |
| Response Caching | ✅ | ✅ |
| API Analytics | ❌ | ✅ |
| Service Discovery | ❌ | ✅ |
| Circuit Breaking | ❌ | ✅ |
Architecture Patterns
1. Single Gateway Pattern
Use case: Small to medium applications, single team
┌─────────┐
│ Clients │
└────┬────┘
│
┌────▼────────────┐
│ API Gateway │
└────┬────┬───┬───┘
│ │ │
┌──▼─┐ │ ┌▼──┐
│Auth│ │ │DB │
└──┬─┘ │ └───┘
│ │
┌──▼───▼──┐
│ Service │
└─────────┘
Pros:
- Simple architecture
- Single point of configuration
- Easy to reason about
Cons:
- Single point of failure
- Scaling limitations
- Team bottlenecks
2. Gateway Per Team Pattern
Use case: Large organizations, multiple autonomous teams
┌──────────────────┐
│ Clients │
└────┬─────────┬───┘
│ │
┌────▼─────┐ ┌▼──────────┐
│ Gateway │ │ Gateway │
│ (Team A) │ │ (Team B) │
└────┬─────┘ └┬──────────┘
│ │
┌────▼────┐ ┌─▼─────────┐
│Services │ │ Services │
│(Team A) │ │ (Team B) │
└─────────┘ └───────────┘
Pros:
- Team autonomy
- Independent deployments
- Isolated failures
Cons:
- Client complexity
- Duplicate configuration
- Cross-team coordination
3. Backend for Frontend (BFF) Pattern
Use case: Multiple client types (web, mobile, IoT)
┌────┐ ┌─────┐ ┌────┐
│Web │ │Mobile│ │IoT │
└─┬──┘ └──┬──┘ └─┬──┘
│ │ │
┌─▼──┐ ┌──▼──┐ ┌▼───┐
│Web │ │Mobile│ │IoT │
│BFF │ │ BFF │ │BFF │
└─┬──┘ └──┬──┘ └┬───┘
│ │ │
└───────┼──────┘
│
┌─────▼──────┐
│ Services │
└────────────┘
Pros:
- Optimized for each client
- Independent evolution
- Better performance
Cons:
- More complexity
- Code duplication
- Maintenance overhead
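A BFF's main job is aggregation: one client round-trip fans out to several backend services and returns a payload shaped for that client. A sketch with stubbed services (the service functions and field names are hypothetical):

```python
# Stub upstream services (hypothetical responses)
def user_service(user_id):
    return {"id": user_id, "name": "Ada", "avatar_url": "/img/ada.png"}

def order_service(user_id):
    return [{"order_id": 1, "total": 42.0}, {"order_id": 2, "total": 13.5}]

def mobile_bff_profile(user_id):
    """Mobile BFF: aggregate two calls, keep only the fields mobile renders."""
    user = user_service(user_id)
    orders = order_service(user_id)
    return {
        "name": user["name"],  # avatar dropped: mobile home screen doesn't show it
        "order_count": len(orders),
        "total_spent": sum(o["total"] for o in orders),
    }

print(mobile_bff_profile(7))  # {'name': 'Ada', 'order_count': 2, 'total_spent': 55.5}
```

The web BFF would make the same two calls but keep the avatar and full order list; each BFF evolves with its client instead of forcing one response shape on everyone.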
4. Micro Gateway Pattern
Use case: Kubernetes/service mesh environments
┌─────────────┐
│ Ingress │
└──────┬──────┘
│
┌──────▼──────┐
│ API Gateway │ (Global)
└──┬───┬───┬──┘
│ │ │
┌──▼┐ ┌▼─┐ ┌▼──┐
│MG1│ │MG2│ │MG3│ (Micro Gateways)
└─┬─┘ └┬──┘ └┬──┘
│ │ │
┌─▼─┐ ┌▼──┐ ┌▼──┐
│Svc│ │Svc│ │Svc│
└───┘ └───┘ └───┘
Pros:
- Decentralized control
- Service-level policies
- Better resilience
Cons:
- Complex orchestration
- Harder debugging
- More moving parts
Implementation: Kong Gateway
Installation (Docker)
# docker-compose.yml
version: '3.8'

services:
  kong-database:
    image: postgres:15
    environment:
      POSTGRES_DB: kong
      POSTGRES_USER: kong
      POSTGRES_PASSWORD: kong
    volumes:
      - kong_data:/var/lib/postgresql/data
    networks:
      - kong-net

  kong-migrations:
    image: kong:3.5
    command: kong migrations bootstrap
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-database
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: kong
    depends_on:
      - kong-database
    networks:
      - kong-net

  kong:
    image: kong:3.5
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-database
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: kong
      KONG_PROXY_ACCESS_LOG: /dev/stdout
      KONG_ADMIN_ACCESS_LOG: /dev/stdout
      KONG_PROXY_ERROR_LOG: /dev/stderr
      KONG_ADMIN_ERROR_LOG: /dev/stderr
      KONG_ADMIN_LISTEN: 0.0.0.0:8001
      KONG_ADMIN_GUI_URL: http://localhost:8002
    ports:
      - "8000:8000"   # Proxy
      - "8443:8443"   # Proxy SSL
      - "8001:8001"   # Admin API
      - "8002:8002"   # Admin GUI
    depends_on:
      - kong-database
      - kong-migrations
    networks:
      - kong-net

volumes:
  kong_data:

networks:
  kong-net:
Basic Configuration
# Start Kong
docker-compose up -d
# Create a service
curl -i -X POST http://localhost:8001/services \
  --data name=user-service \
  --data url=http://backend:3000

# Create a route
curl -i -X POST http://localhost:8001/services/user-service/routes \
  --data 'paths[]=/users' \
  --data name=user-route

# Test
curl http://localhost:8000/users
Authentication: JWT Plugin
# Enable JWT plugin
curl -X POST http://localhost:8001/services/user-service/plugins \
--data "name=jwt"
# Create consumer
curl -X POST http://localhost:8001/consumers \
  --data "username=john"

# Create JWT credential
curl -X POST http://localhost:8001/consumers/john/jwt \
  --data "key=john-api-key" \
  --data "secret=my-secret-key"
Generate JWT (Node.js):
const jwt = require('jsonwebtoken');

// Kong matches the credential by its `iss` claim (the credential `key`),
// so put it in the payload rather than the header.
const token = jwt.sign(
  {
    iss: 'john-api-key',
    sub: 'john',
    exp: Math.floor(Date.now() / 1000) + 3600,
  },
  'my-secret-key'
);
console.log(token);
Make authenticated request:
curl -H "Authorization: Bearer <JWT_TOKEN>" \
http://localhost:8000/users
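What the gateway verifies here is plain HMAC: an HS256 JWT is `base64url(header).base64url(payload)` signed with HMAC-SHA256. A standard-library sketch of both sides (illustrative only; use a maintained JWT library in production):

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(payload: dict, secret: str) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    sig = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(sig)}"

def verify_hs256(token: str, secret: str) -> bool:
    try:
        header, body, sig = token.split(".")
    except ValueError:
        return False  # malformed token
    expected = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(b64url(expected), sig):  # constant-time compare
        return False
    payload = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    return payload.get("exp", 0) > time.time()          # reject expired tokens

token = sign_hs256({"iss": "john-api-key", "exp": int(time.time()) + 3600}, "my-secret-key")
print(verify_hs256(token, "my-secret-key"))  # True
print(verify_hs256(token, "wrong-secret"))   # False
```

The constant-time comparison and the expiry check are the two details naive implementations most often get wrong.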
Rate Limiting
# Apply rate limiting (100 requests per minute)
curl -X POST http://localhost:8001/services/user-service/plugins \
--data "name=rate-limiting" \
--data "config.minute=100" \
--data "config.policy=local"
# Per-consumer rate limiting
curl -X POST http://localhost:8001/plugins \
  --data "name=rate-limiting" \
  --data "consumer.id=<CONSUMER_ID>" \
  --data "config.minute=50"
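Under the hood, a `config.policy=local` style limit reduces to a per-client counter or token bucket. A token-bucket sketch, with timestamps passed explicitly so the refill math is visible (illustrative only, not Kong's implementation):

```python
import time

class TokenBucket:
    """Allows a burst of `capacity` requests, then refills at `rate` tokens/sec."""
    def __init__(self, rate, capacity, now=None):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# 100 requests/minute with a burst of 5
bucket = TokenBucket(rate=100 / 60, capacity=5, now=0.0)
print([bucket.allow(now=0.0) for _ in range(6)])  # [True, True, True, True, True, False]
print(bucket.allow(now=1.0))                      # True — ~1.67 tokens refilled after 1s
```

The `local` policy keeps such state per gateway node; `redis` policies share it across nodes at the cost of an extra network hop.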
Response Caching
# Enable proxy caching
curl -X POST http://localhost:8001/services/user-service/plugins \
--data "name=proxy-cache" \
--data "config.strategy=memory" \
--data "config.content_type[]=application/json" \
--data "config.cache_ttl=300"
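The plugin above stores responses in memory and expires them after `config.cache_ttl` seconds. The core mechanism fits in a few lines (illustrative sketch, not the plugin's code):

```python
import time

class ResponseCache:
    """TTL cache keyed by (method, path): the essence of a memory cache strategy."""
    def __init__(self, ttl):
        self.ttl = ttl
        self.store = {}  # key -> (expires_at, response)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self.store.get(key)
        if entry and entry[0] > now:
            return entry[1]           # cache hit
        self.store.pop(key, None)     # expired or missing
        return None

    def put(self, key, response, now=None):
        now = time.monotonic() if now is None else now
        self.store[key] = (now + self.ttl, response)

cache = ResponseCache(ttl=300)  # 5 minutes, like config.cache_ttl=300
cache.put(("GET", "/users"), '{"users": []}', now=0)
print(cache.get(("GET", "/users"), now=10))   # '{"users": []}' — hit
print(cache.get(("GET", "/users"), now=301))  # None — expired
```

The production concerns a real plugin adds on top are cache-key design (query params, auth headers) and invalidation, covered later in this guide.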
Request Transformation
# Add static headers to upstream requests
curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=request-transformer" \
  --data "config.add.headers=X-Gateway:Kong"

# Note: a value like $(uuidgen) is expanded once by your shell, so every request
# would carry the same ID. For per-request IDs, use the correlation-id plugin:
curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=correlation-id" \
  --data "config.header_name=X-Request-ID"

# Remove sensitive headers from responses
curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=response-transformer" \
  --data "config.remove.headers=X-Internal-Token"
Implementation: AWS API Gateway
REST API Configuration (Terraform)
# api_gateway.tf
# (aws_lambda_function.users, aws_cognito_user_pool.main and
# aws_cloudwatch_log_group.api are assumed to be defined elsewhere)

resource "aws_api_gateway_rest_api" "main" {
  name        = "production-api"
  description = "Production API Gateway"

  endpoint_configuration {
    types = ["REGIONAL"]
  }
}

# Users resource
resource "aws_api_gateway_resource" "users" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  parent_id   = aws_api_gateway_rest_api.main.root_resource_id
  path_part   = "users"
}

# GET /users method
resource "aws_api_gateway_method" "get_users" {
  rest_api_id   = aws_api_gateway_rest_api.main.id
  resource_id   = aws_api_gateway_resource.users.id
  http_method   = "GET"
  authorization = "COGNITO_USER_POOLS"
  authorizer_id = aws_api_gateway_authorizer.cognito.id

  request_parameters = {
    "method.request.querystring.limit"  = false
    "method.request.querystring.offset" = false
  }
}

# Lambda integration
resource "aws_api_gateway_integration" "users_lambda" {
  rest_api_id             = aws_api_gateway_rest_api.main.id
  resource_id             = aws_api_gateway_resource.users.id
  http_method             = aws_api_gateway_method.get_users.http_method
  integration_http_method = "POST"
  type                    = "AWS_PROXY"
  uri                     = aws_lambda_function.users.invoke_arn
}

# Cognito authorizer
resource "aws_api_gateway_authorizer" "cognito" {
  name          = "cognito"
  rest_api_id   = aws_api_gateway_rest_api.main.id
  type          = "COGNITO_USER_POOLS"
  provider_arns = [aws_cognito_user_pool.main.arn]
}

# Usage plan with rate limiting
resource "aws_api_gateway_usage_plan" "standard" {
  name = "standard-plan"

  api_stages {
    api_id = aws_api_gateway_rest_api.main.id
    stage  = aws_api_gateway_stage.prod.stage_name
  }

  throttle_settings {
    rate_limit  = 100
    burst_limit = 200
  }

  quota_settings {
    limit  = 1000000
    period = "MONTH"
  }
}

# Deploy
resource "aws_api_gateway_deployment" "prod" {
  rest_api_id = aws_api_gateway_rest_api.main.id

  triggers = {
    redeployment = sha1(jsonencode([
      aws_api_gateway_method.get_users.id,
      aws_api_gateway_integration.users_lambda.id,
    ]))
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_api_gateway_stage" "prod" {
  deployment_id = aws_api_gateway_deployment.prod.id
  rest_api_id   = aws_api_gateway_rest_api.main.id
  stage_name    = "prod"

  # Enable caching
  cache_cluster_enabled = true
  cache_cluster_size    = "0.5" # 0.5 GB

  # Enable X-Ray tracing
  xray_tracing_enabled = true

  access_log_settings {
    destination_arn = aws_cloudwatch_log_group.api.arn
    format          = jsonencode({ requestId = "$context.requestId", status = "$context.status" })
  }
}
HTTP API (API Gateway v2)
# HTTP API (cheaper, simpler)
# (aws_lambda_function.users, aws_cognito_user_pool.main and
# aws_cloudwatch_log_group.api are assumed to be defined elsewhere)
resource "aws_apigatewayv2_api" "http" {
  name          = "http-api"
  protocol_type = "HTTP"

  cors_configuration {
    allow_origins = ["https://example.com"]
    allow_methods = ["GET", "POST", "PUT", "DELETE"]
    allow_headers = ["Authorization", "Content-Type"]
    max_age       = 3600
  }
}

resource "aws_apigatewayv2_integration" "lambda" {
  api_id                 = aws_apigatewayv2_api.http.id
  integration_type       = "AWS_PROXY"
  integration_uri        = aws_lambda_function.users.invoke_arn
  payload_format_version = "2.0"
}

resource "aws_apigatewayv2_route" "get_users" {
  api_id             = aws_apigatewayv2_api.http.id
  route_key          = "GET /users"
  target             = "integrations/${aws_apigatewayv2_integration.lambda.id}"
  authorization_type = "JWT"
  authorizer_id      = aws_apigatewayv2_authorizer.jwt.id
}

# JWT authorizer (audience/issuer values are illustrative)
resource "aws_apigatewayv2_authorizer" "jwt" {
  api_id           = aws_apigatewayv2_api.http.id
  authorizer_type  = "JWT"
  identity_sources = ["$request.header.Authorization"]
  name             = "jwt-authorizer"

  jwt_configuration {
    audience = ["api.example.com"]
    issuer   = "https://cognito-idp.us-east-1.amazonaws.com/${aws_cognito_user_pool.main.id}"
  }
}

resource "aws_apigatewayv2_stage" "prod" {
  api_id      = aws_apigatewayv2_api.http.id
  name        = "prod"
  auto_deploy = true

  default_route_settings {
    throttling_rate_limit  = 100
    throttling_burst_limit = 200
  }

  access_log_settings {
    destination_arn = aws_cloudwatch_log_group.api.arn
    format          = jsonencode({ requestId = "$context.requestId", status = "$context.status" })
  }
}
Implementation: Azure API Management
Bicep Deployment
// apim.bicep
param location string = resourceGroup().location
param apimName string = 'prod-apim'

resource apim 'Microsoft.ApiManagement/service@2023-05-01-preview' = {
  name: apimName
  location: location
  sku: {
    name: 'Developer'
    capacity: 1
  }
  properties: {
    publisherEmail: 'admin@example.com'
    publisherName: 'Example Corp'
  }
}

// Backend service
resource backend 'Microsoft.ApiManagement/service/backends@2023-05-01-preview' = {
  parent: apim
  name: 'user-service'
  properties: {
    url: 'https://backend.example.com'
    protocol: 'http'
    tls: {
      validateCertificateChain: true
      validateCertificateName: true
    }
  }
}

// API definition
resource api 'Microsoft.ApiManagement/service/apis@2023-05-01-preview' = {
  parent: apim
  name: 'users-api'
  properties: {
    displayName: 'Users API'
    path: 'users'
    protocols: ['https']
    subscriptionRequired: true
    serviceUrl: 'https://backend.example.com'
  }
}

// GET /users operation
resource getUsers 'Microsoft.ApiManagement/service/apis/operations@2023-05-01-preview' = {
  parent: api
  name: 'get-users'
  properties: {
    displayName: 'Get Users'
    method: 'GET'
    urlTemplate: '/'
    responses: [
      {
        statusCode: 200
        description: 'Success'
      }
    ]
  }
}

// Policy: JWT validation + rate limiting
resource policy 'Microsoft.ApiManagement/service/apis/operations/policies@2023-05-01-preview' = {
  parent: getUsers
  name: 'policy'
  properties: {
    value: '''
      <policies>
        <inbound>
          <base />
          <!-- JWT validation -->
          <validate-jwt header-name="Authorization" failed-validation-httpcode="401">
            <openid-config url="https://login.microsoftonline.com/{tenant-id}/v2.0/.well-known/openid-configuration" />
            <required-claims>
              <claim name="aud">
                <value>api.example.com</value>
              </claim>
            </required-claims>
          </validate-jwt>
          <!-- Rate limiting: 100 calls per minute -->
          <rate-limit calls="100" renewal-period="60" />
          <!-- Caching -->
          <cache-lookup vary-by-developer="false" vary-by-developer-groups="false" downstream-caching-type="none">
            <vary-by-query-parameter>limit</vary-by-query-parameter>
            <vary-by-query-parameter>offset</vary-by-query-parameter>
          </cache-lookup>
          <!-- Add correlation ID -->
          <set-header name="X-Correlation-ID" exists-action="override">
            <value>@(Guid.NewGuid().ToString())</value>
          </set-header>
        </inbound>
        <backend>
          <base />
        </backend>
        <outbound>
          <base />
          <!-- Cache response for 5 minutes -->
          <cache-store duration="300" />
          <!-- Remove internal headers -->
          <set-header name="X-Internal-Token" exists-action="delete" />
        </outbound>
        <on-error>
          <base />
        </on-error>
      </policies>
    '''
    format: 'xml'
  }
}

// Product (usage tier)
resource product 'Microsoft.ApiManagement/service/products@2023-05-01-preview' = {
  parent: apim
  name: 'standard'
  properties: {
    displayName: 'Standard Tier'
    description: 'Standard API access'
    subscriptionRequired: true
    approvalRequired: false
    state: 'published'
    subscriptionsLimit: 1000
  }
}

// Link API to product
resource productApi 'Microsoft.ApiManagement/service/products/apis@2023-05-01-preview' = {
  parent: product
  name: api.name
}
Implementation: NGINX as API Gateway
Configuration
# /etc/nginx/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 4096;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format json_combined escape=json '{'
        '"time_local":"$time_local",'
        '"remote_addr":"$remote_addr",'
        '"request":"$request",'
        '"status":$status,'
        '"body_bytes_sent":$body_bytes_sent,'
        '"request_time":$request_time,'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_addr":"$upstream_addr"'
    '}';

    access_log /var/log/nginx/access.log json_combined;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    # Rate limiting zones
    limit_req_zone $binary_remote_addr zone=global:10m rate=100r/m;
    limit_req_zone $http_authorization zone=per_user:10m rate=50r/m;

    # Caching
    proxy_cache_path /var/cache/nginx/api
        levels=1:2
        keys_zone=api_cache:10m
        max_size=1g
        inactive=60m
        use_temp_path=off;

    # Upstream services
    upstream user_service {
        least_conn;
        server backend1:3000 max_fails=3 fail_timeout=30s;
        server backend2:3000 max_fails=3 fail_timeout=30s;
        server backend3:3000 max_fails=3 fail_timeout=30s backup;
        keepalive 32;
    }

    # JWT validation (requires the Lua/OpenResty modules and lua-resty-jwt)
    lua_shared_dict jwks 1m;
    lua_package_path "/etc/nginx/lua/?.lua;;";

    server {
        listen 80;
        server_name api.example.com;

        # Redirect HTTP to HTTPS
        return 301 https://$server_name$request_uri;
    }

    server {
        listen 443 ssl http2;
        server_name api.example.com;

        ssl_certificate /etc/nginx/ssl/cert.pem;
        ssl_certificate_key /etc/nginx/ssl/key.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!aNULL:!MD5;
        ssl_prefer_server_ciphers on;

        # Security headers
        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
        add_header X-Content-Type-Options "nosniff" always;
        add_header X-Frame-Options "DENY" always;
        add_header X-XSS-Protection "1; mode=block" always;

        # CORS
        add_header Access-Control-Allow-Origin "https://example.com" always;
        add_header Access-Control-Allow-Methods "GET, POST, PUT, DELETE, OPTIONS" always;
        add_header Access-Control-Allow-Headers "Authorization, Content-Type" always;
        add_header Access-Control-Max-Age "3600" always;

        if ($request_method = 'OPTIONS') {
            return 204;
        }

        # Health check endpoint
        location /health {
            access_log off;
            add_header Content-Type text/plain;
            return 200 "healthy\n";
        }

        # Users API
        location /users {
            # Rate limiting
            limit_req zone=global burst=20 nodelay;
            limit_req zone=per_user burst=10 nodelay;

            # JWT validation (Lua)
            access_by_lua_block {
                local jwt = require "resty.jwt"
                local jwt_token = ngx.var.http_authorization
                if not jwt_token then
                    ngx.status = 401
                    ngx.say('{"error":"Missing Authorization header"}')
                    ngx.exit(401)
                end
                jwt_token = jwt_token:gsub("Bearer ", "")
                local jwt_obj = jwt:verify("your-secret-key", jwt_token)
                if not jwt_obj.verified then
                    ngx.status = 401
                    ngx.say('{"error":"Invalid token"}')
                    ngx.exit(401)
                end
            }

            # Caching
            proxy_cache api_cache;
            proxy_cache_key "$request_uri$http_authorization";
            proxy_cache_valid 200 5m;
            proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
            proxy_cache_background_update on;
            proxy_cache_lock on;

            # Caveat: declaring add_header inside a location disables inheritance
            # of the server-level add_header directives above, so the security and
            # CORS headers must be re-declared here if /users responses need them.
            add_header X-Cache-Status $upstream_cache_status;

            # Proxy settings
            proxy_pass http://user_service;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_set_header X-Request-ID $request_id;

            # Timeouts
            proxy_connect_timeout 5s;
            proxy_send_timeout 10s;
            proxy_read_timeout 10s;

            # Buffering
            proxy_buffering on;
            proxy_buffer_size 4k;
            proxy_buffers 8 4k;
        }
    }
}
Security Best Practices
1. Authentication Strategies
API Keys (Simplest)
✅ Use for: Internal services, low-security public APIs
❌ Avoid for: User authentication, sensitive data
JWT (Recommended)
✅ Use for: User authentication, microservices
❌ Avoid for: Sessions needing instant revocation — a JWT stays valid until it expires, so keep tokens short-lived
OAuth 2.0 (Most Secure)
✅ Use for: Third-party integrations, delegated access
❌ Avoid for: Simple internal APIs (overkill)
mTLS (Maximum Security)
✅ Use for: Service-to-service, zero-trust networks
❌ Avoid for: Public-facing APIs (complexity)
2. Input Validation
# NGINX: block obviously malicious patterns (a crude first line of defense —
# a dedicated WAF such as ModSecurity or AWS WAF is far more robust)
location /api {
    # Block common SQL injection keywords
    # (will also false-positive on legitimate URIs containing these words)
    if ($request_uri ~* "(union|select|insert|update|delete|drop|exec|script)") {
        return 403;
    }

    # Block path traversal
    if ($request_uri ~* "\.\./") {
        return 403;
    }

    # Limit request body size
    client_max_body_size 1m;

    proxy_pass http://backend;
}
3. DDoS Protection
Rate Limiting Tiers:
Global: 1000 req/s (all traffic)
Per-IP: 100 req/min
Per-User: 50 req/min
Per-Endpoint: Varies (e.g., /login → 5 req/min)
Circuit Breaking (Kong): the open-source distribution does not bundle a circuit-breaker plugin; the usual approach is passive health checks on an upstream, which eject failing targets, or a third-party plugin. Assuming an upstream object named user-service-upstream already exists:
curl -X PATCH http://localhost:8001/upstreams/user-service-upstream \
  --data "healthchecks.passive.unhealthy.http_failures=5" \
  --data "healthchecks.passive.unhealthy.timeouts=3"
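Whatever component enforces it, a circuit breaker is a small state machine: closed → open after N consecutive failures, open → half-open after a timeout, half-open → closed after M successes. A sketch with the thresholds used above (5 failures, 3 successes, 30 s):

```python
class CircuitBreaker:
    """closed -> open after `failure_threshold` consecutive failures;
    open -> half-open after `timeout` seconds; half-open -> closed
    after `success_threshold` successes (any failure reopens it)."""
    def __init__(self, failure_threshold=5, success_threshold=3, timeout=30.0):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.timeout = timeout
        self.state = "closed"
        self.failures = 0
        self.successes = 0
        self.opened_at = 0.0

    def allow(self, now):
        if self.state == "open" and now - self.opened_at >= self.timeout:
            self.state, self.successes = "half-open", 0  # let a trial request through
        return self.state != "open"

    def record(self, success, now):
        if success:
            self.failures = 0
            if self.state == "half-open":
                self.successes += 1
                if self.successes >= self.success_threshold:
                    self.state = "closed"
        else:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state, self.opened_at = "open", now
                self.failures = 0

cb = CircuitBreaker()
for _ in range(5):
    cb.record(success=False, now=0)
print(cb.allow(now=1))   # False — circuit open, requests short-circuited
print(cb.allow(now=31))  # True — half-open, a trial request goes through
```

While open, the gateway fails fast with a 503 instead of queueing requests against a dead upstream, which is what prevents cascading failures.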
4. Secrets Management
❌ DON'T:
# Bad: hardcoded secrets
environment:
  JWT_SECRET: "my-secret-123"
  DB_PASSWORD: "password"
✅ DO:
# Good: resolve from a secrets manager at deploy time
environment:
  JWT_SECRET: ${AWS_SECRETS_MANAGER:prod/jwt-secret}
  DB_PASSWORD: ${VAULT:database/password}
Performance Optimization
1. Connection Pooling
Kong:
# /etc/kong/kong.conf
upstream_keepalive_pool_size = 100
upstream_keepalive_max_requests = 1000
upstream_keepalive_idle_timeout = 60
NGINX:
upstream backend {
    server backend:3000;
    keepalive 32;  # Connection pool
}

location / {
    proxy_http_version 1.1;
    proxy_set_header Connection "";  # Enable keepalive
    proxy_pass http://backend;
}
2. Response Compression
gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_types
    text/plain
    text/css
    text/xml
    text/javascript
    application/json
    application/javascript
    application/rss+xml;
3. Caching Strategies
Cache-Control Headers:
Public endpoints: Cache-Control: public, max-age=300
User-specific: Cache-Control: private, max-age=60
Dynamic data: Cache-Control: no-cache
Sensitive data: Cache-Control: no-store
Cache Invalidation (Kong):
# Invalidate cache for a specific entry
curl -X DELETE http://localhost:8001/proxy-cache/{cache-key}

# Purge all cached entities
curl -X DELETE http://localhost:8001/proxy-cache
4. Load Balancing Algorithms
| Algorithm | Use Case |
|---|---|
| Round Robin | Equal server capacity |
| Least Connections | Varying request complexity |
| IP Hash | Session affinity needed |
| Weighted | Heterogeneous servers |
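The first two strategies in the table are easy to sketch (illustrative, not any particular proxy's implementation):

```python
import itertools

def round_robin(servers):
    """Cycle through servers in order — assumes roughly equal capacity."""
    it = itertools.cycle(servers)
    return lambda: next(it)

def least_connections(active):
    """Pick the server with the fewest in-flight requests —
    better when request cost varies widely."""
    return min(active, key=active.get)

rr = round_robin(["s1", "s2", "s3"])
print([rr() for _ in range(4)])                        # ['s1', 's2', 's3', 's1']
print(least_connections({"s1": 7, "s2": 2, "s3": 5}))  # 's2'
```

IP hash replaces the cycle with `hash(client_ip) % len(servers)`, and weighted round robin repeats each server in the cycle proportionally to its weight.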
Monitoring & Observability
Metrics to Track
Gateway Metrics:
- Request rate (req/s)
- Error rate (4xx, 5xx %)
- Latency (p50, p95, p99)
- Cache hit rate
Upstream Metrics:
- Response time
- Failure rate
- Circuit breaker state
Resource Metrics:
- CPU usage
- Memory usage
- Connection pool exhaustion
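The latency percentiles above are order statistics over a window of samples. A nearest-rank sketch (monitoring systems typically estimate quantiles from histogram buckets instead, as `histogram_quantile` does):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest value >= p% of the samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]

latencies_ms = [12, 15, 11, 240, 13, 14, 16, 18, 13, 900]  # one slow outlier
print(percentile(latencies_ms, 50))  # 14  — typical request
print(percentile(latencies_ms, 99))  # 900 — tail dominated by the outlier
```

This is why the list asks for p95/p99 and not just an average: the mean of those samples (~125 ms) describes no actual request, while the tail percentiles expose the outliers users actually feel.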
Prometheus Integration (Kong)
# kong.yml
plugins:
  - name: prometheus
    config:
      per_consumer: true
Query Examples:
# Request rate by service
sum by (service) (rate(kong_http_requests_total[5m]))

# Error rate (share of 5xx responses)
sum(rate(kong_http_requests_total{code=~"5.."}[5m])) / sum(rate(kong_http_requests_total[5m]))

# Latency (95th percentile)
histogram_quantile(0.95, rate(kong_latency_bucket[5m]))

# Cache hit rate
sum(rate(kong_bandwidth{type="cache-hit"}[10m])) / sum(rate(kong_bandwidth[10m]))
Distributed Tracing
OpenTelemetry Integration:
# AWS API Gateway (Terraform)
resource "aws_api_gateway_stage" "prod" {
  xray_tracing_enabled = true
}

# Kong (declarative config)
plugins:
  - name: opentelemetry
    config:
      endpoint: "http://jaeger:4318/v1/traces"
      resource_attributes:
        service.name: "api-gateway"
Alerting Rules
# Prometheus alerts
groups:
  - name: api_gateway
    rules:
      - alert: HighErrorRate
        expr: rate(kong_http_requests_total{code=~"5.."}[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"

      - alert: HighLatency
        expr: histogram_quantile(0.95, rate(kong_latency_bucket[5m])) > 1000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High latency (p95 > 1s)"

      - alert: LowCacheHitRate
        expr: rate(kong_bandwidth{type="cache-hit"}[10m]) / rate(kong_bandwidth[10m]) < 0.5
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Cache hit rate below 50%"
Production Checklist
Security
- HTTPS/TLS 1.3 enabled
- Strong authentication (JWT/OAuth/mTLS)
- Rate limiting configured
- Input validation active
- CORS properly configured
- Security headers added
- Secrets in vault (not config files)
- API keys rotated regularly
Performance
- Connection pooling enabled
- Response caching configured
- Compression enabled
- Load balancing configured
- Timeouts set appropriately
- Circuit breakers implemented
- Health checks configured
Observability
- Access logs enabled
- Metrics collection configured
- Distributed tracing active
- Dashboards created
- Alerts configured
- Log aggregation setup
High Availability
- Multiple gateway instances
- Database/state replication
- Health checks implemented
- Graceful shutdown configured
- Auto-scaling enabled
- Disaster recovery plan
Operations
- CI/CD pipeline configured
- Blue-green deployment ready
- Rollback procedure documented
- Runbooks created
- On-call rotation setup
Cost Optimization
AWS API Gateway Pricing (2025)
REST API:
- $3.50 per million requests
- $0.09/GB data transfer out
- Cache: $0.020/hour per GB
HTTP API (70% cheaper):
- $1.00 per million requests
- No caching support
- Use for: Simple proxying
Self-Hosted (Kong/NGINX)
Costs:
- EC2 instances: ~$150/month (t3.large x2)
- Load balancer: ~$20/month
- Database: ~$50/month (RDS t3.small)
Break-even point: ~50M requests/month
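That break-even figure checks out arithmetically once data transfer is included. Assuming the costs listed above and an average response of about 10 KB (the response size is an assumption, not from the pricing table):

```python
# Assumptions: REST API pricing above, average response ~10 KB (hypothetical)
self_hosted_monthly = 150 + 20 + 50                  # EC2 + load balancer + RDS = $220

request_cost_per_million = 3.50                      # REST API, per 1M requests
gb_per_million = 10_000 * 1_000_000 / 1e9            # 10 KB x 1M requests = 10 GB
transfer_cost_per_million = 0.09 * gb_per_million    # = $0.90

break_even_millions = self_hosted_monthly / (request_cost_per_million + transfer_cost_per_million)
print(round(break_even_millions))  # 50 -> ~50M requests/month
```

Below that volume the managed gateway is cheaper and carries no operational burden; above it, self-hosting wins on unit cost but you take on patching, scaling, and on-call for the gateway itself.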
Conclusion
API gateways are critical infrastructure for modern applications. They centralize cross-cutting concerns, improve security, enable observability, and reduce backend complexity.
Key takeaways:
- Choose the right pattern - Single gateway for simplicity, BFF for client optimization, micro-gateways for scale
- Security is paramount - Use strong authentication, validate inputs, implement rate limiting
- Cache aggressively - 5-minute caching can reduce load by 80%+
- Monitor everything - Track gateway, upstream, and resource metrics
- Plan for failure - Circuit breakers, health checks, graceful degradation
- Start simple - Add complexity only when needed
Whether you choose a managed solution (AWS API Gateway, Azure APIM) or self-hosted (Kong, NGINX), the principles remain the same: security, performance, and reliability.
Additional Resources
- Kong Gateway: https://docs.konghq.com/
- AWS API Gateway: https://docs.aws.amazon.com/apigateway/
- Azure API Management: https://docs.microsoft.com/azure/api-management/
- NGINX: https://nginx.org/en/docs/
- OpenAPI Specification: https://spec.openapis.org/
- OAuth 2.0: https://oauth.net/2/
- JWT Best Practices: https://datatracker.ietf.org/doc/html/rfc8725
Written by StaticBlock Editorial
StaticBlock Editorial is a technical writer and software engineer specializing in web development, performance optimization, and developer tooling.