API Gateway Architecture: Complete Implementation Guide for Production Systems

Master API gateway architecture with this comprehensive production guide. Learn routing patterns, authentication strategies, rate limiting, caching, load balancing, and security best practices. Includes implementation examples for Kong, AWS API Gateway, Azure API Management, and NGINX with real-world production configurations, monitoring strategies, and performance optimization techniques.

StaticBlock Editorial
20 min read

Introduction

Your microservices architecture is growing. You now have 15 services, each implementing its own authentication, rate limiting, and logging. A security audit reveals three services using outdated JWT libraries. Performance testing shows duplicate calls because caching isn't standardized. Your ops team is drowning in configuration drift across services.

This is exactly why API gateways exist.

An API gateway acts as the single entry point for all client requests, centralizing cross-cutting concerns like authentication, rate limiting, caching, and routing. Instead of each service reimplementing these responsibilities, the gateway handles them once—consistently, efficiently, and securely.

The impact is substantial: companies adopting gateway patterns commonly report results like a 40% reduction in duplicated backend code, order-of-magnitude faster incident response, and 60% faster time-to-market for new APIs.

This comprehensive guide covers API gateway architecture from fundamentals to production deployment, with real-world examples using Kong, AWS API Gateway, Azure API Management, and NGINX.

What is an API Gateway?

Core Responsibilities

An API gateway sits between clients and backend services, handling:

1. Request Routing

Client → API Gateway → Service Discovery → Backend Service

2. Authentication & Authorization

  • JWT validation
  • OAuth 2.0 flows
  • API key management
  • mTLS verification

3. Rate Limiting & Throttling

  • Per-client limits
  • Global quotas
  • Burst handling
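Burst handling is typically implemented with a token bucket: tokens refill at the sustained rate, and the bucket's capacity absorbs short bursts. A minimal Node sketch (the class and the numbers are illustrative, not any gateway's actual implementation):

```javascript
// Token bucket: `capacity` absorbs bursts, `refillPerSecond` caps sustained rate.
class TokenBucket {
  constructor(capacity, refillPerSecond) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillPerSecond = refillPerSecond;
    this.lastRefill = Date.now();
  }
  allow(now = Date.now()) {
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSecond);
    this.lastRefill = now;
    if (this.tokens >= 1) { this.tokens -= 1; return true; }
    return false;
  }
}

// One bucket per client key (e.g. API key or IP): burst of 20, 100 req/min sustained.
const buckets = new Map();
function allowRequest(clientKey) {
  if (!buckets.has(clientKey)) buckets.set(clientKey, new TokenBucket(20, 100 / 60));
  return buckets.get(clientKey).allow();
}
```

A bucket of capacity 20 lets a quiet client fire 20 requests at once, but a client hammering continuously is still held to the refill rate.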

4. Response Transformation

  • Protocol translation (REST ↔ gRPC)
  • Response aggregation
  • Data masking

5. Caching

  • Response caching
  • Cache invalidation
  • Edge caching

6. Monitoring & Logging

  • Request/response logging
  • Metrics collection
  • Distributed tracing
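These concerns compose as a pipeline in front of the upstream call: each stage either passes the request along or short-circuits with an error response. A toy sketch of the idea (the stage names and request shape are hypothetical; real gateways implement each stage as a plugin):

```javascript
// Run a request through gateway stages; a stage returns a response to
// short-circuit (e.g. 401, 429) or null to pass the request on.
function runPipeline(stages, request) {
  for (const stage of stages) {
    const result = stage(request);
    if (result) return result;
  }
  return { status: 200, body: `proxied ${request.path} to upstream` };
}

// Illustrative stages, not a real gateway API:
const authenticate = (req) =>
  req.headers.authorization ? null : { status: 401, body: 'missing credentials' };
const rateLimit = (req) =>
  req.clientRequestCount > 100 ? { status: 429, body: 'rate limit exceeded' } : null;

const res = runPipeline([authenticate, rateLimit], {
  path: '/users',
  headers: { authorization: 'Bearer token' },
  clientRequestCount: 3,
});
// res.status is 200 here; drop the header and the pipeline returns 401 instead.
```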

Gateway vs Reverse Proxy

Feature             Reverse Proxy (NGINX)    API Gateway
Routing             ✅ Basic                  ✅ Advanced (path, header, query)
Load Balancing      ✅                        ✅
SSL Termination     ✅                        ✅
Authentication      ⚠️ Limited                ✅ Full (JWT, OAuth, API keys)
Rate Limiting       ⚠️ Basic                  ✅ Advanced (per-user, per-endpoint)
Response Caching    ✅ Basic                  ✅ Advanced (per-route TTLs)
API Analytics       ❌                        ✅
Service Discovery   ❌                        ✅
Circuit Breaking    ❌                        ✅

Architecture Patterns

1. Single Gateway Pattern

Use case: Small to medium applications, single team

┌─────────┐
│ Clients │
└────┬────┘
     │
┌────▼────────────┐
│   API Gateway   │
└────┬────┬───┬───┘
     │    │   │
  ┌──▼─┐ │  ┌▼──┐
  │Auth│ │  │DB │
  └──┬─┘ │  └───┘
     │   │
  ┌──▼───▼──┐
  │ Service │
  └─────────┘

Pros:

  • Simple architecture
  • Single point of configuration
  • Easy to reason about

Cons:

  • Single point of failure
  • Scaling limitations
  • Team bottlenecks

2. Gateway Per Team Pattern

Use case: Large organizations, multiple autonomous teams

┌──────────────────┐
│     Clients      │
└────┬─────────┬───┘
     │         │
┌────▼─────┐ ┌▼──────────┐
│ Gateway  │ │ Gateway   │
│ (Team A) │ │ (Team B)  │
└────┬─────┘ └┬──────────┘
     │         │
┌────▼────┐ ┌─▼─────────┐
│Services │ │ Services  │
│(Team A) │ │ (Team B)  │
└─────────┘ └───────────┘

Pros:

  • Team autonomy
  • Independent deployments
  • Isolated failures

Cons:

  • Client complexity
  • Duplicate configuration
  • Cross-team coordination

3. Backend for Frontend (BFF) Pattern

Use case: Multiple client types (web, mobile, IoT)

┌────┐ ┌─────┐ ┌────┐
│Web │ │Mobile│ │IoT │
└─┬──┘ └──┬──┘ └─┬──┘
  │       │      │
┌─▼──┐ ┌──▼──┐ ┌▼───┐
│Web │ │Mobile│ │IoT │
│BFF │ │ BFF │ │BFF │
└─┬──┘ └──┬──┘ └┬───┘
  │       │      │
  └───────┼──────┘
          │
    ┌─────▼──────┐
    │  Services  │
    └────────────┘

Pros:

  • Optimized for each client
  • Independent evolution
  • Better performance

Cons:

  • More complexity
  • Code duplication
  • Maintenance overhead
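The heart of a BFF is aggregation and shaping: one client call fans out to several services and returns only the fields that client needs. A sketch of a hypothetical mobile BFF handler (the service URLs and field names are invented for illustration):

```javascript
// Mobile BFF: fan out to two backend services, return a trimmed payload.
// `fetchJson` stands in for real service calls; the URLs are hypothetical.
async function mobileUserProfile(userId, fetchJson) {
  const [user, orders] = await Promise.all([
    fetchJson(`http://user-service/users/${userId}`),
    fetchJson(`http://order-service/orders?user=${userId}&limit=3`),
  ]);
  // Mobile clients get a small, flat shape; a web BFF might return much more.
  return {
    name: user.name,
    avatarUrl: user.avatarUrl,
    recentOrders: orders.map((o) => ({ id: o.id, total: o.total })),
  };
}
```

Because each BFF owns its response shape, the mobile payload can shrink without coordinating with the web team, which is exactly the autonomy this pattern buys.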

4. Micro Gateway Pattern

Use case: Kubernetes/service mesh environments

┌─────────────┐
│   Ingress   │
└──────┬──────┘
       │
┌──────▼──────┐
│ API Gateway │ (Global)
└──┬───┬───┬──┘
   │   │   │
┌──▼┐ ┌▼─┐ ┌▼──┐
│MG1│ │MG2│ │MG3│ (Micro Gateways)
└─┬─┘ └┬──┘ └┬──┘
  │    │     │
┌─▼─┐ ┌▼──┐ ┌▼──┐
│Svc│ │Svc│ │Svc│
└───┘ └───┘ └───┘

Pros:

  • Decentralized control
  • Service-level policies
  • Better resilience

Cons:

  • Complex orchestration
  • Harder debugging
  • More moving parts

Implementation: Kong Gateway

Installation (Docker)

# docker-compose.yml
version: '3.8'

services:
  kong-database:
    image: postgres:15
    environment:
      POSTGRES_DB: kong
      POSTGRES_USER: kong
      POSTGRES_PASSWORD: kong
    volumes:
      - kong_data:/var/lib/postgresql/data
    networks:
      - kong-net

  kong-migrations:
    image: kong:3.5
    command: kong migrations bootstrap
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-database
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: kong
    depends_on:
      - kong-database
    networks:
      - kong-net

  kong:
    image: kong:3.5
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-database
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: kong
      KONG_PROXY_ACCESS_LOG: /dev/stdout
      KONG_ADMIN_ACCESS_LOG: /dev/stdout
      KONG_PROXY_ERROR_LOG: /dev/stderr
      KONG_ADMIN_ERROR_LOG: /dev/stderr
      KONG_ADMIN_LISTEN: 0.0.0.0:8001
      KONG_ADMIN_GUI_URL: http://localhost:8002
    ports:
      - "8000:8000"   # Proxy
      - "8443:8443"   # Proxy SSL
      - "8001:8001"   # Admin API
      - "8002:8002"   # Admin GUI
    depends_on:
      - kong-database
      - kong-migrations
    networks:
      - kong-net

volumes:
  kong_data:

networks:
  kong-net:

Basic Configuration

# Start Kong
docker-compose up -d

# Create a service
curl -i -X POST http://localhost:8001/services \
  --data name=user-service \
  --data url=http://backend:3000

# Create a route
curl -i -X POST http://localhost:8001/services/user-service/routes \
  --data 'paths[]=/users' \
  --data name=user-route

# Test
curl http://localhost:8000/users

Authentication: JWT Plugin

# Enable JWT plugin
curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=jwt"

# Create a consumer
curl -X POST http://localhost:8001/consumers \
  --data "username=john"

# Create a JWT credential
curl -X POST http://localhost:8001/consumers/john/jwt \
  --data "key=john-api-key" \
  --data "secret=my-secret-key"

Generate JWT (Node.js):

const jwt = require('jsonwebtoken');

// Kong's JWT plugin matches the credential by the `iss` claim in the payload
// (the credential's key), so set it there rather than in the token header.
const token = jwt.sign(
  {
    iss: 'john-api-key',
    sub: 'john',
    exp: Math.floor(Date.now() / 1000) + 3600,
  },
  'my-secret-key'
);

console.log(token);

Make authenticated request:

curl -H "Authorization: Bearer <JWT_TOKEN>" \
  http://localhost:8000/users

Rate Limiting

# Apply rate limiting (100 requests per minute)
curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=rate-limiting" \
  --data "config.minute=100" \
  --data "config.policy=local"

# Per-consumer rate limiting
curl -X POST http://localhost:8001/plugins \
  --data "name=rate-limiting" \
  --data "consumer.id=<CONSUMER_ID>" \
  --data "config.minute=50"

Response Caching

# Enable proxy caching
curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=proxy-cache" \
  --data "config.strategy=memory" \
  --data "config.content_type[]=application/json" \
  --data "config.cache_ttl=300"

Request Transformation

# Add headers to upstream requests
# Note: $(uuidgen) is expanded once by the shell when the plugin is created,
# so every request would carry the same ID; for a true per-request ID use
# Kong's bundled correlation-id plugin instead.
curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=request-transformer" \
  --data "config.add.headers=X-Request-ID:$(uuidgen)" \
  --data "config.add.headers=X-Gateway:Kong"

# Remove sensitive headers from responses
curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=response-transformer" \
  --data "config.remove.headers=X-Internal-Token"

Implementation: AWS API Gateway

REST API Configuration (Terraform)

# api_gateway.tf
resource "aws_api_gateway_rest_api" "main" {
  name        = "production-api"
  description = "Production API Gateway"

  endpoint_configuration { ... }
}

# Users resource
resource "aws_api_gateway_resource" "users" { ... }

# GET /users method
resource "aws_api_gateway_method" "get_users" {
  rest_api_id   = aws_api_gateway_rest_api.main.id
  resource_id   = aws_api_gateway_resource.users.id
  http_method   = "GET"
  authorization = "COGNITO_USER_POOLS"
  authorizer_id = aws_api_gateway_authorizer.cognito.id

  request_parameters = {
    "method.request.querystring.limit"  = false
    "method.request.querystring.offset" = false
  }
}

# Lambda integration
resource "aws_api_gateway_integration" "users_lambda" { ... }

# Cognito authorizer
resource "aws_api_gateway_authorizer" "cognito" { ... }

# Usage plan with rate limiting
resource "aws_api_gateway_usage_plan" "standard" {
  name = "standard-plan"

  api_stages { ... }

  throttle_settings { ... }

  quota_settings { ... }
}

# Deploy
resource "aws_api_gateway_deployment" "prod" {
  rest_api_id = aws_api_gateway_rest_api.main.id

  triggers = { ... }

  lifecycle { ... }
}

resource "aws_api_gateway_stage" "prod" {
  deployment_id = aws_api_gateway_deployment.prod.id
  rest_api_id   = aws_api_gateway_rest_api.main.id
  stage_name    = "prod"

  # Enable caching
  cache_cluster_enabled = true
  cache_cluster_size    = "0.5" # 0.5GB

  # Enable X-Ray tracing
  xray_tracing_enabled = true

  access_log_settings { ... }
}

HTTP API (API Gateway v2)

# HTTP API (cheaper, simpler)
resource "aws_apigatewayv2_api" "http" {
  name          = "http-api"
  protocol_type = "HTTP"

  cors_configuration {
    allow_origins = ["https://example.com"]
    allow_methods = ["GET", "POST", "PUT", "DELETE"]
    allow_headers = ["Authorization", "Content-Type"]
    max_age       = 3600
  }
}

resource "aws_apigatewayv2_integration" "lambda" { ... }

resource "aws_apigatewayv2_route" "get_users" {
  api_id    = aws_apigatewayv2_api.http.id
  route_key = "GET /users"
  target    = "integrations/${aws_apigatewayv2_integration.lambda.id}"

  authorization_type = "JWT"
  authorizer_id      = aws_apigatewayv2_authorizer.jwt.id
}

# JWT authorizer
resource "aws_apigatewayv2_authorizer" "jwt" {
  api_id           = aws_apigatewayv2_api.http.id
  authorizer_type  = "JWT"
  identity_sources = ["$request.header.Authorization"]
  name             = "jwt-authorizer"

  jwt_configuration { ... }
}

resource "aws_apigatewayv2_stage" "prod" {
  api_id      = aws_apigatewayv2_api.http.id
  name        = "prod"
  auto_deploy = true

  default_route_settings { ... }

  access_log_settings { ... }
}

Implementation: Azure API Management

Bicep Deployment

// apim.bicep
param location string = resourceGroup().location
param apimName string = 'prod-apim'

resource apim 'Microsoft.ApiManagement/service@2023-05-01-preview' = {
  name: apimName
  location: location
  sku: {
    name: 'Developer'
    capacity: 1
  }
  properties: {
    publisherEmail: 'admin@example.com'
    publisherName: 'Example Corp'
  }
}

// Backend service
resource backend 'Microsoft.ApiManagement/service/backends@2023-05-01-preview' = {
  parent: apim
  name: 'user-service'
  properties: {
    url: 'https://backend.example.com'
    protocol: 'http'
    tls: {
      validateCertificateChain: true
      validateCertificateName: true
    }
  }
}

// API definition
resource api 'Microsoft.ApiManagement/service/apis@2023-05-01-preview' = {
  parent: apim
  name: 'users-api'
  properties: {
    displayName: 'Users API'
    path: 'users'
    protocols: ['https']
    subscriptionRequired: true
    serviceUrl: 'https://backend.example.com'
  }
}

// GET /users operation
resource getUsers 'Microsoft.ApiManagement/service/apis/operations@2023-05-01-preview' = {
  parent: api
  name: 'get-users'
  properties: {
    displayName: 'Get Users'
    method: 'GET'
    urlTemplate: '/'
    responses: [
      {
        statusCode: 200
        description: 'Success'
      }
    ]
  }
}

// Policy: JWT validation + rate limiting
resource policy 'Microsoft.ApiManagement/service/apis/operations/policies@2023-05-01-preview' = {
  parent: getUsers
  name: 'policy'
  properties: {
    value: '''
      <policies>
        <inbound>
          <base />
          <!-- JWT validation -->
          <validate-jwt header-name="Authorization" failed-validation-httpcode="401">
            <openid-config url="https://login.microsoftonline.com/{tenant-id}/v2.0/.well-known/openid-configuration" />
            <required-claims>
              <claim name="aud">
                <value>api.example.com</value>
              </claim>
            </required-claims>
          </validate-jwt>

          <!-- Rate limiting: 100 calls per minute -->
          <rate-limit calls="100" renewal-period="60" />

          <!-- Caching -->
          <cache-lookup vary-by-developer="false" vary-by-developer-groups="false" downstream-caching-type="none">
            <vary-by-query-parameter>limit</vary-by-query-parameter>
            <vary-by-query-parameter>offset</vary-by-query-parameter>
          </cache-lookup>

          <!-- Add correlation ID -->
          <set-header name="X-Correlation-ID" exists-action="override">
            <value>@(Guid.NewGuid().ToString())</value>
          </set-header>
        </inbound>

        <backend>
          <base />
        </backend>

        <outbound>
          <base />
          <!-- Cache response for 5 minutes -->
          <cache-store duration="300" />

          <!-- Remove internal headers -->
          <set-header name="X-Internal-Token" exists-action="delete" />
        </outbound>

        <on-error>
          <base />
        </on-error>
      </policies>
    '''
    format: 'xml'
  }
}

// Product (usage tier)
resource product 'Microsoft.ApiManagement/service/products@2023-05-01-preview' = {
  parent: apim
  name: 'standard'
  properties: {
    displayName: 'Standard Tier'
    description: 'Standard API access'
    subscriptionRequired: true
    approvalRequired: false
    state: 'published'
    subscriptionsLimit: 1000
  }
}

// Link API to product
resource productApi 'Microsoft.ApiManagement/service/products/apis@2023-05-01-preview' = {
  parent: product
  name: api.name
}

Implementation: NGINX as API Gateway

Configuration

# /etc/nginx/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 4096;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

log_format json_combined escape=json '{'
    '"time_local":"$time_local",'
    '"remote_addr":"$remote_addr",'
    '"request":"$request",'
    '"status":$status,'
    '"body_bytes_sent":$body_bytes_sent,'
    '"request_time":$request_time,'
    '"upstream_response_time":"$upstream_response_time",'
    '"upstream_addr":"$upstream_addr"'
'}';

access_log /var/log/nginx/access.log json_combined;

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;

# Rate limiting zones
limit_req_zone $binary_remote_addr zone=global:10m rate=100r/m;
limit_req_zone $http_authorization zone=per_user:10m rate=50r/m;

# Caching
proxy_cache_path /var/cache/nginx/api
    levels=1:2
    keys_zone=api_cache:10m
    max_size=1g
    inactive=60m
    use_temp_path=off;

# Upstream services
upstream user_service {
    least_conn;
    server backend1:3000 max_fails=3 fail_timeout=30s;
    server backend2:3000 max_fails=3 fail_timeout=30s;
    server backend3:3000 max_fails=3 fail_timeout=30s backup;
    keepalive 32;
}

# JWT validation (requires lua module)
lua_shared_dict jwks 1m;
lua_package_path "/etc/nginx/lua/?.lua;;";

server {
    listen 80;
    server_name api.example.com;

    # Redirect HTTP to HTTPS
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name api.example.com;

    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers on;

    # Security headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "DENY" always;
    add_header X-XSS-Protection "1; mode=block" always;

    # CORS
    add_header Access-Control-Allow-Origin "https://example.com" always;
    add_header Access-Control-Allow-Methods "GET, POST, PUT, DELETE, OPTIONS" always;
    add_header Access-Control-Allow-Headers "Authorization, Content-Type" always;
    add_header Access-Control-Max-Age "3600" always;

    if ($request_method = 'OPTIONS') {
        return 204;
    }

    # Health check endpoint
    location /health {
        access_log off;
        return 200 "healthy\n";
        add_header Content-Type text/plain;
    }

    # Users API
    location /users {
        # Rate limiting
        limit_req zone=global burst=20 nodelay;
        limit_req zone=per_user burst=10 nodelay;

        # JWT validation (Lua)
        access_by_lua_block {
            local jwt = require "resty.jwt"
            local jwt_token = ngx.var.http_authorization

            if not jwt_token then
                ngx.status = 401
                ngx.say('{"error":"Missing Authorization header"}')
                ngx.exit(401)
            end

            jwt_token = jwt_token:gsub("Bearer ", "")
            local jwt_obj = jwt:verify("your-secret-key", jwt_token)

            if not jwt_obj.verified then
                ngx.status = 401
                ngx.say('{"error":"Invalid token"}')
                ngx.exit(401)
            end
            end
        }

        # Caching
        proxy_cache api_cache;
        proxy_cache_key "$request_uri$http_authorization";
        proxy_cache_valid 200 5m;
        proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
        proxy_cache_background_update on;
        proxy_cache_lock on;
        add_header X-Cache-Status $upstream_cache_status;

        # Proxy settings
        proxy_pass http://user_service;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Request-ID $request_id;

        # Timeouts
        proxy_connect_timeout 5s;
        proxy_send_timeout 10s;
        proxy_read_timeout 10s;

        # Buffering
        proxy_buffering on;
        proxy_buffer_size 4k;
        proxy_buffers 8 4k;
    }
}

}

Security Best Practices

1. Authentication Strategies

API Keys (Simplest)

✅ Use for: Internal services, low-security public APIs
❌ Avoid for: User authentication, sensitive data

JWT (Recommended)

✅ Use for: User authentication, microservices
❌ Avoid for: Very high-security scenarios (use short-lived tokens)

OAuth 2.0 (Most Secure)

✅ Use for: Third-party integrations, delegated access
❌ Avoid for: Simple internal APIs (overkill)

mTLS (Maximum Security)

✅ Use for: Service-to-service, zero-trust networks
❌ Avoid for: Public-facing APIs (complexity)

2. Input Validation

# NGINX: Block malicious patterns
location /api {
    # Block SQL injection attempts
    if ($request_uri ~* "(union|select|insert|update|delete|drop|exec|script)") {
        return 403;
    }

    # Block path traversal
    if ($request_uri ~* "\.\./") {
        return 403;
    }

    # Limit request body size
    client_max_body_size 1m;

    proxy_pass http://backend;
}

3. DDoS Protection

Rate Limiting Tiers:

Global: 1000 req/s (all traffic)
Per-IP: 100 req/min
Per-User: 50 req/min
Per-Endpoint: Varies (e.g., /login → 5 req/min)

Circuit Breaking (Kong): note that open-source Kong does not bundle a circuit-breaker plugin, so the flags below depend on the third-party plugin you install; Kong's upstream passive health checks cover similar ground.

curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=circuit-breaker" \
  --data "config.failure_threshold=5" \
  --data "config.success_threshold=3" \
  --data "config.timeout=30"
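Whatever plugin provides it, the underlying state machine is the same: count consecutive failures, trip open at the failure threshold, allow trial requests after the timeout (half-open), and close again after enough successes. A minimal sketch (not Kong's implementation):

```javascript
// Minimal circuit breaker: closed -> open on repeated failures,
// open -> half-open after a cooldown, half-open -> closed on repeated successes.
class CircuitBreaker {
  constructor({ failureThreshold = 5, successThreshold = 3, timeoutMs = 30000 } = {}) {
    Object.assign(this, { failureThreshold, successThreshold, timeoutMs });
    this.state = 'closed';
    this.failures = 0;
    this.successes = 0;
    this.openedAt = 0;
  }
  canRequest(now = Date.now()) {
    if (this.state === 'open' && now - this.openedAt >= this.timeoutMs) {
      this.state = 'half-open'; // cooldown elapsed: let trial requests through
      this.successes = 0;
    }
    return this.state !== 'open';
  }
  recordSuccess() {
    if (this.state === 'half-open' && ++this.successes >= this.successThreshold) {
      this.state = 'closed';
    }
    this.failures = 0;
  }
  recordFailure(now = Date.now()) {
    if (this.state === 'half-open' || ++this.failures >= this.failureThreshold) {
      this.state = 'open'; // trip: reject requests until the cooldown passes
      this.openedAt = now;
      this.failures = 0;
    }
  }
}
```

Wrapping each upstream call in `canRequest` / `recordSuccess` / `recordFailure` keeps a dying backend from dragging the gateway down with it.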

4. Secrets Management

DON'T:

# Bad: Hardcoded secrets
environment:
  JWT_SECRET: "my-secret-123"
  DB_PASSWORD: "password"

DO:

# Good: Use secrets managers
environment:
  JWT_SECRET: ${AWS_SECRETS_MANAGER:prod/jwt-secret}
  DB_PASSWORD: ${VAULT:database/password}

Performance Optimization

1. Connection Pooling

Kong:

# /etc/kong/kong.conf
upstream_keepalive_pool_size = 100
upstream_keepalive_max_requests = 1000
upstream_keepalive_idle_timeout = 60

NGINX:

upstream backend {
    server backend:3000;
    keepalive 32;  # Connection pool
}

location / {
    proxy_http_version 1.1;
    proxy_set_header Connection "";  # enable upstream keepalive
    proxy_pass http://backend;
}

2. Response Compression

gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_types
    text/plain
    text/css
    text/xml
    text/javascript
    application/json
    application/javascript
    application/xml+rss;

3. Caching Strategies

Cache-Control Headers:

Public endpoints: Cache-Control: public, max-age=300
User-specific: Cache-Control: private, max-age=60
Dynamic data: Cache-Control: no-cache
Sensitive data: Cache-Control: no-store
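These directives can be attached per route at the gateway. A sketch of mapping endpoint classes to the headers above (the route classification is hypothetical):

```javascript
// Map endpoint classes to the Cache-Control values listed above.
const CACHE_POLICIES = {
  public: 'public, max-age=300',
  userSpecific: 'private, max-age=60',
  dynamic: 'no-cache',
  sensitive: 'no-store',
};

// Hypothetical route classification; a real gateway keys this off route config.
function cacheControlFor(path) {
  if (path.startsWith('/public')) return CACHE_POLICIES.public;
  if (path.startsWith('/me')) return CACHE_POLICIES.userSpecific;
  if (path.startsWith('/payments')) return CACHE_POLICIES.sensitive;
  return CACHE_POLICIES.dynamic;
}
```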

Cache Invalidation (Kong):

# Invalidate cache for specific endpoint
curl -X DELETE http://localhost:8001/proxy-cache/{cache-key}

# Purge all cached entries
curl -X DELETE http://localhost:8001/proxy-cache

4. Load Balancing Algorithms

Algorithm           Use Case
Round Robin         Equal server capacity
Least Connections   Varying request complexity
IP Hash             Session affinity needed
Weighted            Heterogeneous servers
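These algorithms differ only in how the next backend is picked. Two of them sketched (the backend shapes are illustrative):

```javascript
// Least-connections: route to the backend with the fewest in-flight requests.
function pickLeastConnections(backends) {
  return backends.reduce((best, b) => (b.active < best.active ? b : best));
}

// Weighted round-robin: expand each backend by its weight to build the rotation.
function weightedOrder(backends) {
  return backends.flatMap((b) => Array(b.weight).fill(b.host));
}

const backends = [
  { host: 'backend1:3000', active: 4, weight: 2 },
  { host: 'backend2:3000', active: 1, weight: 1 },
];
// pickLeastConnections favors backend2 (1 active vs 4);
// weightedOrder sends backend1 twice as much traffic per rotation.
```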

Monitoring & Observability

Metrics to Track

Gateway Metrics:

  • Request rate (req/s)
  • Error rate (4xx, 5xx %)
  • Latency (p50, p95, p99)
  • Cache hit rate

Upstream Metrics:

  • Response time
  • Failure rate
  • Circuit breaker state

Resource Metrics:

  • CPU usage
  • Memory usage
  • Connection pool exhaustion

Prometheus Integration (Kong)

# kong.yml
plugins:
  - name: prometheus
    config:
      per_consumer: true

Query Examples:

# Request rate by service
rate(kong_http_requests_total[5m])

# Error rate (share of 5xx responses)
rate(kong_http_requests_total{code=~"5.."}[5m]) / rate(kong_http_requests_total[5m])

# Latency (95th percentile)
histogram_quantile(0.95, rate(kong_latency_bucket[5m]))

# Cache hit rate
rate(kong_bandwidth{type="cache-hit"}[5m]) / rate(kong_bandwidth[5m])

Distributed Tracing

OpenTelemetry Integration:

# AWS API Gateway
resource "aws_api_gateway_stage" "prod" {
  xray_tracing_enabled = true
}

# Kong
plugins:
  - name: opentelemetry
    config:
      endpoint: "http://jaeger:4318/v1/traces"
      resource_attributes:
        service.name: "api-gateway"

Alerting Rules

# Prometheus alerts
groups:
  - name: api_gateway
    rules:
      - alert: HighErrorRate
        expr: rate(kong_http_requests_total{code=~"5.."}[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"
      - alert: HighLatency
        expr: histogram_quantile(0.95, rate(kong_latency_bucket[5m])) > 1000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High latency (p95 > 1s)"

      - alert: LowCacheHitRate
        expr: rate(kong_bandwidth{type="cache-hit"}[10m]) / rate(kong_bandwidth[10m]) < 0.5
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Cache hit rate below 50%"

Production Checklist

Security

  • HTTPS/TLS 1.3 enabled
  • Strong authentication (JWT/OAuth/mTLS)
  • Rate limiting configured
  • Input validation active
  • CORS properly configured
  • Security headers added
  • Secrets in vault (not config files)
  • API keys rotated regularly

Performance

  • Connection pooling enabled
  • Response caching configured
  • Compression enabled
  • Load balancing configured
  • Timeouts set appropriately
  • Circuit breakers implemented
  • Health checks configured

Observability

  • Access logs enabled
  • Metrics collection configured
  • Distributed tracing active
  • Dashboards created
  • Alerts configured
  • Log aggregation setup

High Availability

  • Multiple gateway instances
  • Database/state replication
  • Health checks implemented
  • Graceful shutdown configured
  • Auto-scaling enabled
  • Disaster recovery plan

Operations

  • CI/CD pipeline configured
  • Blue-green deployment ready
  • Rollback procedure documented
  • Runbooks created
  • On-call rotation setup

Cost Optimization

AWS API Gateway Pricing (2025)

REST API:

  • $3.50 per million requests
  • $0.09/GB data transfer out
  • Cache: $0.020/hour per GB

HTTP API (70% cheaper):

  • $1.00 per million requests
  • No caching support
  • Use for: Simple proxying

Self-Hosted (Kong/NGINX)

Costs:

  • EC2 instances: ~$150/month (t3.large x2)
  • Load balancer: ~$20/month
  • Database: ~$50/month (RDS t3.small)

Break-even point: ~50M requests/month
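A rough sanity check on that figure, using only the prices listed above (data-transfer and cache-hour charges, which grow with traffic, pull the practical crossover below the raw request-price number):

```javascript
// Fixed self-hosted cost vs. per-request managed cost, from the figures above.
const selfHostedMonthly = 150 + 20 + 50; // EC2 + load balancer + RDS = $220/month
const restApiPerMillion = 3.5;           // AWS REST API price per million requests

// Requests/month where managed request charges alone equal the fixed cost:
const breakEvenMillions = selfHostedMonthly / restApiPerMillion; // ~63M requests
// Adding $0.09/GB transfer and cache-hours moves this toward the ~50M estimate.
```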

Conclusion

API gateways are critical infrastructure for modern applications. They centralize cross-cutting concerns, improve security, enable observability, and reduce backend complexity.

Key takeaways:

  1. Choose the right pattern - Single gateway for simplicity, BFF for client optimization, micro-gateways for scale
  2. Security is paramount - Use strong authentication, validate inputs, implement rate limiting
  3. Cache aggressively - 5-minute caching can reduce load by 80%+
  4. Monitor everything - Track gateway, upstream, and resource metrics
  5. Plan for failure - Circuit breakers, health checks, graceful degradation
  6. Start simple - Add complexity only when needed

Whether you choose a managed solution (AWS API Gateway, Azure APIM) or self-hosted (Kong, NGINX), the principles remain the same: security, performance, and reliability.

Additional Resources

  • Kong Gateway: https://docs.konghq.com/
  • AWS API Gateway: https://docs.aws.amazon.com/apigateway/
  • Azure API Management: https://docs.microsoft.com/azure/api-management/
  • NGINX: https://nginx.org/en/docs/
  • OpenAPI Specification: https://spec.openapis.org/
  • OAuth 2.0: https://oauth.net/2/
  • JWT Best Practices: https://datatracker.ietf.org/doc/html/rfc8725


Written by StaticBlock Editorial

StaticBlock Editorial is a technical writer and software engineer specializing in web development, performance optimization, and developer tooling.