Caching Strategies for High-Performance Applications - Production Implementation Guide
Master caching strategies with Redis, CDN edge caching, HTTP caching headers, application-level caching, cache invalidation patterns, and performance optimization for production applications.
Caching is one of the most effective performance optimization techniques available, reducing database load by 80-95% while delivering sub-millisecond response times. This comprehensive guide covers production-ready caching strategies from application-level caching to CDN edge distribution, with real-world implementation patterns used by high-traffic applications.
Why Caching Matters
Performance Impact: Properly implemented caching reduces API response times from 200-500ms to 5-15ms, improving user experience and conversion rates.
Cost Reduction: Caching reduces database queries by 90%+, allowing applications to handle 10x more traffic with the same infrastructure.
Scalability: Cache-first architectures enable horizontal scaling, with cached content served from memory instead of expensive database operations.
Twitter's timeline cache reduced database load from 500,000 queries/second to 50,000 queries/second, saving millions in infrastructure costs while improving performance.
Cache Levels and Hierarchy
Modern applications implement caching at multiple levels:
1. Browser Cache
HTTP caching headers control client-side caching, eliminating network requests entirely.
2. CDN/Edge Cache
Content delivery networks cache static assets and API responses at edge locations globally.
3. Application Cache
In-memory caching (Redis, Memcached) stores frequently accessed data close to application servers.
4. Database Query Cache
Database engines cache query results and execution plans.
Redis for Application Caching
Redis provides in-memory key-value storage with millisecond latency, supporting complex data structures and atomic operations.
Basic Redis Caching Pattern
import Redis from 'ioredis';
const redis = new Redis({
host: process.env.REDIS_HOST || 'localhost',
port: parseInt(process.env.REDIS_PORT || '6379'),
password: process.env.REDIS_PASSWORD,
retryStrategy: (times) => {
const delay = Math.min(times * 50, 2000);
return delay;
}
});
// Generic cache wrapper
async function cached<T>(
key: string,
ttl: number,
fetchFunction: () => Promise<T>
): Promise<T> {
// Try to get from cache
const cached = await redis.get(key);
if (cached) {
return JSON.parse(cached);
}
// Cache miss - fetch data
const data = await fetchFunction();
// Store in cache with TTL
await redis.setex(key, ttl, JSON.stringify(data));
return data;
}
// Usage example
async function getUser(userId: string) {
return cached(
`user:${userId}`,
3600, // 1 hour TTL
async () => {
const user = await prisma.user.findUnique({
where: { id: userId },
include: { profile: true }
});
return user;
}
);
}
// Express endpoint with caching
app.get('/api/users/:id', async (req, res) => {
const user = await getUser(req.params.id);
if (!user) {
return res.status(404).json({ error: 'User not found' });
}
res.json(user);
});
Cache-Aside Pattern
The most common caching pattern - application checks cache first, falls back to database on miss:
class UserService {
private readonly CACHE_TTL = 3600; // 1 hour
async findById(userId: string): Promise<User | null> {
const cacheKey = `user:${userId}`;
// 1. Try cache first
const cached = await redis.get(cacheKey);
if (cached) {
console.log(`Cache hit: ${cacheKey}`);
return JSON.parse(cached);
}
console.log(`Cache miss: ${cacheKey}`);
// 2. Query database
const user = await prisma.user.findUnique({
where: { id: userId },
include: { profile: true, settings: true }
});
if (!user) {
return null;
}
// 3. Store in cache
await redis.setex(cacheKey, this.CACHE_TTL, JSON.stringify(user));
return user;
}
async update(userId: string, data: UpdateUserDto): Promise<User> {
// 1. Update database
const user = await prisma.user.update({
where: { id: userId },
data
});
// 2. Invalidate cache
await redis.del(`user:${userId}`);
return user;
}
}
Write-Through Cache
Data is written to cache and database simultaneously, ensuring cache is always up-to-date:
class ProductService {
async create(data: CreateProductDto): Promise<Product> {
// 1. Create in database
const product = await prisma.product.create({ data });
// 2. Write to cache immediately
const cacheKey = `product:${product.id}`;
await redis.setex(cacheKey, 3600, JSON.stringify(product));
return product;
}
async update(productId: string, data: UpdateProductDto): Promise<Product> {
// 1. Update database
const product = await prisma.product.update({
where: { id: productId },
data
});
// 2. Update cache
const cacheKey = `product:${productId}`;
await redis.setex(cacheKey, 3600, JSON.stringify(product));
return product;
}
}
Write-Behind (Write-Back) Cache
Writes go to cache first, then asynchronously to database - maximizes write performance:
class EventLogger {
private writeQueue: Set<string> = new Set();
private flushInterval: NodeJS.Timeout;
constructor() {
// Flush queue every 10 seconds
this.flushInterval = setInterval(() => this.flush(), 10000);
}
async logEvent(event: Event): Promise<void> {
const eventId = generateId();
const cacheKey = `event:${eventId}`;
// 1. Write to cache immediately
await redis.setex(cacheKey, 86400, JSON.stringify(event));
// 2. Add to write queue
this.writeQueue.add(eventId);
}
private async flush(): Promise<void> {
if (this.writeQueue.size === 0) return;
const eventIds = Array.from(this.writeQueue);
this.writeQueue.clear();
try {
// Read queued events back from cache
const events = await Promise.all(
eventIds.map(id => redis.get(`event:${id}`))
);
// Batch write to database
await prisma.event.createMany({
data: events
.filter((e): e is string => e !== null)
.map(e => JSON.parse(e))
});
} catch (error) {
// Re-queue the IDs so the events are retried on the next flush
eventIds.forEach(id => this.writeQueue.add(id));
console.error('Write-behind flush failed, events re-queued', error);
}
}
}
Warning: Write-behind caching risks data loss if cache fails before database write. Only use for non-critical data like analytics events.
Advanced Redis Patterns
Caching Lists with Pagination
async function getPostsFeed(userId: string, page: number = 1, limit: number = 20) {
const cacheKey = `feed:${userId}`;
// Try to get from sorted set
const start = (page - 1) * limit;
const end = start + limit - 1;
const cached = await redis.zrevrange(cacheKey, start, end);
if (cached.length > 0) {
// Cache hit - fetch full post data
const posts = await Promise.all(
cached.map(id => getPost(id))
);
return posts;
}
// Cache miss - fetch from database
const posts = await prisma.post.findMany({
where: {
authorId: userId
},
orderBy: { createdAt: 'desc' },
take: 100 // Cache first 100 posts
});
// Store in sorted set with timestamps as scores
const pipeline = redis.pipeline();
posts.forEach(post => {
pipeline.zadd(
cacheKey,
post.createdAt.getTime(),
post.id
);
});
pipeline.expire(cacheKey, 3600);
await pipeline.exec();
return posts.slice(start, end + 1);
}
Caching with Tags for Invalidation
class CacheManager {
async set(key: string, value: any, ttl: number, tags: string[] = []): Promise<void> {
const pipeline = redis.pipeline();
// Store value
pipeline.setex(key, ttl, JSON.stringify(value));
// Store tags
tags.forEach(tag => {
pipeline.sadd(`tag:${tag}`, key);
pipeline.expire(`tag:${tag}`, ttl);
});
await pipeline.exec();
}
async invalidateTag(tag: string): Promise<void> {
// Get all keys with this tag
const keys = await redis.smembers(`tag:${tag}`);
if (keys.length === 0) return;
// Delete all keys
const pipeline = redis.pipeline();
keys.forEach(key => pipeline.del(key));
pipeline.del(`tag:${tag}`);
await pipeline.exec();
console.log(`Invalidated ${keys.length} cache entries for tag: ${tag}`);
}
}
// Usage
const cacheManager = new CacheManager();
// Cache product with tags
await cacheManager.set(
'product:123',
product,
3600,
['products', 'category:electronics', 'brand:apple']
);
// Invalidate all products in a category
await cacheManager.invalidateTag('category:electronics');
Distributed Cache Lock Pattern
Prevent cache stampede - multiple requests fetching same data simultaneously:
class DistributedCache {
async getOrSet<T>(
key: string,
ttl: number,
fetchFunction: () => Promise<T>
): Promise<T> {
// Try cache first
const cached = await redis.get(key);
if (cached) {
return JSON.parse(cached);
}
const lockKey = `lock:${key}`;
const lockTTL = 10; // 10 seconds
// Try to acquire lock
const acquired = await redis.set(
lockKey,
'1',
'EX',
lockTTL,
'NX' // Only set if not exists
);
if (acquired) {
try {
// We have the lock - fetch data
const data = await fetchFunction();
// Store in cache
await redis.setex(key, ttl, JSON.stringify(data));
return data;
} finally {
// Release lock
await redis.del(lockKey);
}
} else {
// Another process is fetching - wait and retry
await new Promise(resolve => setTimeout(resolve, 100));
return this.getOrSet(key, ttl, fetchFunction);
}
}
}
// Usage
const cache = new DistributedCache();
app.get('/api/expensive-data', async (req, res) => {
const data = await cache.getOrSet(
'expensive-data',
3600,
async () => {
// This expensive operation only runs once
// even with 1000 concurrent requests
return await performExpensiveOperation();
}
);
res.json(data);
});
HTTP Caching
HTTP caching headers enable browser and CDN caching without application code changes.
Cache-Control Headers
// Static assets - cache for 1 year
app.use('/static', express.static('public', {
maxAge: '365d',
immutable: true
}));
// API responses - private, must revalidate
app.get('/api/user/profile', async (req, res) => {
const profile = await getUserProfile(req.user.id);
res.set({
'Cache-Control': 'private, max-age=300, must-revalidate',
'Vary': 'Authorization'
});
res.json(profile);
});
// Public content - cache at CDN
app.get('/api/posts/:id', async (req, res) => {
const post = await getPost(req.params.id);
res.set({
'Cache-Control': 'public, max-age=3600, s-maxage=86400',
'Vary': 'Accept-Encoding'
});
res.json(post);
});
Cache-Control directives:
- public: Can be cached by CDN and browsers
- private: Only browser cache (not shared caches)
- max-age=N: Cache for N seconds
- s-maxage=N: CDN cache duration (overrides max-age)
- must-revalidate: Must check with server after expiration
- no-cache: Must revalidate before using cached response
- no-store: Never cache (sensitive data)
- immutable: Content never changes (versioned assets)
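These directives can be centralized so all handlers stay consistent. A minimal sketch (the content-class names here are illustrative choices, not part of any standard):

```typescript
// Illustrative mapping from content class to Cache-Control value, using the
// directives above. The class names are ours, not part of the HTTP spec.
type ContentClass = 'immutable-asset' | 'private-api' | 'public-api' | 'sensitive';

const CACHE_POLICIES: Record<ContentClass, string> = {
  // Versioned static files never change at a given URL
  'immutable-asset': 'public, max-age=31536000, immutable',
  // Per-user data: browser cache only, revalidate after expiry
  'private-api': 'private, max-age=300, must-revalidate',
  // Shareable data: browsers for 1 hour, CDN for 24 hours
  'public-api': 'public, max-age=3600, s-maxage=86400',
  // Tokens, payment data: never cache anywhere
  'sensitive': 'no-store',
};

function cacheControlFor(kind: ContentClass): string {
  return CACHE_POLICIES[kind];
}
```

A handler would then call `res.set('Cache-Control', cacheControlFor('public-api'))` instead of repeating directive strings.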
ETags for Conditional Requests
import crypto from 'crypto';
function generateETag(data: any): string {
return crypto
.createHash('md5')
.update(JSON.stringify(data))
.digest('hex');
}
app.get('/api/posts/:id', async (req, res) => {
const post = await getPost(req.params.id);
if (!post) {
return res.status(404).json({ error: 'Not found' });
}
const etag = generateETag(post);
// Check If-None-Match header
if (req.headers['if-none-match'] === etag) {
// Content hasn't changed
return res.status(304).end();
}
res.set({
'ETag': etag,
'Cache-Control': 'public, max-age=60'
});
res.json(post);
});
ETags can dramatically reduce bandwidth when content hasn't changed, since the server returns only a 304 status with no response body.
Last-Modified Headers
app.get('/api/posts/:id', async (req, res) => {
const post = await getPost(req.params.id);
if (!post) {
return res.status(404).json({ error: 'Not found' });
}
const lastModified = post.updatedAt.toUTCString();
// Check If-Modified-Since header
if (req.headers['if-modified-since'] === lastModified) {
return res.status(304).end();
}
res.set({
'Last-Modified': lastModified,
'Cache-Control': 'public, max-age=3600'
});
res.json(post);
});
CDN Edge Caching
CDNs cache content at edge locations globally, reducing latency by serving from the nearest location.
Cloudflare Workers Cache API
// Cloudflare Worker
export default {
async fetch(request: Request): Promise<Response> {
const cache = caches.default;
const cacheKey = new Request(request.url, request);
// Check cache
let response = await cache.match(cacheKey);
if (response) {
console.log('Cache hit at edge');
return response;
}
// Cache miss - fetch from origin
response = await fetch(request);
// Don't cache errors
if (response.ok) {
// Clone response before caching
const responseToCache = response.clone();
// Cache for 1 hour
const headers = new Headers(responseToCache.headers);
headers.set('Cache-Control', 'public, max-age=3600');
const cachedResponse = new Response(
responseToCache.body,
{
status: responseToCache.status,
statusText: responseToCache.statusText,
headers
}
);
await cache.put(cacheKey, cachedResponse);
}
return response;
}
};
Stale-While-Revalidate
Serve stale content immediately while fetching fresh data in background:
app.get('/api/news', async (req, res) => {
const news = await getLatestNews();
res.set({
// Cache for 1 minute, serve stale for 1 hour while revalidating
'Cache-Control': 'public, max-age=60, stale-while-revalidate=3600'
});
res.json(news);
});
Browsers and CDNs serve stale content instantly while fetching updates, providing instant loads with eventual consistency.
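The same decision logic applies if you implement stale-while-revalidate at the application level. A sketch of the three cache states, with illustrative names (`CacheState`, `swrState` are ours):

```typescript
// States a cached entry passes through under stale-while-revalidate,
// based on its age relative to the two directives.
type CacheState = 'fresh' | 'stale-revalidate' | 'expired';

function swrState(
  ageSeconds: number,           // how long ago the response was cached
  maxAge: number,               // max-age directive value
  staleWhileRevalidate: number  // stale-while-revalidate directive value
): CacheState {
  if (ageSeconds <= maxAge) {
    return 'fresh'; // serve from cache, no network activity
  }
  if (ageSeconds <= maxAge + staleWhileRevalidate) {
    return 'stale-revalidate'; // serve stale now, refresh in background
  }
  return 'expired'; // must fetch synchronously before responding
}
```

With the header from the example above (`max-age=60, stale-while-revalidate=3600`), a 2-minute-old entry is served instantly while a background refresh runs; only after ~61 minutes does a request block on the origin.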
Cache Invalidation Strategies
"There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton
Time-Based Expiration (TTL)
Simplest strategy - cache expires after fixed duration:
// Short TTL for frequently changing data
await redis.setex('trending:posts', 60, data); // 1 minute
// Medium TTL for semi-static data
await redis.setex('user:profile', 3600, data); // 1 hour
// Long TTL for rarely changing data
await redis.setex('config:settings', 86400, data); // 24 hours
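One refinement worth considering (our suggestion, not part of the snippets above) is adding random jitter to TTLs, so that keys written at the same moment don't all expire together and hit the database in a burst:

```typescript
// Add up to jitterRatio of extra lifetime so simultaneous writes
// don't produce simultaneous expirations.
function jitteredTTL(baseTTL: number, jitterRatio = 0.1): number {
  // Result is in [baseTTL, baseTTL * (1 + jitterRatio)]
  const jitter = Math.floor(Math.random() * baseTTL * jitterRatio);
  return baseTTL + jitter;
}

// e.g. await redis.setex('user:profile', jitteredTTL(3600), data);
```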
Event-Based Invalidation
Invalidate cache when data changes:
class PostService {
async update(postId: string, data: UpdatePostDto): Promise<Post> {
const post = await prisma.post.update({
where: { id: postId },
data
});
// Invalidate related caches
await this.invalidatePostCaches(postId, post.authorId);
return post;
}
private async invalidatePostCaches(postId: string, authorId: string): Promise<void> {
// Look up tags before deleting, so findById doesn't repopulate the cache
// mid-invalidation (and handle the post no longer existing)
const post = await this.findById(postId);
const pipeline = redis.pipeline();
// Invalidate specific post
pipeline.del(`post:${postId}`);
// Invalidate author's posts list
pipeline.del(`posts:author:${authorId}`);
// Invalidate trending posts in case it was trending
pipeline.del('trending:posts');
// Invalidate tag-based caches
post?.tags.forEach(tag => {
pipeline.del(`posts:tag:${tag}`);
});
await pipeline.exec();
}
}
Cache Versioning
Version cache keys to invalidate all related caches:
class VersionedCache {
private async getVersion(namespace: string): Promise<number> {
const version = await redis.get(`version:${namespace}`);
return version ? parseInt(version) : 1;
}
async set(namespace: string, key: string, value: any, ttl: number): Promise<void> {
const version = await this.getVersion(namespace);
const versionedKey = `${namespace}:v${version}:${key}`;
await redis.setex(versionedKey, ttl, JSON.stringify(value));
}
async get(namespace: string, key: string): Promise<any> {
const version = await this.getVersion(namespace);
const versionedKey = `${namespace}:v${version}:${key}`;
const data = await redis.get(versionedKey);
return data ? JSON.parse(data) : null;
}
async invalidateNamespace(namespace: string): Promise<void> {
// Increment version - all old keys become inaccessible
await redis.incr(`version:${namespace}`);
}
}
// Usage
const cache = new VersionedCache();
await cache.set('products', '123', product, 3600);
// Later - invalidate all product caches
await cache.invalidateNamespace('products');
Pub/Sub for Distributed Invalidation
Notify all application servers when cache is invalidated:
// Publisher (when data changes)
async function invalidateProductCache(productId: string): Promise<void> {
await redis.publish('cache:invalidate', JSON.stringify({
type: 'product',
id: productId
}));
}
// Subscriber (on each application server)
const subscriber = new Redis();
subscriber.subscribe('cache:invalidate');
subscriber.on('message', async (channel, message) => {
const event = JSON.parse(message);
if (event.type === 'product') {
// Remove from local cache
await redis.del(`product:${event.id}`);
console.log(`Invalidated product cache: ${event.id}`);
}
});
Performance Optimization Patterns
Lazy Loading with Caching
class LazyCache<T> {
private cache = new Map<string, Promise<T>>();
async get(key: string, loader: () => Promise<T>): Promise<T> {
if (this.cache.has(key)) {
return this.cache.get(key)!;
}
const promise = loader();
this.cache.set(key, promise);
try {
const value = await promise;
return value;
} catch (error) {
// Remove failed promise from cache
this.cache.delete(key);
throw error;
}
}
clear(key?: string): void {
if (key) {
this.cache.delete(key);
} else {
this.cache.clear();
}
}
}
// Usage
const configCache = new LazyCache<Config>();
async function getConfig(): Promise<Config> {
return configCache.get('config', async () => {
console.log('Loading config from database...');
return await prisma.config.findFirst();
});
}
Multi-Level Caching
Combine in-memory and Redis caching for optimal performance:
import NodeCache from 'node-cache';
class MultiLevelCache {
private memoryCache = new NodeCache({ stdTTL: 60 });
async get<T>(key: string, loader: () => Promise<T>, ttl: number = 3600): Promise<T> {
// L1: Check memory cache (fastest)
const memCached = this.memoryCache.get<T>(key);
if (memCached !== undefined) {
console.log(`L1 cache hit: ${key}`);
return memCached;
}
// L2: Check Redis (fast)
const redisCached = await redis.get(key);
if (redisCached) {
console.log(`L2 cache hit: ${key}`);
const data = JSON.parse(redisCached);
// Populate L1 cache
this.memoryCache.set(key, data, 60);
return data;
}
// L3: Load from source (slow)
console.log(`Cache miss: ${key}`);
const data = await loader();
// Populate both caches
this.memoryCache.set(key, data, 60);
await redis.setex(key, ttl, JSON.stringify(data));
return data;
}
async invalidate(key: string): Promise<void> {
this.memoryCache.del(key);
await redis.del(key);
}
}
Memory cache provides sub-millisecond access for hot data, Redis handles shared state across servers, reducing database load by 95%+.
Probabilistic Early Expiration
Prevent thundering herd by randomly expiring cache entries before TTL:
async function get<T>(
key: string,
ttl: number,
loader: () => Promise<T>
): Promise<T> {
const cached = await redis.get(key);
if (cached) {
const data = JSON.parse(cached);
// Get remaining TTL
const remainingTTL = await redis.ttl(key);
// Probabilistically refresh before expiration:
// the closer remainingTTL is to 0, the more likely a refresh
const beta = 1.0;
const shouldRefresh = Math.random() * ttl * beta > remainingTTL;
if (shouldRefresh) {
// Asynchronously refresh cache; on failure, stale data is still served
loader()
.then(freshData => redis.setex(key, ttl, JSON.stringify(freshData)))
.catch(err => console.error(`Cache refresh failed: ${key}`, err));
}
return data;
}
// Cache miss - fetch and store
const data = await loader();
await redis.setex(key, ttl, JSON.stringify(data));
return data;
}
This prevents multiple processes from simultaneously refreshing expired cache, distributing load over time.
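The refresh decision can be factored into a pure function. This sketch uses a simplified linear rule (the full "XFetch" algorithm from the probabilistic early expiration literature uses an exponential term); `rand` is injected so the behavior is deterministic in tests:

```typescript
// Decide whether to refresh a cache entry before it expires.
// Probability of refreshing rises linearly as remainingTTL approaches 0.
function shouldRefreshEarly(
  remainingTTL: number,             // seconds until the key expires
  ttl: number,                      // full TTL the key was written with
  beta = 1.0,                       // higher beta = more eager refreshing
  rand: () => number = Math.random
): boolean {
  return rand() * ttl * beta > remainingTTL;
}
```

A freshly written key (remainingTTL near ttl) is almost never refreshed early, while a key about to expire almost always triggers a background refresh.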
Real-World Examples
Reddit's Caching Architecture
Reddit uses multi-level caching:
- Memcached: Primary cache for hot data (votes, comments)
- Cassandra: Secondary cache for less frequently accessed data
- CDN: Edge caching for static assets and public pages
Their caching strategy reduced database queries by 99%, handling 50M+ daily active users with a fraction of the database capacity that would otherwise be required.
Instagram's Feed Caching
Instagram caches user feeds in Redis with sliding window:
- Store last 1000 posts in sorted set (by timestamp)
- Materialize feeds asynchronously when new posts are published
- Serve first page from cache (instant load)
- Fetch older posts on scroll (lazy loading)
This approach reduced feed load time from 800ms to 50ms while serving roughly 2 billion monthly active users.
Shopify's Fragment Caching
Shopify caches page fragments separately with different TTLs:
- Product images: 1 year (immutable)
- Product price/inventory: 5 minutes (changes frequently)
- Product reviews: 1 hour (changes occasionally)
- Page layout: 24 hours (rarely changes)
Fragment caching enabled 97% cache hit rate, serving Black Friday traffic (10M+ concurrent users) without database scaling.
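Fragment caching can be sketched with a small in-memory store (Shopify's actual implementation isn't public, so the names and structure here are purely illustrative):

```typescript
// Minimal fragment cache with a per-fragment TTL, as in the list above.
// `now` is a parameter so expiry is deterministic in tests.
interface Fragment {
  html: string;
  expiresAt: number; // epoch milliseconds
}

class FragmentCache {
  private store = new Map<string, Fragment>();

  set(key: string, html: string, ttlSeconds: number, now = Date.now()): void {
    this.store.set(key, { html, expiresAt: now + ttlSeconds * 1000 });
  }

  get(key: string, now = Date.now()): string | null {
    const frag = this.store.get(key);
    if (!frag || frag.expiresAt <= now) return null; // missing or expired
    return frag.html;
  }
}

// Different TTLs per fragment class, mirroring the list above
const fragments = new FragmentCache();
fragments.set('product:1:image', '<img src="/img/p1.jpg">', 31536000); // ~1 year
fragments.set('product:1:price', '<span>$99</span>', 300);             // 5 minutes
fragments.set('product:1:reviews', '<ul>...</ul>', 3600);              // 1 hour
```

A page renderer then assembles the response from fragments, re-rendering only those that have expired instead of the whole page.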
Conclusion
Effective caching reduces response times by 90%+ while decreasing database load, enabling applications to scale horizontally with minimal cost increase. Implement multi-level caching with Redis for application data, HTTP headers for browser/CDN caching, and proper invalidation strategies to maintain data consistency.
Key patterns - cache-aside for simplicity, write-through for consistency, distributed locks for stampede prevention, and tag-based invalidation for complex dependencies - provide building blocks for any caching architecture. Monitor cache hit rates (target 90%+) and adjust TTLs based on data change frequency to maximize performance gains while maintaining freshness.
Written by StaticBlock Editorial
StaticBlock Editorial is a technical writer and software engineer specializing in web development, performance optimization, and developer tooling.