Caching Strategies for High-Performance Applications - Production Implementation Guide
Master caching strategies with Redis, CDN edge caching, HTTP caching headers, application-level caching, cache invalidation patterns, and performance optimization for production applications.
Caching is one of the most effective performance optimization techniques available, reducing database load by 80-95% while delivering sub-millisecond response times. This comprehensive guide covers production-ready caching strategies from application-level caching to CDN edge distribution, with real-world implementation patterns used by high-traffic applications.
Why Caching Matters
Performance Impact: Properly implemented caching reduces API response times from 200-500ms to 5-15ms, improving user experience and conversion rates.
Cost Reduction: Caching reduces database queries by 90%+, allowing applications to handle 10x more traffic with the same infrastructure.
Scalability: Cache-first architectures enable horizontal scaling, with cached content served from memory instead of expensive database operations.
Twitter's timeline cache reduced database load from 500,000 queries/second to 50,000 queries/second, saving millions in infrastructure costs while improving performance.
Cache Levels and Hierarchy
Modern applications implement caching at multiple levels:
1. Browser Cache
HTTP caching headers control client-side caching, eliminating network requests entirely.
2. CDN/Edge Cache
Content delivery networks cache static assets and API responses at edge locations globally.
3. Application Cache
In-memory caching (Redis, Memcached) stores frequently accessed data close to application servers.
4. Database Query Cache
Database engines cache query results and execution plans.
Redis for Application Caching
Redis provides in-memory key-value storage with millisecond latency, supporting complex data structures and atomic operations.
Basic Redis Caching Pattern
import Redis from 'ioredis';
const redis = new Redis({
host: process.env.REDIS_HOST || 'localhost',
port: parseInt(process.env.REDIS_PORT || '6379'),
password: process.env.REDIS_PASSWORD,
retryStrategy: (times) => {
const delay = Math.min(times * 50, 2000);
return delay;
}
});
// Generic cache wrapper
async function cached<T>(
key: string,
ttl: number,
fetchFunction: () => Promise<T>
): Promise<T> {
// Try to get from cache
const cached = await redis.get(key);
if (cached) {
return JSON.parse(cached);
}
// Cache miss - fetch data
const data = await fetchFunction();
// Store in cache with TTL
await redis.setex(key, ttl, JSON.stringify(data));
return data;
}
// Usage example
async function getUser(userId: string) {
return cached(
`user:${userId}`,
3600, // 1 hour TTL
async () => {
const user = await prisma.user.findUnique({
where: { id: userId },
include: { profile: true }
});
return user;
}
);
}
// Express endpoint with caching
app.get('/api/users/:id', async (req, res) => {
const user = await getUser(req.params.id);
if (!user) {
return res.status(404).json({ error: 'User not found' });
}
res.json(user);
});
Cache-Aside Pattern
The most common caching pattern - application checks cache first, falls back to database on miss:
class UserService {
private readonly CACHE_TTL = 3600; // 1 hour
async findById(userId: string): Promise<User | null> {
const cacheKey = `user:${userId}`;
// 1. Try cache first
const cached = await redis.get(cacheKey);
if (cached) {
console.log(`Cache hit: ${cacheKey}`);
return JSON.parse(cached);
}
console.log(`Cache miss: ${cacheKey}`);
// 2. Query database
const user = await prisma.user.findUnique({
where: { id: userId },
include: { profile: true, settings: true }
});
if (!user) {
return null;
}
// 3. Store in cache
await redis.setex(cacheKey, this.CACHE_TTL, JSON.stringify(user));
return user;
}
async update(userId: string, data: UpdateUserDto): Promise<User> {
// 1. Update database
const user = await prisma.user.update({
where: { id: userId },
data
});
// 2. Invalidate cache
await redis.del(`user:${userId}`);
return user;
}
}
Write-Through Cache
Data is written to cache and database simultaneously, ensuring cache is always up-to-date:
class ProductService {
async create(data: CreateProductDto): Promise<Product> {
// 1. Create in database
const product = await prisma.product.create({ data });
// 2. Write to cache immediately
const cacheKey = `product:${product.id}`;
await redis.setex(cacheKey, 3600, JSON.stringify(product));
return product;
}
async update(productId: string, data: UpdateProductDto): Promise<Product> {
// 1. Update database
const product = await prisma.product.update({
where: { id: productId },
data
});
// 2. Update cache
const cacheKey = `product:${productId}`;
await redis.setex(cacheKey, 3600, JSON.stringify(product));
return product;
}
}
Write-Behind (Write-Back) Cache
Writes go to cache first, then asynchronously to database - maximizes write performance:
class EventLogger {
private writeQueue: Set<string> = new Set();
private flushInterval: NodeJS.Timeout;
constructor() {
// Flush queue every 10 seconds
this.flushInterval = setInterval(() => this.flush(), 10000);
}
async logEvent(event: Event): Promise<void> {
const eventId = generateId();
const cacheKey = `event:${eventId}`;
// 1. Write to cache immediately
await redis.setex(cacheKey, 86400, JSON.stringify(event));
// 2. Add to write queue
this.writeQueue.add(eventId);
}
private async flush(): Promise<void> {
if (this.writeQueue.size === 0) return;
const eventIds = Array.from(this.writeQueue);
this.writeQueue.clear();
try {
// Read queued events back from cache
const events = await Promise.all(
eventIds.map(id => redis.get(`event:${id}`))
);
// Batch write to database
await prisma.event.createMany({
data: events
.filter((e): e is string => e !== null)
.map(e => JSON.parse(e))
});
} catch (error) {
// Re-queue the IDs so the events are retried on the next flush
eventIds.forEach(id => this.writeQueue.add(id));
console.error('Write-behind flush failed, events re-queued', error);
}
}
}
Warning: Write-behind caching risks data loss if cache fails before database write. Only use for non-critical data like analytics events.
Advanced Redis Patterns
Caching Lists with Pagination
async function getPostsFeed(userId: string, page: number = 1, limit: number = 20) {
const cacheKey = `feed:${userId}`;
// Try to get from sorted set
const start = (page - 1) * limit;
const end = start + limit - 1;
const cached = await redis.zrevrange(cacheKey, start, end);
if (cached.length > 0) {
// Cache hit - fetch full post data
const posts = await Promise.all(
cached.map(id => getPost(id))
);
return posts;
}
// Cache miss - fetch from database
const posts = await prisma.post.findMany({
where: {
authorId: userId
},
orderBy: { createdAt: 'desc' },
take: 100 // Cache first 100 posts
});
// Store in sorted set with timestamps as scores
const pipeline = redis.pipeline();
posts.forEach(post => {
pipeline.zadd(
cacheKey,
post.createdAt.getTime(),
post.id
);
});
pipeline.expire(cacheKey, 3600);
await pipeline.exec();
return posts.slice(start, end + 1);
}
Caching with Tags for Invalidation
class CacheManager {
async set(key: string, value: any, ttl: number, tags: string[] = []): Promise<void> {
const pipeline = redis.pipeline();
// Store value
pipeline.setex(key, ttl, JSON.stringify(value));
// Store tags
tags.forEach(tag => {
pipeline.sadd(`tag:${tag}`, key);
pipeline.expire(`tag:${tag}`, ttl);
});
await pipeline.exec();
}
async invalidateTag(tag: string): Promise<void> {
// Get all keys with this tag
const keys = await redis.smembers(`tag:${tag}`);
if (keys.length === 0) return;
// Delete all keys
const pipeline = redis.pipeline();
keys.forEach(key => pipeline.del(key));
pipeline.del(`tag:${tag}`);
await pipeline.exec();
console.log(`Invalidated ${keys.length} cache entries for tag: ${tag}`);
}
}
// Usage
const cacheManager = new CacheManager();
// Cache product with tags
await cacheManager.set(
'product:123',
product,
3600,
['products', 'category:electronics', 'brand:apple']
);
// Invalidate all products in a category
await cacheManager.invalidateTag('category:electronics');
Distributed Cache Lock Pattern
Prevent cache stampede - multiple requests fetching same data simultaneously:
class DistributedCache {
async getOrSet<T>(
key: string,
ttl: number,
fetchFunction: () => Promise<T>
): Promise<T> {
// Try cache first
const cached = await redis.get(key);
if (cached) {
return JSON.parse(cached);
}
const lockKey = `lock:${key}`;
const lockTTL = 10; // 10 seconds
// Try to acquire lock
const acquired = await redis.set(
lockKey,
'1',
'EX',
lockTTL,
'NX' // Only set if not exists
);
if (acquired) {
try {
// We have the lock - fetch data
const data = await fetchFunction();
// Store in cache
await redis.setex(key, ttl, JSON.stringify(data));
return data;
} finally {
// Release lock
await redis.del(lockKey);
}
} else {
// Another process is fetching - wait and retry
await new Promise(resolve => setTimeout(resolve, 100));
return this.getOrSet(key, ttl, fetchFunction);
}
}
}
// Usage
const cache = new DistributedCache();
app.get('/api/expensive-data', async (req, res) => {
const data = await cache.getOrSet(
'expensive-data',
3600,
async () => {
// This expensive operation only runs once
// even with 1000 concurrent requests
return await performExpensiveOperation();
}
);
res.json(data);
});
HTTP Caching
HTTP caching headers enable browser and CDN caching without application code changes.
Cache-Control Headers
// Static assets - cache for 1 year
app.use('/static', express.static('public', {
maxAge: '365d',
immutable: true
}));
// API responses - private, must revalidate
app.get('/api/user/profile', async (req, res) => {
const profile = await getUserProfile(req.user.id);
res.set({
'Cache-Control': 'private, max-age=300, must-revalidate',
'Vary': 'Authorization'
});
res.json(profile);
});
// Public content - cache at CDN
app.get('/api/posts/:id', async (req, res) => {
const post = await getPost(req.params.id);
res.set({
'Cache-Control': 'public, max-age=3600, s-maxage=86400',
'Vary': 'Accept-Encoding'
});
res.json(post);
});
Cache-Control directives:
- public: Can be cached by CDN and browsers
- private: Only browser cache (not shared caches)
- max-age=N: Cache for N seconds
- s-maxage=N: CDN cache duration (overrides max-age)
- must-revalidate: Must check with server after expiration
- no-cache: Must revalidate before using cached response
- no-store: Never cache (sensitive data)
- immutable: Content never changes (versioned assets)
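These directives can be centralized so all handlers stay consistent. A minimal sketch (the content-class names here are illustrative choices, not part of any standard):

```typescript
// Illustrative mapping from content class to Cache-Control value, using the
// directives above. The class names are ours, not part of the HTTP spec.
type ContentClass = 'immutable-asset' | 'private-api' | 'public-api' | 'sensitive';

const CACHE_POLICIES: Record<ContentClass, string> = {
  // Versioned static files never change at a given URL
  'immutable-asset': 'public, max-age=31536000, immutable',
  // Per-user data: browser cache only, revalidate after expiry
  'private-api': 'private, max-age=300, must-revalidate',
  // Shareable data: browsers for 1 hour, CDN for 24 hours
  'public-api': 'public, max-age=3600, s-maxage=86400',
  // Tokens, payment data: never cache anywhere
  'sensitive': 'no-store',
};

function cacheControlFor(kind: ContentClass): string {
  return CACHE_POLICIES[kind];
}
```

A handler would then call `res.set('Cache-Control', cacheControlFor('public-api'))` instead of repeating directive strings.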
ETags for Conditional Requests
import crypto from 'crypto';
function generateETag(data: any): string {
return crypto
.createHash('md5')
.update(JSON.stringify(data))
.digest('hex');
}
app.get('/api/posts/:id', async (req, res) => {
const post = await getPost(req.params.id);
if (!post) {
return res.status(404).json({ error: 'Not found' });
}
const etag = generateETag(post);
// Check If-None-Match header
if (req.headers['if-none-match'] === etag) {
// Content hasn't changed
return res.status(304).end();
}
res.set({
'ETag': etag,
'Cache-Control': 'public, max-age=60'
});
res.json(post);
});
ETags can dramatically reduce bandwidth when content hasn't changed, since the server returns only a 304 status with no response body.
Last-Modified Headers
app.get('/api/posts/:id', async (req, res) => {
const post = await getPost(req.params.id);
if (!post) {
return res.status(404).json({ error: 'Not found' });
}
const lastModified = post.updatedAt.toUTCString();
// Check If-Modified-Since header
if (req.headers['if-modified-since'] === lastModified) {
return res.status(304).end();
}
res.set({
'Last-Modified': lastModified,
'Cache-Control': 'public, max-age=3600'
});
res.json(post);
});
CDN Edge Caching
CDNs cache content at edge locations globally, reducing latency by serving from the nearest location.
Cloudflare Workers Cache API
// Cloudflare Worker
export default {
async fetch(request: Request): Promise<Response> {
const cache = caches.default;
const cacheKey = new Request(request.url, request);
// Check cache
let response = await cache.match(cacheKey);
if (response) {
console.log('Cache hit at edge');
return response;
}
// Cache miss - fetch from origin
response = await fetch(request);
// Don't cache errors
if (response.ok) {
// Clone response before caching
const responseToCache = response.clone();
// Cache for 1 hour
const headers = new Headers(responseToCache.headers);
headers.set('Cache-Control', 'public, max-age=3600');
const cachedResponse = new Response(
responseToCache.body,
{
status: responseToCache.status,
statusText: responseToCache.statusText,
headers
}
);
await cache.put(cacheKey, cachedResponse);
}
return response;
}
};
Stale-While-Revalidate
Serve stale content immediately while fetching fresh data in background:
app.get('/api/news', async (req, res) => {
const news = await getLatestNews();
res.set({
// Cache for 1 minute, serve stale for 1 hour while revalidating
'Cache-Control': 'public, max-age=60, stale-while-revalidate=3600'
});
res.json(news);
});
Browsers and CDNs serve stale content instantly while fetching updates, providing instant loads with eventual consistency.
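The same decision logic applies if you implement stale-while-revalidate at the application level. A sketch of the three cache states, with illustrative names (`CacheState`, `swrState` are ours):

```typescript
// States a cached entry passes through under stale-while-revalidate,
// based on its age relative to the two directives.
type CacheState = 'fresh' | 'stale-revalidate' | 'expired';

function swrState(
  ageSeconds: number,           // how long ago the response was cached
  maxAge: number,               // max-age directive value
  staleWhileRevalidate: number  // stale-while-revalidate directive value
): CacheState {
  if (ageSeconds <= maxAge) {
    return 'fresh'; // serve from cache, no network activity
  }
  if (ageSeconds <= maxAge + staleWhileRevalidate) {
    return 'stale-revalidate'; // serve stale now, refresh in background
  }
  return 'expired'; // must fetch synchronously before responding
}
```

With the header from the example above (`max-age=60, stale-while-revalidate=3600`), a 2-minute-old entry is served instantly while a background refresh runs; only after ~61 minutes does a request block on the origin.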
Cache Invalidation Strategies
"There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton
Time-Based Expiration (TTL)
Simplest strategy - cache expires after fixed duration:
// Short TTL for frequently changing data
await redis.setex('trending:posts', 60, data); // 1 minute
// Medium TTL for semi-static data
await redis.setex('user:profile', 3600, data); // 1 hour
// Long TTL for rarely changing data
await redis.setex('config:settings', 86400, data); // 24 hours
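One refinement worth considering (our suggestion, not part of the snippets above) is adding random jitter to TTLs, so that keys written at the same moment don't all expire together and hit the database in a burst:

```typescript
// Add up to jitterRatio of extra lifetime so simultaneous writes
// don't produce simultaneous expirations.
function jitteredTTL(baseTTL: number, jitterRatio = 0.1): number {
  // Result is in [baseTTL, baseTTL * (1 + jitterRatio)]
  const jitter = Math.floor(Math.random() * baseTTL * jitterRatio);
  return baseTTL + jitter;
}

// e.g. await redis.setex('user:profile', jitteredTTL(3600), data);
```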
Event-Based Invalidation
Invalidate cache when data changes:
class PostService {
async update(postId: string, data: UpdatePostDto): Promise<Post> {
const post = await prisma.post.update({
where: { id: postId },
data
});
// Invalidate related caches
await this.invalidatePostCaches(postId, post.authorId);
return post;
}
private async invalidatePostCaches(postId: string, authorId: string): Promise<void> {
// Look up tags before deleting, so findById doesn't repopulate the cache
// mid-invalidation (and handle the post no longer existing)
const post = await this.findById(postId);
const pipeline = redis.pipeline();
// Invalidate specific post
pipeline.del(`post:${postId}`);
// Invalidate author's posts list
pipeline.del(`posts:author:${authorId}`);
// Invalidate trending posts in case it was trending
pipeline.del('trending:posts');
// Invalidate tag-based caches
post?.tags.forEach(tag => {
pipeline.del(`posts:tag:${tag}`);
});
await pipeline.exec();
}
}
Cache Versioning
Version cache keys to invalidate all related caches:
class VersionedCache {
private async getVersion(namespace: string): Promise<number> {
const version = await redis.get(`version:${namespace}`);
return version ? parseInt(version) : 1;
}
async set(namespace: string, key: string, value: any, ttl: number): Promise<void> {
const version = await this.getVersion(namespace);
const versionedKey = `${namespace}:v${version}:${key}`;
await redis.setex(versionedKey, ttl, JSON.stringify(value));
}
async get(namespace: string, key: string): Promise<any> {
const version = await this.getVersion(namespace);
const versionedKey = `${namespace}:v${version}:${key}`;
const data = await redis.get(versionedKey);
return data ? JSON.parse(data) : null;
}
async invalidateNamespace(namespace: string): Promise<void> {
// Increment version - all old keys become inaccessible
await redis.incr(`version:${namespace}`);
}
}
// Usage
const cache = new VersionedCache();
await cache.set('products', '123', product, 3600);
// Later - invalidate all product caches
await cache.invalidateNamespace('products');
Pub/Sub for Distributed Invalidation
Notify all application servers when cache is invalidated:
// Publisher (when data changes)
async function invalidateProductCache(productId: string): Promise<void> {
await redis.publish('cache:invalidate', JSON.stringify({
type: 'product',
id: productId
}));
}
// Subscriber (on each application server)
const subscriber = new Redis();
subscriber.subscribe('cache:invalidate');
subscriber.on('message', async (channel, message) => {
const event = JSON.parse(message);
if (event.type === 'product') {
// Remove from local cache
await redis.del(`product:${event.id}`);
console.log(`Invalidated product cache: ${event.id}`);
}
});
Performance Optimization Patterns
Lazy Loading with Caching
class LazyCache<T> {
private cache = new Map<string, Promise<T>>();
async get(key: string, loader: () => Promise<T>): Promise<T> {
if (this.cache.has(key)) {
return this.cache.get(key)!;
}
const promise = loader();
this.cache.set(key, promise);
try {
const value = await promise;
return value;
} catch (error) {
// Remove failed promise from cache
this.cache.delete(key);
throw error;
}
}
clear(key?: string): void {
if (key) {
this.cache.delete(key);
} else {
this.cache.clear();
}
}
}
// Usage
const configCache = new LazyCache<Config>();
async function getConfig(): Promise<Config> {
return configCache.get('config', async () => {
console.log('Loading config from database...');
return await prisma.config.findFirst();
});
}
Multi-Level Caching
Combine in-memory and Redis caching for optimal performance:
import NodeCache from 'node-cache';
class MultiLevelCache {
private memoryCache = new NodeCache({ stdTTL: 60 });
async get<T>(key: string, loader: () => Promise<T>, ttl: number = 3600): Promise<T> {
// L1: Check memory cache (fastest)
const memCached = this.memoryCache.get<T>(key);
if (memCached !== undefined) {
console.log(`L1 cache hit: ${key}`);
return memCached;
}
// L2: Check Redis (fast)
const redisCached = await redis.get(key);
if (redisCached) {
console.log(`L2 cache hit: ${key}`);
const data = JSON.parse(redisCached);
// Populate L1 cache
this.memoryCache.set(key, data, 60);
return data;
}
// L3: Load from source (slow)
console.log(`Cache miss: ${key}`);
const data = await loader();
// Populate both caches
this.memoryCache.set(key, data, 60);
await redis.setex(key, ttl, JSON.stringify(data));
return data;
}
async invalidate(key: string): Promise<void> {
this.memoryCache.del(key);
await redis.del(key);
}
}
Memory cache provides sub-millisecond access for hot data, Redis handles shared state across servers, reducing database load by 95%+.
Probabilistic Early Expiration
Prevent thundering herd by randomly expiring cache entries before TTL:
async function get<T>(
key: string,
ttl: number,
loader: () => Promise<T>
): Promise<T> {
const cached = await redis.get(key);
if (cached) {
const data = JSON.parse(cached);
// Get remaining TTL
const remainingTTL = await redis.ttl(key);
// Probabilistically refresh before expiration:
// the closer remainingTTL is to 0, the more likely a refresh
const beta = 1.0;
const shouldRefresh = Math.random() * ttl * beta > remainingTTL;
if (shouldRefresh) {
// Asynchronously refresh cache; on failure, stale data is still served
loader()
.then(freshData => redis.setex(key, ttl, JSON.stringify(freshData)))
.catch(err => console.error(`Cache refresh failed: ${key}`, err));
}
return data;
}
// Cache miss - fetch and store
const data = await loader();
await redis.setex(key, ttl, JSON.stringify(data));
return data;
}
This prevents multiple processes from simultaneously refreshing expired cache, distributing load over time.
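The refresh decision can be factored into a pure function. This sketch uses a simplified linear rule (the full "XFetch" algorithm from the probabilistic early expiration literature uses an exponential term); `rand` is injected so the behavior is deterministic in tests:

```typescript
// Decide whether to refresh a cache entry before it expires.
// Probability of refreshing rises linearly as remainingTTL approaches 0.
function shouldRefreshEarly(
  remainingTTL: number,             // seconds until the key expires
  ttl: number,                      // full TTL the key was written with
  beta = 1.0,                       // higher beta = more eager refreshing
  rand: () => number = Math.random
): boolean {
  return rand() * ttl * beta > remainingTTL;
}
```

A freshly written key (remainingTTL near ttl) is almost never refreshed early, while a key about to expire almost always triggers a background refresh.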
Real-World Examples
Reddit's Caching Architecture
Reddit uses multi-level caching:
- Memcached: Primary cache for hot data (votes, comments)
- Cassandra: Secondary cache for less frequently accessed data
- CDN: Edge caching for static assets and public pages
Their caching strategy reduced database queries by 99%, handling 50M+ daily active users with a fraction of the database capacity that would otherwise be required.
Instagram's Feed Caching
Instagram caches user feeds in Redis with sliding window:
- Store last 1000 posts in sorted set (by timestamp)
- Materialize feeds asynchronously when new posts are published
- Serve first page from cache (instant load)
- Fetch older posts on scroll (lazy loading)
This approach reduced feed load time from 800ms to 50ms while serving roughly 2 billion monthly active users.
Shopify's Fragment Caching
Shopify caches page fragments separately with different TTLs:
- Product images: 1 year (immutable)
- Product price/inventory: 5 minutes (changes frequently)
- Product reviews: 1 hour (changes occasionally)
- Page layout: 24 hours (rarely changes)
Fragment caching enabled 97% cache hit rate, serving Black Friday traffic (10M+ concurrent users) without database scaling.
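Fragment caching can be sketched with a small in-memory store (Shopify's actual implementation isn't public, so the names and structure here are purely illustrative):

```typescript
// Minimal fragment cache with a per-fragment TTL, as in the list above.
// `now` is a parameter so expiry is deterministic in tests.
interface Fragment {
  html: string;
  expiresAt: number; // epoch milliseconds
}

class FragmentCache {
  private store = new Map<string, Fragment>();

  set(key: string, html: string, ttlSeconds: number, now = Date.now()): void {
    this.store.set(key, { html, expiresAt: now + ttlSeconds * 1000 });
  }

  get(key: string, now = Date.now()): string | null {
    const frag = this.store.get(key);
    if (!frag || frag.expiresAt <= now) return null; // missing or expired
    return frag.html;
  }
}

// Different TTLs per fragment class, mirroring the list above
const fragments = new FragmentCache();
fragments.set('product:1:image', '<img src="/img/p1.jpg">', 31536000); // ~1 year
fragments.set('product:1:price', '<span>$99</span>', 300);             // 5 minutes
fragments.set('product:1:reviews', '<ul>...</ul>', 3600);              // 1 hour
```

A page renderer then assembles the response from fragments, re-rendering only those that have expired instead of the whole page.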
Conclusion
Effective caching reduces response times by 90%+ while decreasing database load, enabling applications to scale horizontally with minimal cost increase. Implement multi-level caching with Redis for application data, HTTP headers for browser/CDN caching, and proper invalidation strategies to maintain data consistency.
Key patterns - cache-aside for simplicity, write-through for consistency, distributed locks for stampede prevention, and tag-based invalidation for complex dependencies - provide building blocks for any caching architecture. Monitor cache hit rates (target 90%+) and adjust TTLs based on data change frequency to maximize performance gains while maintaining freshness.
Written by StaticBlock Editorial
StaticBlock Editorial is a technical writer and software engineer specializing in web development, performance optimization, and developer tooling.