Caching
How it works
- Check cache for key
- On miss, fetch from origin
- Populate cache with TTL
- Serve and monitor hit ratio
🎯 What is Caching?
Caching is like keeping a stash of frequently used ingredients in a restaurant kitchen: you can serve customers faster without going back to the store for every order. In computing, caching stores copies of files or data in fast, temporary storage so that later requests can be served quickly.
👤 Client Request
⬇️
🔍 Check Cache
⬇️
✅ Cache Hit
⬇️
📬 Serve from Cache
Overview
- Speeds up reads by storing computed or fetched data closer to the requester.
- Multiple tiers: browser, CDN, reverse proxy, app cache, DB cache.
- Improves latency and reduces origin load when hit ratio is healthy.
When to use
- Read-heavy workloads with temporal/spatial locality.
- Expensive computations or slow upstreams where slight staleness is acceptable.
- Content with natural versioning (e.g., assets) or query results reused across users.
Trade-offs
- Staleness vs. freshness: requires a deliberate TTL and revalidation policy.
- Extra complexity: invalidation, key design, and observability.
Patterns
- Cache-aside for apps managing their own misses.
- Read-through for transparent loads; write-through for strong consistency.
- Stale-while-revalidate and request coalescing to avoid stampedes.
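Request coalescing can be sketched as a "single-flight" wrapper: concurrent loads of the same key share one origin call. This is a minimal illustration (error propagation to waiting callers is omitted for brevity):

```python
import threading

class SingleFlight:
    """Coalesce concurrent loads of one key into a single origin call."""
    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> Event that will carry the result

    def do(self, key, loader):
        with self._lock:
            ev = self._inflight.get(key)
            leader = ev is None
            if leader:                 # first caller becomes the leader
                ev = threading.Event()
                self._inflight[key] = ev
        if not leader:
            ev.wait()                  # followers wait for the leader's result
            return ev.result
        try:
            ev.result = loader()       # the only call to the origin
        finally:
            with self._lock:
                self._inflight.pop(key, None)
            ev.set()
        return ev.result
```

Combined with jittered TTLs and stale-while-revalidate, this keeps a popular key's expiry from turning into a stampede.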
Anti-patterns
- Caching everything by default without TTL strategy.
- Keys without namespacing/versioning, making broad invalidation impossible.
- Large blob values that negate cache benefits due to transfer cost.
📚 Cache Levels
- Browser Cache: Client-side caching
- CDN: Geographic distribution of static content
- Reverse Proxy: Server-side caching (Nginx, Varnish)
- Application Cache: In-memory cache (Redis, Memcached)
- Database Cache: Query result caching
🛠️ Caching Strategies
🗂️ Cache-Aside (Lazy Loading)
Application manages cache
- Load data on cache miss
- Good for read-heavy workloads
- Example: Product catalogs
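A product-catalog sketch of cache-aside with a namespaced, versioned key. `FakeRedis` is a hypothetical in-memory stand-in mimicking a Redis client's `get`/`setex` calls, and `load_product` stands in for the database query:

```python
import time

class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (get/setex subset)."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        hit = self._data.get(key)
        if hit is not None and hit[1] > time.monotonic():
            return hit[0]
        return None
    def setex(self, key, ttl, value):
        self._data[key] = (value, time.monotonic() + ttl)

cache = FakeRedis()  # swap in a real client for production use

def load_product(product_id):
    # Stand-in for the catalog database query.
    return {"id": product_id, "name": "widget"}

def get_product(product_id, ttl=300):
    key = f"catalog:v1:product:{product_id}"   # namespaced + versioned key
    value = cache.get(key)
    if value is None:                          # miss: the app loads and populates
        value = load_product(product_id)
        cache.setex(key, ttl, value)
    return value
```

The `v1` segment in the key lets you invalidate the whole catalog namespace later by bumping the version.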
📝 Write-Through
Write to cache and database simultaneously
- Ensures consistency
- Higher write latency
- Example: User profiles
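A write-through sketch for user profiles; the `db` and `cache` dicts are stand-ins for the real tiers:

```python
db = {}     # stand-in for the system of record
cache = {}  # stand-in for the cache tier

def save_profile(user_id, profile):
    """Write-through: cache and database are updated before we acknowledge."""
    db[user_id] = profile      # durable write to the database...
    cache[user_id] = profile   # ...and the cache in the same operation

def get_profile(user_id):
    # Reads can trust the cache: it is never behind the database.
    return cache.get(user_id, db.get(user_id))
```

The cost is that every write pays both latencies before returning, which is the "higher write latency" trade-off above.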
📥 Write-Back (Write-Behind)
Write to cache immediately, then to database asynchronously
- Risk of data loss on failure
- Higher performance for writes
- Example: Logging, analytics
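A write-back sketch: the write is acknowledged after the cache update, and a buffer is drained to the database later. The queue makes the data-loss risk concrete; anything still buffered when the process dies is gone:

```python
import queue

db = {}     # stand-in for the system of record
cache = {}  # stand-in for the cache tier
_pending = queue.Queue()  # buffered writes; lost if the process crashes

def record_event(key, value):
    """Write-back: acknowledge after the cache write; persist asynchronously."""
    cache[key] = value
    _pending.put((key, value))

def flush():
    """Drain buffered writes to the database (run by a background worker)."""
    while not _pending.empty():
        k, v = _pending.get()
        db[k] = v
```

Real systems back `_pending` with a durable queue so a crash does not drop acknowledged writes.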
📖 Read-Through
Cache loads data automatically on miss
- Transparent to application
- Example: Configuration settings
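A read-through sketch: the cache owns the loader, so callers only ever call `get()`. The `settings` dict is a hypothetical configuration store:

```python
import time

class ReadThroughCache:
    """The cache itself loads from the origin on a miss; it is transparent to callers."""
    def __init__(self, loader, ttl=60):
        self._loader = loader   # origin fetch, supplied once at construction
        self._ttl = ttl
        self._store = {}        # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]
        value = self._loader(key)  # the cache, not the app, fetches the data
        self._store[key] = (value, time.monotonic() + self._ttl)
        return value

# Hypothetical settings store backing a configuration cache.
settings = {"timeout_ms": 500, "retries": 3}
config = ReadThroughCache(loader=lambda k: settings[k], ttl=300)
```

Contrast with cache-aside: here the miss-handling logic lives in the cache layer, not in every caller.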
📊 Cache Patterns
🌡️ Cache Warming
Pre-load cache with expected data
🚫 Cache Invalidation
Remove stale data from cache
⏳ TTL (Time To Live)
Automatic expiration of cache entries
🚧 Common Issues
- Cache Stampede: many concurrent requests miss on the same expired key and all hit the origin at once
- Hot Keys: a few keys receive a disproportionate share of traffic, overloading their cache node or shard
- Cache Penetration: queries for non-existent data always miss and fall through to the database
🛠️ Solutions and Best Practices
🔄 Cache-Aside Pattern
Application is responsible for loading data into the cache
📊 Monitoring and Alerts
Track cache hit/miss rates, evictions, and latency
🔧 Regular Cache Maintenance
Invalidate stale data, adjust TTLs, and optimize size
🧩 Real-World Scenarios
- CDN image cache: Long TTL with versioned file names (hash in path) to make purges safe and fast.
- API response cache: Product details cached for 60–300s to cut DB load; key by path+query and user scope.
- Session/profile writes: Write-through for strong consistency; write-back for analytics or logs where loss is tolerable.
- Search autosuggest: Read-through with short TTL and background refresh.
- Microservice aggregator: Per-endpoint cache to shield upstream fan-out.
⚠️ Pitfalls and Anti-patterns
- Thundering herd/stampede on popular key expiration.
- Poor key design: missing namespace, collisions, or no version for schema changes.
- Oversized values increase network and serialization cost; prefer smaller granular keys.
- Inconsistent TTLs across related keys serve mismatched freshness; use coherent policies and add jitter to avoid synchronized expiry.
- Eviction mismatch (e.g., LFU vs LRU) with traffic shape; monitor and adjust.
- No negative caching: repeated misses for non-existent entries hammer the DB.
📐 Quick Diagrams
# Cache-aside (lazy)
App ──get(k)──▶ Cache ──hit──▶ value served
App ──miss: query──▶ DB ──value──▶ App
App ──set(k,v,ttl)──▶ Cache
# Multi-layer (Edge + App + DB)
Client ▶ CDN ▶ App ▶ Redis ▶ DB
# Sharded Redis with consistent hashing
key ──hash(key)──▶ point on ring ──clockwise──▶ owning node
(plain hash(key) % N is modulo sharding: changing N remaps most keys)
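A hash-ring sketch of consistent hashing (node names are illustrative): unlike modulo sharding, adding or removing a node remaps only the keys near it on the ring.

```python
import bisect
import hashlib

class HashRing:
    """Consistent hashing: nodes and keys hash onto a ring; a key is owned by
    the first node clockwise, so resizing remaps only a fraction of keys."""
    def __init__(self, nodes, vnodes=100):
        ring = []
        for node in nodes:
            for i in range(vnodes):  # virtual nodes smooth the distribution
                ring.append((self._point(f"{node}#{i}"), node))
        ring.sort()
        self._points = [p for p, _ in ring]
        self._nodes = [n for _, n in ring]

    @staticmethod
    def _point(s):
        # Any stable hash works; md5 is used here for its even spread.
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        i = bisect.bisect(self._points, self._point(key)) % len(self._points)
        return self._nodes[i]
```

Going from 3 to 4 nodes should move roughly a quarter of the keys, not nearly all of them as `% N` would.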
🧪 Operations Checklist
- Track hit ratio, evictions, item size distribution, tail latency.
- Size memory with headroom; set maxmemory-policy explicitly.
- Use pipelining/batching for chatty patterns; compress large values.
- Add jitter to TTLs; warm critical keys on deploys/cold starts.
- Version cache keys to invalidate safely during schema changes.
- For write-back, ensure durable queues and recovery on crash.
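Two of the checklist items can be sketched in a few lines; the `app` prefix and `SCHEMA_VERSION` constant are illustrative names:

```python
import random

def jittered_ttl(base=300, jitter=0.10):
    """Spread expirations so keys written together don't all expire together."""
    return int(base * (1 + random.uniform(-jitter, jitter)))

SCHEMA_VERSION = 3  # bump to invalidate every key in this namespace at once

def cache_key(entity, entity_id):
    return f"app:v{SCHEMA_VERSION}:{entity}:{entity_id}"
```

Bumping the version abandons old entries rather than deleting them; they simply age out under their TTLs, which is why broad invalidation by prefix is safe and cheap.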
❓ Interview Q&A (concise)
- Q: When not to cache? A: Highly dynamic, user-specific, or sensitive data; when recompute is cheap and correctness trumps speed.
- Q: Prevent cache stampede? A: Request coalescing/single-flight, mutex per key, early refresh, jittered TTL, and stale-while-revalidate.
- Q: Cache consistency options? A: Write-through, write-around, write-back; plus invalidation on writes and short TTLs.
- Q: Negative caching? A: Cache “not found” briefly to avoid repeated DB hits for missing data.
- Q: Redis vs Memcached? A: Redis = richer data types, persistence, clustering; Memcached = simple, fast KV in-memory.
- Q: How to invalidate broadly? A: Versioned prefixes/namespaces rather than scanning keys.