Caching

Caching Overview
How it works
  1. Check cache for key
  2. On miss, fetch from origin
  3. Populate cache with TTL
  4. Serve and monitor hit ratio

🎯 What is Caching?

Caching is like having a secret stash of frequently used ingredients in a restaurant - it allows you to serve your customers faster without always going back to the store. In computing, caching stores copies of files or data in temporary storage locations for quick access.
👤 Client Request
⬇️
🔍 Check Cache
⬇️
✅ Cache Hit ▶ 📬 Serve from Cache
❌ Cache Miss ▶ 🌐 Fetch from Origin, cache it, then serve

Overview

  • Speeds up reads by storing computed or fetched data closer to the requester.
  • Multiple tiers: browser, CDN, reverse proxy, app cache, DB cache.
  • Improves latency and reduces origin load when hit ratio is healthy.

When to use

  • Read-heavy workloads with temporal/spatial locality.
  • Expensive computations or slow upstreams where slight staleness is acceptable.
  • Content with natural versioning (e.g., assets) or query results reused across users.

Trade-offs

  • Staleness vs. freshness: TTLs and revalidation rules need explicit governance.
  • Extra complexity: invalidation, key design, and observability.

Patterns

  • Cache-aside for apps managing their own misses.
  • Read-through for transparent loads; write-through for strong consistency.
  • Stale-while-revalidate and request coalescing to avoid stampedes.

Anti-patterns

  • Caching everything by default without TTL strategy.
  • Keys without namespacing/versioning, making broad invalidation impossible.
  • Large blob values that negate cache benefits due to transfer cost.

📚 Cache Levels

  1. Browser Cache: Client-side caching
  2. CDN: Geographic distribution of static content
  3. Reverse Proxy: Server-side caching (Nginx, Varnish)
  4. Application Cache: In-memory cache (Redis, Memcached)
  5. Database Cache: Query result caching

🛠️ Caching Strategies

🗂️ Cache-Aside (Lazy Loading)

Application manages cache

  • Load data on cache miss
  • Good for read-heavy workloads
  • Example: Product catalogs
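A minimal cache-aside sketch in Python (names like `db_fetch` are hypothetical; a plain dict stands in for Redis or Memcached):

```python
import time

# Hypothetical in-memory store standing in for Redis/Memcached.
cache = {}  # key -> (value, expires_at)

def db_fetch(product_id):
    """Stand-in for a slow origin/database read."""
    return {"id": product_id, "name": f"product-{product_id}"}

def get_product(product_id, ttl=60):
    """Cache-aside: the application checks the cache, and on a
    miss loads from the origin and populates the cache itself."""
    key = f"product:{product_id}"
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                       # cache hit
    value = db_fetch(product_id)              # cache miss: go to origin
    cache[key] = (value, time.time() + ttl)   # populate with TTL
    return value
```

The cache stays passive here; all miss handling lives in application code, which is why cache-aside is the most common starting point.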

📝 Write-Through

Write to cache and database simultaneously

  • Ensures consistency
  • Higher write latency
  • Example: User profiles
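A write-through sketch, assuming a dict-backed cache and database for illustration; the key property is that both stores are updated in the same request path:

```python
# Write-through: every write goes to the durable store and the cache
# together, so cached reads are never stale relative to the database.
cache = {}
database = {}

def write_through(key, value):
    database[key] = value   # durable write first (ideally transactional)
    cache[key] = value      # cache updated before the request returns

def read(key):
    return cache.get(key, database.get(key))
```

The write pays double latency, which is the price of consistency.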

📥 Write-Back (Write-Behind)

Write to cache immediately, then to database asynchronously

  • Risk of data loss on failure
  • Higher performance for writes
  • Example: Logging, analytics
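A write-back sketch using an in-process queue and background thread (both hypothetical stand-ins; production systems use durable queues precisely because this queue's contents vanish on crash):

```python
import queue
import threading

# Write-back: writes land in the cache immediately and are flushed
# to the database asynchronously by a worker.
cache = {}
database = {}
pending = queue.Queue()

def write_back(key, value):
    cache[key] = value          # fast path: cache only
    pending.put((key, value))   # deferred durable write

def flush_worker():
    while True:
        key, value = pending.get()
        if key is None:         # sentinel to stop the worker
            break
        database[key] = value
        pending.task_done()
```

The gap between `write_back` returning and the worker writing is exactly the data-loss window the section warns about.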

📖 Read-Through

Cache loads data automatically on miss

  • Transparent to application
  • Example: Configuration settings
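A read-through sketch: unlike cache-aside, the loader lives inside the cache wrapper, so callers never talk to the origin directly (class and loader names are illustrative):

```python
# Read-through: the cache itself loads data on a miss, making the
# origin transparent to the application.
class ReadThroughCache:
    def __init__(self, loader):
        self._loader = loader   # e.g. a settings or DB lookup function
        self._store = {}

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._loader(key)  # transparent load
        return self._store[key]
```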

📊 Cache Patterns

🌡️ Cache Warming

Pre-load cache with expected data

🚫 Cache Invalidation

Remove stale data from cache

⏳ TTL (Time To Live)

Automatic expiration of cache entries
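TTLs are usually combined with jitter so entries written together do not expire together; a small sketch (the ±10% spread is an arbitrary choice):

```python
import random

def ttl_with_jitter(base_ttl, spread=0.1):
    """Base TTL plus up to +/-10% random jitter, so keys populated
    at the same moment do not all expire at the same instant."""
    return base_ttl * (1 + random.uniform(-spread, spread))
```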

🚧 Common Issues

  1. Cache Stampede: Multiple requests for the same expired key
  2. Hot Keys: Some keys accessed much more frequently
  3. Cache Penetration: Queries for non-existent data
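The stampede problem above can be mitigated with per-key single-flight: one thread recomputes an expired key while the rest wait. A thread-based sketch (lock bookkeeping simplified; real code would also evict idle locks):

```python
import threading

cache = {}
_locks = {}
_locks_guard = threading.Lock()

def _lock_for(key):
    """One lock per key, created lazily."""
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())

def get_or_load(key, loader):
    if key in cache:
        return cache[key]
    with _lock_for(key):        # only one loader runs per key
        if key in cache:        # re-check: another thread may have won
            return cache[key]
        cache[key] = loader()
        return cache[key]
```

The double-check under the lock is what collapses N concurrent misses into a single origin call.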

🛠️ Solutions and Best Practices

🔄 Cache Aside Pattern

Application is responsible for loading data into the cache

📊 Monitoring and Alerts

Track cache hit/miss rates, evictions, and latency

🔧 Regular Cache Maintenance

Invalidate stale data, adjust TTLs, and optimize size

🧩 Real-World Scenarios

  • CDN image cache: Long TTL with versioned file names (hash in path) to make purges safe and fast.
  • API response cache: Product details cached for 60–300s to cut DB load; key by path+query and user scope.
  • Session/profile writes: Write-through for strong consistency; write-back for analytics or logs where loss is tolerable.
  • Search autosuggest: Read-through with short TTL and background refresh.
  • Microservice aggregator: Per-endpoint cache to shield upstream fan-out.

⚠️ Pitfalls and Anti-patterns

  • Thundering herd/stampede on popular key expiration.
  • Poor key design: missing namespace, collisions, or no version for schema changes.
  • Oversized values increase network and serialization cost; prefer smaller granular keys.
  • Inconsistent TTLs cause stale blends of data; use coherent policies and jitter to avoid synchronized expiry.
  • Eviction mismatch (e.g., LFU vs LRU) with traffic shape; monitor and adjust.
  • No negative caching: repeated misses for non-existent entries hammer the DB.

📐 Quick Diagrams


      # Cache-aside (lazy)
      App ──get(k)──▶ Cache ─miss──▶ DB
           ◀─hit────            ◀── value
           └─ set(k,v,ttl) ─────▶
      

      # Multi-layer (Edge + App + DB)
      Client ▶ CDN ▶ App ▶ Redis ▶ DB
      

      # Sharded Redis (consistent hashing)
      place nodes on a hash ring; node(key) = first node clockwise of hash(key)
      (naive `hash(key) % N` remaps most keys whenever N changes)
      

🧪 Operations Checklist

  • Track hit ratio, evictions, item size distribution, tail latency.
  • Size memory with headroom; set maxmemory-policy explicitly.
  • Use pipelining/batching for chatty patterns; compress large values.
  • Add jitter to TTLs; warm critical keys on deploys/cold starts.
  • Version cache keys to invalidate safely during schema changes.
  • For write-back, ensure durable queues and recovery on crash.

❓ Interview Q&A (concise)

  • Q: When not to cache? A: Highly dynamic, user-specific, or sensitive data; when recompute is cheap and correctness trumps speed.
  • Q: Prevent cache stampede? A: Request coalescing/single-flight, mutex per key, early refresh, jittered TTL, and stale-while-revalidate.
  • Q: Cache consistency options? A: Write-through, write-around, write-back; plus invalidation on writes and short TTLs.
  • Q: Negative caching? A: Cache “not found” briefly to avoid repeated DB hits for missing data.
  • Q: Redis vs Memcached? A: Redis = richer data types, persistence, clustering; Memcached = simple, fast KV in-memory.
  • Q: How to invalidate broadly? A: Versioned prefixes/namespaces rather than scanning keys.
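The last answer can be sketched in a few lines: bumping a namespace version makes every old key unreachable at once, with no key scanning or mass deletes (stale keys then age out via TTL/eviction):

```python
# Versioned-prefix invalidation: keys embed a namespace version,
# so incrementing the version "invalidates" the whole namespace.
cache = {}
versions = {"products": 1}

def make_key(ns, key):
    return f"{ns}:v{versions[ns]}:{key}"

def invalidate_namespace(ns):
    versions[ns] += 1   # old keys are simply never read again
```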