Social News Feed
Designing a Twitter/Facebook-style personalized feed.
Learning Objectives
By the end of this case study, you will be able to:
- Design feed generation strategies: fanout-on-write vs fanout-on-read
- Implement timeline algorithms and content ranking systems
- Build scalable social graph storage and traversal systems
- Handle celebrity/influencer scaling challenges with hybrid approaches
- Design real-time content delivery with eventual consistency trade-offs
Real-World Examples
Facebook: Serves 2+ billion users with personalized feeds, 4+ petabytes of new data daily
Twitter: Processes 6,000+ tweets per second, delivers personalized timelines to 450+ million users
Instagram: Handles 95+ million photos/videos daily with ML-powered algorithmic feeds
LinkedIn: Powers professional feeds for 900+ million users with engagement-based ranking
Write path + fanout + ranking + caching
Requirements
- Personalized, ranked timeline per user
- Low-latency reads; freshness with eventual consistency
- Infinite scroll; pagination and caching
High-Level Design
- Write path: store posts, edges (follows), engagement signals
- Fanout-on-write for accounts with few followers; fanout-on-read for accounts with very large follower counts (fat edges)
- Ranking service using offline features + online signals
- Cache hot timelines; background refresh
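The hybrid fanout rule in the list above can be sketched in memory as follows. This is a minimal illustration, not a production design: the class name FeedStore, the FANOUT_THRESHOLD value, and the dict-based storage are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical cutoff: authors with more followers than this are pulled at read time.
FANOUT_THRESHOLD = 3


class FeedStore:
    """In-memory sketch of hybrid fanout (illustrative only)."""

    def __init__(self):
        self.followers = defaultdict(set)          # followee_id -> follower ids
        self.following = defaultdict(set)          # follower_id -> followee ids
        self.posts_by_author = defaultdict(list)   # author_id -> [(ts, post_id)]
        self.timelines = defaultdict(list)         # user_id -> [(ts, post_id)]

    def follow(self, follower_id, followee_id):
        self.followers[followee_id].add(follower_id)
        self.following[follower_id].add(followee_id)

    def is_large(self, author_id):
        return len(self.followers[author_id]) > FANOUT_THRESHOLD

    def publish(self, author_id, post_id, ts):
        self.posts_by_author[author_id].append((ts, post_id))
        if not self.is_large(author_id):
            # Fanout-on-write: push the post into each follower's timeline.
            for follower_id in self.followers[author_id]:
                self.timelines[follower_id].append((ts, post_id))

    def read_timeline(self, user_id, limit=10):
        # A set removes duplicates if an author crossed the threshold
        # after some posts had already been pushed.
        candidates = set(self.timelines[user_id])
        # Fanout-on-read: pull posts from large accounts the user follows.
        for followee_id in self.following[user_id]:
            if self.is_large(followee_id):
                candidates.update(self.posts_by_author[followee_id])
        newest_first = sorted(candidates, reverse=True)
        return [post_id for _, post_id in newest_first[:limit]]
```

The write cost for a large account stays constant regardless of follower count; the price is a merge step on every read by its followers.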
Capacity & Sizing
- Follower distribution is heavy-tailed: isolate mega accounts
- Estimate posts/sec at peak and timeline storage per user per day
- Cache budget per user; eviction policy impacts hit ratio
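A back-of-the-envelope estimate for precomputed timeline storage can be written as a small function. Every input in the usage line (1,000 posts/sec, an average fanout of 200, 24 bytes per timeline entry) is a hypothetical placeholder chosen only to show the arithmetic, not a measured figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def timeline_bytes_per_day(posts_per_sec, avg_fanout, bytes_per_entry):
    """Bytes written to precomputed timelines per day under fanout-on-write."""
    return posts_per_sec * SECONDS_PER_DAY * avg_fanout * bytes_per_entry


# Hypothetical inputs: 1,000 posts/sec, 200 followers on average, 24-byte entries.
daily_bytes = timeline_bytes_per_day(1_000, 200, 24)
daily_gb = daily_bytes / 1e9  # decimal gigabytes
```

Because the follower distribution is heavy-tailed, the average fanout hides the cost of mega accounts, which is one reason to size them separately.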
Key Components
- Write Service, Edge store, Timeline store
- Ranking service (features + online scoring)
- Cache (edge and app), Priority queues for recompute
Non-functional Requirements
- P95 timeline fetch < 200ms
- Freshness: new posts reflected within seconds
- High availability for read path
Data Model
Posts, edges, timelines, and engagement
- posts (post_id PK, author_id, ts, body)
- edges (follower_id, followee_id)
- timeline (user_id, post_id, score, ts)
- engagement (post_id, user_id, like, comment, share, ts)
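One way to make the four tables above concrete is a SQLite schema created through Python's sqlite3 module. The column types, composite keys, and the edges_by_followee index are assumptions added for the sketch; the source only names the columns.

```python
import sqlite3

# Column types, keys, and the index are illustrative assumptions.
SCHEMA = """
CREATE TABLE posts (
    post_id   INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL,
    ts        INTEGER NOT NULL,
    body      TEXT NOT NULL
);
CREATE TABLE edges (
    follower_id INTEGER NOT NULL,
    followee_id INTEGER NOT NULL,
    PRIMARY KEY (follower_id, followee_id)
);
-- Fanout needs "who follows this author?", so index the reverse direction too.
CREATE INDEX edges_by_followee ON edges (followee_id);
CREATE TABLE timeline (
    user_id INTEGER NOT NULL,
    post_id INTEGER NOT NULL,
    score   REAL NOT NULL,
    ts      INTEGER NOT NULL,
    PRIMARY KEY (user_id, post_id)
);
CREATE TABLE engagement (
    post_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    "like"  INTEGER NOT NULL DEFAULT 0,
    comment TEXT,
    share   INTEGER NOT NULL DEFAULT 0,
    ts      INTEGER NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

conn.execute("INSERT INTO posts VALUES (1, 10, 1700000000, 'hello')")
conn.executemany("INSERT INTO edges VALUES (?, ?)", [(20, 10), (21, 10)])

# Followers of author 10, as the fanout step would look them up.
follower_ids = [
    row[0]
    for row in conn.execute(
        "SELECT follower_id FROM edges WHERE followee_id = ? ORDER BY follower_id",
        (10,),
    )
]
```

The like column is quoted because LIKE is also an SQL keyword.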
APIs
- Publish: POST /api/posts with body { body }
- Timeline: GET /api/timeline?user=:id&cursor=...
- Engagement: POST /api/posts/:id/like, POST /api/posts/:id/comment
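The cursor parameter on the timeline endpoint could be an opaque token that encodes the position of the last item served. The sketch below assumes the position is a (ts, post_id) pair and that the token is base64-encoded JSON; the function names and the token format are hypothetical, since the source does not specify them.

```python
import base64
import json


def encode_cursor(ts, post_id):
    """Pack the last-served position into an opaque, URL-safe token."""
    raw = json.dumps([ts, post_id]).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor):
    ts, post_id = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return ts, post_id


def get_page(entries, cursor=None, limit=2):
    """Return one page from entries, a list of (ts, post_id) sorted newest first."""
    if cursor is not None:
        boundary = decode_cursor(cursor)
        # Keep only entries strictly older than the last one already served,
        # so posts published since the previous request do not shift the page.
        entries = [e for e in entries if e < boundary]
    page = entries[:limit]
    next_cursor = encode_cursor(*page[-1]) if len(entries) > limit else None
    return page, next_cursor
```

Unlike offset-based paging, a position cursor keeps infinite scroll stable when new posts arrive at the head of the timeline between requests.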
Hot Path
- Read timeline: candidate fetch → rank → serve from cache
- Publish: write → fanout (hybrid) → cache invalidate
Read Flow
- Fetch candidate posts (precomputed or on-demand)
- Rank with recency + engagement features
- Serve from cache; hydrate missing details lazily
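The three read steps above can be sketched as a single function. The parameter names (fetch_candidates, score, post_cache, load_post) are placeholders for whatever services back each step; they are assumptions for illustration, not names from the design.

```python
def read_timeline(user_id, fetch_candidates, score, post_cache, load_post, limit=3):
    """Candidate fetch -> rank -> serve, hydrating post details only on a cache miss."""
    # Step 1: candidate post ids, precomputed or gathered on demand.
    candidates = fetch_candidates(user_id)
    # Step 2: rank by score, highest first, and keep the top `limit`.
    ranked = sorted(candidates, key=score, reverse=True)[:limit]
    # Step 3: serve from cache; load details lazily for anything missing.
    result = []
    for post_id in ranked:
        if post_id not in post_cache:
            post_cache[post_id] = load_post(post_id)
        result.append(post_cache[post_id])
    return result
```

Hydration happens after truncation, so details are only loaded for posts that will actually be returned.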
Ranking Strategy
- Offline: feature generation (author quality, interaction graph)
- Online: recency boost, personalization, diversity constraints
- A/B testing of ranking weights and models
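A toy scoring function can show how the online signals might combine. The formula here (exponential recency decay with a one-hour half-life, log-scaled engagement with weights 1/2/3, a per-author affinity multiplier, and a cap of two posts per author for diversity) is an assumption chosen for the example, not the ranking model the design prescribes.

```python
import math


def score(post, now, affinity, half_life_s=3600.0):
    """Illustrative score: recency decay x engagement x viewer-author affinity."""
    age_s = max(0.0, now - post["ts"])
    recency = 0.5 ** (age_s / half_life_s)  # halves every half_life_s seconds
    engagement = math.log1p(
        post["likes"] + 2 * post["comments"] + 3 * post["shares"]
    )
    personal = 1.0 + affinity.get(post["author_id"], 0.0)
    return recency * (1.0 + engagement) * personal


def rank(posts, now, affinity, max_per_author=2):
    """Sort by score, then enforce a simple diversity cap per author."""
    ordered = sorted(posts, key=lambda p: score(p, now, affinity), reverse=True)
    taken = {}
    result = []
    for post in ordered:
        count = taken.get(post["author_id"], 0)
        if count < max_per_author:
            result.append(post)
            taken[post["author_id"]] = count + 1
    return result
```

The weights and the half-life are exactly the kind of parameters that A/B tests would tune.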
Scaling
- Hybrid fanout model based on follower count thresholds
- Cache per-user timelines; invalidate on new posts/engagement
- Async recompute with priority queues for heavy users
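The priority queue for asynchronous recompute can be sketched with Python's heapq module. What the priority means (for example, how active a user is) is left open by the source, so the numeric priority argument and the class name RecomputeQueue are assumptions.

```python
import heapq
import itertools


class RecomputeQueue:
    """Highest-priority-first queue of users whose timelines need recomputing."""

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties so equal priorities pop in FIFO order.
        self._counter = itertools.count()

    def push(self, user_id, priority):
        # heapq is a min-heap, so negate the priority to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), user_id))

    def pop(self):
        _, _, user_id = heapq.heappop(self._heap)
        return user_id

    def __len__(self):
        return len(self._heap)
```

A worker pool would call pop in a loop and rebuild each user's cached timeline in the background.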
Caching & TTL
- Short TTL cache per user (seconds) with jitter to avoid thundering herd
- Background warm for active users; invalidate on new posts
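A short TTL with jitter might look like the sketch below. The class name, the 30-second base TTL, and the 20% jitter fraction are illustrative assumptions; the injectable clock and random generator exist only to make the behavior easy to test.

```python
import random
import time


class JitteredTTLCache:
    """Per-user timeline cache with a short, jittered TTL (illustrative sketch)."""

    def __init__(self, base_ttl_s=30.0, jitter_fraction=0.2,
                 clock=time.monotonic, rng=None):
        self._base = base_ttl_s
        self._jitter = jitter_fraction
        self._clock = clock
        self._rng = rng or random.Random()
        self._entries = {}  # user_id -> (timeline, expires_at)

    def set(self, user_id, timeline):
        # Spread expiry over base * (1 +/- jitter) so entries written together
        # do not all expire, and refill, at the same instant.
        ttl = self._base * (1.0 + self._rng.uniform(-self._jitter, self._jitter))
        self._entries[user_id] = (timeline, self._clock() + ttl)

    def get(self, user_id):
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        timeline, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[user_id]
            return None
        return timeline

    def invalidate(self, user_id):
        # Called when a new post lands in this user's timeline.
        self._entries.pop(user_id, None)
```

With a 10-second base and 20% jitter, each entry lives somewhere between 8 and 12 seconds, which spreads refill load instead of concentrating it.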
Trade-offs
- Storage for precomputed timelines vs read-time cost
- Staleness windows vs freshness guarantees
- Cold-start ranking for new users
Failure Modes & Mitigations
- Ranking outage → serve recency-only fallback
- Cache miss storms → stagger refresh and add jitter
- Skew from mega influencers → isolate and throttle
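The first mitigation, falling back to recency-only ordering when ranking is unavailable, can be sketched as a thin wrapper. The function name and the broad exception handling are simplifying assumptions; a real service would more likely catch specific timeout or connection errors.

```python
def rank_with_fallback(posts, ranker):
    """Use the ranking service if it responds; otherwise serve newest first."""
    try:
        return ranker(posts)
    except Exception:
        # Ranking outage: degrade to a recency-only timeline rather than fail the read.
        return sorted(posts, key=lambda post: post["ts"], reverse=True)
```

The read path stays available during a ranking outage, at the cost of temporarily unpersonalized results.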
Observability
- SLIs: p95 read latency, cache hit ratio, ranking error rate
- Event tracing from publish to fanout to timeline read
- Dashboards: queue depths, fanout lag, cache evictions
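Two of the SLIs above are simple to compute from raw samples. The sketch uses the nearest-rank definition of a percentile, which is one of several common definitions and is an assumption here; the function names are likewise hypothetical.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    # Integer ceiling of pct/100 * n avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def cache_hit_ratio(hits, misses):
    total = hits + misses
    return hits / total if total else 0.0


latencies_ms = [100] * 18 + [500, 500]
p95_ms = percentile(latencies_ms, 95)
meets_target = p95_ms < 200  # compare against the 200 ms p95 read target
```

In this sample two of twenty requests are slow, so the p95 lands on a slow request and the check fails; with only one slow request it would pass.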
Implementation Notes
- Use hybrid fanout: pre-compute for normal users, pull-on-demand for celebrities
- Implement content ranking with features like recency, engagement, user relationship
- Design efficient pagination with cursor-based tokens to handle real-time updates
- Use graph databases or denormalized tables for fast social graph traversal
- Implement ML models for content relevance and engagement prediction
Best Practices
- Design for feed diversity: avoid filter bubbles with content source diversification
- Implement graceful degradation: serve cached/older content during peak loads
- Use A/B testing frameworks for ranking algorithm improvements
- Design content freshness controls with configurable staleness windows
- Implement comprehensive engagement tracking for ranking signal feedback loops
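For A/B testing of ranking changes, users need a stable assignment to a variant. One common approach, sketched here as an assumption rather than a prescribed framework, is to hash the experiment name together with the user id; the function name and the variant labels are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Deterministically map a user to a variant for a given experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes as an integer and spread users across variants.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]
```

Including the experiment name in the hash keeps assignments independent across experiments, so the same users are not always grouped together.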
Common Pitfalls
- Not handling celebrity users separately, which can overwhelm fanout systems
- Poor ranking algorithm leading to echo chambers and reduced engagement
- Inefficient social graph storage causing slow timeline generation
- Not implementing proper cache invalidation for real-time updates
- Insufficient monitoring of content freshness and ranking quality metrics