Choosing a database isn’t about finding the “best” technology; it’s about matching a database’s strengths to your specific access patterns. In a recent architectural deep dive, the channel ByteMonk explained why three of the world’s largest platforms chose fundamentally different database paths. [00:23]
1. Netflix: High-Write Throughput with Cassandra
Netflix handles over 3 million writes per second as it tracks every pause, hover, and search from 260 million subscribers. [01:20]
- The Choice: Apache Cassandra.
- Why: Cassandra is built for horizontal write scaling. It acts like a distributed hashmap: each row's partition key is hashed to decide which nodes own it, so a write goes straight to those nodes with minimal coordination overhead. [01:46]
- The Trade-off: No joins or ad-hoc SQL queries. Netflix must model its data around specific queries rather than entities, often duplicating data across different tables to ensure every read is a simple key lookup. [02:27]
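To make the "distributed hashmap" and query-first modeling ideas concrete, here is a minimal in-memory sketch in Python. It is not Netflix's schema or Cassandra's actual routing code: the node names, the `node_for` and `record_event` helpers, and the two "tables" are hypothetical, and hashing with MD5 modulo the node count is a simplification of Cassandra's token ring (its default partitioner uses Murmur3 tokens).

```python
import hashlib
from collections import defaultdict

# Hypothetical three-node cluster, for illustration only.
NODES = ["node-a", "node-b", "node-c"]


def node_for(partition_key: str) -> str:
    """Pick the owning node by hashing the partition key.

    Simplified: real Cassandra maps tokens onto a ring rather than
    taking a hash modulo the node count.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]


# One "table" per query, each keyed by the value that query looks up.
events_by_user = defaultdict(list)   # answers: "what did this user do?"
events_by_title = defaultdict(list)  # answers: "who interacted with this title?"


def record_event(user_id: str, title_id: str, action: str) -> str:
    """Write the same event into both tables (deliberate duplication)."""
    event = {"user_id": user_id, "title_id": title_id, "action": action}
    events_by_user[user_id].append(event)
    events_by_title[title_id].append(event)
    # Return the node that would own the user-keyed copy of the row.
    return node_for(user_id)
```

Each read is then a single key lookup, such as `events_by_user["user-42"]`, at the cost of writing every event twice.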
2. Instagram: Relational Complexity with PostgreSQL
Instagram’s core workload is read-heavy and highly relational. Feeds require joining posts with follow relationships, and profiles need aggregated counts. [03:35]
- The Choice: PostgreSQL.
- Why: PostgreSQL excels at joins, aggregations, and complex filtering. Instagram proved that you don’t need NoSQL just because you have a billion users; you can scale SQL using connection pooling (PgBouncer), read replicas, and partitioning. [05:07]
- The Trade-off: Massive write volumes are harder to handle than in Cassandra. Instagram accepts the engineering complexity of sharding and indexing to keep the flexibility of relational queries. [05:32]
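The kind of relational query described above can be sketched in a few lines. The example below uses Python's built-in `sqlite3` module as a stand-in so it runs without a server; the SQL shown is ordinary enough that it should carry over to PostgreSQL, but the table names, columns, and sample rows are assumptions made for illustration, not Instagram's schema.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER,
        caption TEXT,
        created_at INTEGER
    );
    -- Index supporting "latest posts by these authors".
    CREATE INDEX idx_posts_author_time ON posts (author_id, created_at);

    INSERT INTO follows VALUES (1, 2), (1, 3), (2, 3);
    INSERT INTO posts (author_id, caption, created_at) VALUES
        (2, 'sunset', 100),
        (3, 'coffee', 200),
        (4, 'unrelated', 300);
    """
)


def feed_for(user_id: int, limit: int = 10) -> list:
    """Join posts with follow relationships, newest first."""
    rows = conn.execute(
        """
        SELECT p.caption
        FROM posts AS p
        JOIN follows AS f ON f.followee_id = p.author_id
        WHERE f.follower_id = ?
        ORDER BY p.created_at DESC
        LIMIT ?
        """,
        (user_id, limit),
    )
    return [row[0] for row in rows]


def follower_count(user_id: int) -> int:
    """Aggregated count of the kind a profile page needs."""
    row = conn.execute(
        "SELECT COUNT(*) FROM follows WHERE followee_id = ?", (user_id,)
    ).fetchone()
    return row[0]
```

With the sample rows, `feed_for(1)` returns the captions from accounts 2 and 3 only, and `follower_count(3)` counts both followers of account 3. Expressing the same feed in a query-first store would require maintaining a precomputed table for it.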
3. Twitter: Ultra-Low Latency with Redis
Twitter’s challenge is the timeline. When you open the app, you expect to see a merged list of tweets from thousands of accounts instantly. [06:09]
- The Choice: Redis (as a cache).
- Why: Redis operates entirely in memory, serving precomputed timelines at 300,000 requests per second. Twitter uses a “Fan-out on Write” approach, pushing new tweets into the Redis caches of every follower so the timeline is already assembled when the user logs in. [07:01]
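- The Trade-off: Redis keeps its data in memory, and persistence (snapshots or an append-only file) is optional, so depending on configuration a restart can lose some or all of it. Twitter uses it only as a cache, with a durable database (like Manhattan or MySQL) as the primary source of truth. [07:31]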
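Fan-out on write is easier to see in code. The sketch below uses plain Python dictionaries and deques in place of Redis, and a list in place of the durable store, so it runs on its own. All names (`follow`, `post_tweet`, `read_timeline`, `rebuild_timeline`, `TIMELINE_LIMIT`) are hypothetical, and this is an illustration of the pattern rather than Twitter's implementation.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 3  # hypothetical cap on cached entries per user

followers = defaultdict(set)  # author -> set of follower names
# Stand-in for the Redis cache: one bounded, newest-first list per user.
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))
# Stand-in for the durable source of truth.
tweet_store = []


def follow(follower: str, author: str) -> None:
    followers[author].add(follower)


def post_tweet(author: str, text: str) -> None:
    """Persist the tweet, then push it into every follower's timeline."""
    tweet_store.append((author, text))
    for follower in followers[author]:
        # appendleft keeps newest first; maxlen drops the oldest entry.
        timelines[follower].appendleft((author, text))


def read_timeline(user: str) -> list:
    """Reads do no merging: the timeline is already assembled."""
    return list(timelines[user])


def rebuild_timeline(user: str) -> None:
    """Recompute a timeline from the durable store after cache loss."""
    timelines[user].clear()
    for author, text in tweet_store:
        if user in followers[author]:
            timelines[user].appendleft((author, text))
```

The write path does work proportional to the author's follower count so that the read path is a single lookup. `rebuild_timeline` shows why losing the cache is tolerable: the timeline can be recomputed from the source of truth, only more slowly.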
How to Choose Your Database
To make the right choice, ask yourself three questions: [08:58]
- What is the access pattern? Relational queries (Postgres), massive writes (Cassandra), or ultra-low latency reads (Redis)?
- What are you willing to sacrifice? Flexibility, write scale, or data durability?
- Do you actually need it? Most apps don’t need a distributed NoSQL system; a well-indexed Postgres instance can handle 95% of use cases. [09:41]
