The Unpopular Database That Built Billion-Dollar Companies

Sebastian Heinzer

Every few years, the database conversation resets. A new wave of teams decides the old relational engines are too limited, too centralized, too boring for modern scale. Then you look at the systems that actually survived hypergrowth, painful migrations, compliance pressure, and 3 a.m. incidents, and the pattern is less fashionable. Many billion-dollar companies did not win because they found a magical datastore. They won because they squeezed extraordinary leverage out of a boring one, then surrounded it with careful engineering.

That is the unpopular choice: not one brand name, but a philosophy. Start with a relational core, usually MySQL or PostgreSQL, keep the transactional boundary simple, and push complexity outward only when the workload proves you must. GitHub began as a Ruby on Rails app with a single MySQL database and continued to rely on MySQL for critical parts of GitHub.com. Shopify scaled a sharded monolith instead of pretending the monolith itself was the problem. Stripe built its money movement systems around correctness, not novelty.

1. They treated the database as the system of record, not a throughput contest

The unpopular move is to let the primary relational database stay important for longer than architecture Twitter thinks is respectable. That sounds conservative until you have to reconcile money, permissions, inventory, or identity under failure. In those domains, consistency is not a nice-to-have; it is the product. Stripe’s engineering work on Ledger reflects that mindset, with explicit tracking and validation of money movement across a massive payments network. When your company lives or dies on correctness, the cost of a slower write path is often lower than the cost of explaining irreversible inconsistency to customers, auditors, or regulators.

Senior engineers usually learn this the hard way. The first scaling bottleneck is rarely the database engine by itself. It is ambiguous ownership of data, weak transaction boundaries, or product teams writing around consistency because the architecture made it inconvenient. A boring relational core forces discipline. That discipline compounds.
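To make that concrete, here is a minimal sketch of a tight transactional boundary, using Python’s built-in sqlite3 module as a stand-in for a production relational engine. The accounts table, the balances, and the transfer function are illustrative assumptions, not drawn from Stripe or any other company mentioned here.

```python
import sqlite3

# Stand-in schema: one table holds the authoritative balances, in cents.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id      TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 10_000), ("bob", 0)])
conn.commit()

def transfer(src: str, dst: str, amount: int) -> None:
    """Move money inside one transaction: both legs commit or neither does."""
    try:
        with conn:  # begins a transaction, commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
    except sqlite3.IntegrityError:
        # The CHECK constraint caught an overdraft; nothing was written.
        raise ValueError(f"insufficient funds in account {src}") from None

transfer("alice", "bob", 2_500)        # succeeds atomically
# transfer("alice", "bob", 50_000)     # would raise and leave both balances untouched
print(dict(conn.execute("SELECT id, balance FROM accounts")))
```

The useful property is that the constraint and the transaction enforce correctness at the data layer, so an application bug fails loudly instead of quietly corrupting the system of record.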

2. They delayed distribution until the workload, not ideology, demanded it

A lot of teams distribute too early because distributed systems feel like maturity. In practice, they often buy coordination overhead before they buy meaningful scale. GitHub is a useful counterexample. Its engineering evolution moved from a single MySQL database to multiple MySQL clusters with classic primary-replica patterns, high availability mechanisms, and careful operational controls. That is not anti-scale. It is scale with sequencing.
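As a rough illustration of that primary-replica sequencing, here is a toy read/write router. The RoutedDB class, the shared in-memory SQLite setup, and the SELECT-prefix heuristic are assumptions made for the sketch, not GitHub’s actual routing layer.

```python
import random
import sqlite3
from typing import Sequence

# A shared in-memory database stands in for a primary with streaming replicas.
URI = "file:demo?mode=memory&cache=shared"
primary = sqlite3.connect(URI, uri=True)
replicas = [sqlite3.connect(URI, uri=True) for _ in range(2)]

class RoutedDB:
    """Toy read/write splitter: writes hit the primary, reads fan out to replicas.

    Real routers also watch replication lag and pin a session to the primary
    right after its own writes; this sketch shows only the core routing idea.
    """
    def __init__(self, primary: sqlite3.Connection,
                 replicas: Sequence[sqlite3.Connection]):
        self.primary, self.replicas = primary, list(replicas)

    def execute(self, sql: str, params: tuple = ()):
        is_read = sql.lstrip().upper().startswith("SELECT")
        conn = random.choice(self.replicas) if is_read else self.primary
        with conn:
            return conn.execute(sql, params).fetchall()

db = RoutedDB(primary, replicas)
db.execute("CREATE TABLE repos (name TEXT PRIMARY KEY)")
db.execute("INSERT INTO repos VALUES (?)", ("rails/rails",))
print(db.execute("SELECT name FROM repos"))   # served by a replica connection
```

The point is how little machinery the classic pattern needs before you reach for anything distributed.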


This matters because every distributed boundary becomes a new incident boundary. Once you split data across services or regions, you inherit cross-system retries, replay semantics, stale reads, partial failures, and harder debugging. The unpopular database choice is often just the choice to keep those problems out of your architecture until they generate more value than pain. Billion-dollar companies tend to be better at timing than at trend-chasing.

3. They used caches, replicas, and queues as relief valves instead of replacing the core

The mature pattern is not “database or scale.” It is “database plus pressure-management layers.” GitHub serves reads from MySQL replicas. Instagram has written extensively about using caching and precomputation in ranking systems, and about running one of the world’s largest Django deployments. Shopify has used change data capture and Kafka-based pipelines around its sharded core datastore. None of those examples suggest the relational database disappeared. They show what real scaling usually looks like: keep the source of truth tight, then let adjacent systems absorb read amplification, asynchronous workflows, and analytical fan-out.
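A minimal cache-aside sketch shows the relief-valve idea: the relational store stays the source of truth, and the cache only absorbs repeated reads. The in-process dict standing in for memcached or Redis, the TTL, and the users table are illustrative assumptions.

```python
import time
import sqlite3

# A plain dict stands in for memcached or Redis; entries carry an expiry time.
CACHE: dict[int, tuple[str, float]] = {}   # user_id -> (name, expires_at)
TTL_SECONDS = 60

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.execute("INSERT INTO users VALUES (1, 'ada')")
db.commit()

def get_user_name(user_id: int) -> str:
    """Cache-aside read: try the cache, fall back to the database, backfill."""
    hit = CACHE.get(user_id)
    if hit and hit[1] > time.monotonic():
        return hit[0]
    row = db.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    if row is None:
        raise KeyError(user_id)
    CACHE[user_id] = (row[0], time.monotonic() + TTL_SECONDS)
    return row[0]

def rename_user(user_id: int, name: str) -> None:
    """Writes go to the source of truth first, then invalidate the cache."""
    with db:
        db.execute("UPDATE users SET name = ? WHERE id = ?", (name, user_id))
    CACHE.pop(user_id, None)   # next read repopulates from the database

print(get_user_name(1))   # miss: reads the database, fills the cache
print(get_user_name(1))   # hit: never touches the database
```

Notice that the database never stops being authoritative; the cache is allowed to be wrong for at most one TTL, which is a failure mode you can reason about.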

This is where many architecture debates get distorted. Teams compare a single database to a fully distributed stack as if the only two options are “do nothing” and “rewrite the world.” Production systems do not scale that way. They scale by adding selective escape hatches around a stable core. That is less glamorous, but it is how you preserve debuggability while buying time.

4. They accepted sharding as an operational problem, not a product identity

Sharding is often sold as proof that your architecture has reached adulthood. More often, it is a tax you reluctantly pay after the business succeeds. Shopify’s engineering organization has been unusually candid here. It has described database shard balancing at terabyte scale, a sharded monolith, and a broader pods architecture designed to isolate failures so one hot shard does not become a platform-wide outage. The lesson is subtle and important: sharding can work extremely well, but it is not simplification. It is operational specialization.


That distinction matters for technical leaders. If your plan to scale depends on sharding next quarter, you are not really planning to scale; you are planning to take on a new class of toil. Routing, rebalancing, backfills, tenant movement, hot-spot mitigation, and consistency bugs do not care whether your architecture diagram looks modern. The unpopular choice is to postpone that bill until revenue, load shape, and organizational maturity justify paying it.
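For a sense of what the routing piece alone looks like before any of that toil arrives, here is a hypothetical sketch. The shard names, the hash scheme, and the manual override table are assumptions made for illustration, not Shopify’s implementation.

```python
import hashlib

# Hypothetical tenant-to-shard routing. Each shard is its own database; the
# router only decides where a tenant's rows live. Moving a tenant later means
# copying data, double-writing during the cutover, and updating this mapping,
# which is exactly the operational toil described above.

SHARDS = ["shard-00", "shard-01", "shard-02", "shard-03"]

def shard_for(tenant_id: str) -> str:
    """Deterministic placement: hash the tenant id onto a fixed shard list."""
    digest = hashlib.sha256(tenant_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Overrides for tenants that have been rebalanced by hand, e.g. a hot shop
# moved off a saturated shard.
MOVED: dict[str, str] = {"shop-8821": "shard-03"}

def route(tenant_id: str) -> str:
    return MOVED.get(tenant_id, shard_for(tenant_id))

print(route("shop-1001"))   # stable hash-based placement
print(route("shop-8821"))   # explicitly rebalanced tenant
```

Even this toy version hints at the real cost: every exception to the hash is state someone has to own, monitor, and migrate.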

5. They prized schema rigor because product velocity depends on data clarity

Engineers sometimes talk about schemas as if they are friction. In reality, weak data contracts are what slow teams down after the first year. A relational system gives you constraints, joins, migrations, and query semantics that feel annoyingly strict right up until the day you need to answer a business-critical question during an incident or compliance review. GitHub’s continued investment in MySQL upgrades and high availability is a quiet endorsement of this model. Teams do not pour that level of effort into a datastore they see as disposable.
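A small, hypothetical schema makes the constraint point concrete; the tables are invented for illustration, but the mechanism, letting the database reject bad writes at the boundary, is the general one.

```python
import sqlite3

# Illustrative schema, not any company's real model: constraints make the data
# contract explicit so bad writes fail loudly at the boundary instead of
# surfacing months later as an analytics or compliance surprise.
db = sqlite3.connect(":memory:")
db.executescript("""
    PRAGMA foreign_keys = ON;

    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );

    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        status      TEXT NOT NULL CHECK (status IN ('open', 'paid', 'refunded')),
        total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
    );
""")

db.execute("INSERT INTO customers VALUES (1, 'a@example.com')")
db.execute("INSERT INTO orders VALUES (1, 1, 'paid', 499)")   # accepted
db.commit()

try:
    db.execute("INSERT INTO orders VALUES (2, 99, 'paid', 100)")  # unknown customer
except sqlite3.IntegrityError as exc:
    print("rejected at the boundary:", exc)
```

Every rejected write here is a conversation the team never has to have six months later about why the data does not add up.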

For product organizations, this often becomes a compounding advantage. A clean relational model can support faster iteration because developers share a common language for how entities relate. You spend less time reverse-engineering accidental denormalization and more time shipping. There are workloads where wide-column, document, or event-native models are the right answer. But for core business systems, schema discipline is frequently the thing that keeps speed from collapsing into entropy.

6. They optimized for incident legibility, not theoretical elegance

One reason boring databases keep showing up inside very large companies is that incidents punish abstraction. During an outage, you need to know where truth lives, how writes flow, which replicas are safe, and what rollback even means. A single well-understood relational core, even one surrounded by replicas, caches, and asynchronous pipelines, is often easier to reason about than a stack of fashionable stores with overlapping responsibilities. GitHub has described MySQL availability as critical to the site, API, and authentication. Shopify has described isolation specifically to prevent failures from spiraling across the platform. Those are incident-response values expressed as architecture.


This is not nostalgia. It is operational realism. The database choice that built billion-dollar companies was unpopular mainly because it refused to flatter engineers. It asked for careful indexing, painful migrations, explicit ownership, and respect for state. Those are not exciting design principles. They are the ones that survive contact with production.

7. They knew the real moat was engineering judgment, not the engine

The deepest lesson is that MySQL and PostgreSQL were never the whole strategy. The strategy was knowing when not to abstract, when not to shard, when not to split the monolith, and when correctness mattered more than architectural aesthetics. That is why the same “boring” database family can sit under very different outcomes. The engine matters, but the operational judgment around it matters more. Shopify, GitHub, Instagram, and Stripe all layered different scaling patterns on top of conservative data foundations because their workloads demanded it, not because a conference talk did.

If you are making the call today, the useful question is not whether relational databases are cool again. It is whether your system really needs a more exotic failure mode. Most teams do not have a database problem. They have a workload-shaping problem, a data-modeling problem, or an operational-discipline problem. A boring relational core will expose that truth quickly. That is exactly why it still wins.

The unpopular database choice was never about playing it safe. It was about putting complexity where you can see it, test it, and carry it operationally. That is how billion-dollar companies bought time for real scale, and why many senior engineers still reach for the boring option first. Not because it solves everything, but because it fails in ways competent teams can understand and improve.

Sebastian is a news contributor at Technori. He writes on technology, business, and trending topics. He is an expert in emerging companies.