System Design Explained with a Cricket Stadium
Picture this: it's an India vs Pakistan match. 100,000 fans inside the stadium. Millions watching the live stream. Everyone refreshing the scoreboard at the same time.
How does the system not collapse?
The answer lies in concepts that engineers have been using for decades. And the best part — you don't need a computer science degree to understand them. Let's walk through it using a cricket stadium as our guide.
The Setup: One Ticket Counter, One Match
You're running a small local cricket tournament. There's one ticket counter, one person handling all bookings. Ten people show up — no problem.
But then your tournament gets popular. Hundreds of fans line up. Your one counter person is overwhelmed. Tickets stop selling. Fans leave. You're losing money.
Something has to change.
1. Vertical Scaling — Make the Counter Faster
The first instinct? Get your counter person a faster computer, a better billing system, a second screen. Same person, but now they can process bookings twice as fast.
In tech, this is called vertical scaling — upgrading the same machine. More RAM, a faster CPU, a bigger disk. You're getting more out of what you already have.
It helps. But there's a ceiling. One person, no matter how fast, can only handle so many fans, just as a single machine can only take so much RAM and CPU.
2. Preprocessing with Cron Jobs — Print the Tickets Before the Rush
Here's a smarter move. You know the match is on Saturday. So on Thursday night, when there's zero footfall, you pre-print all the ticket stubs, pre-sort them by section, and get everything ready.
When fans show up on Saturday, half the work is already done.
In systems, this is preprocessing with cron jobs — scheduled tasks that run at off-peak hours to prepare data in advance. When the actual request hits, the server doesn't have to compute everything from scratch.
Do the heavy lifting before the demand arrives. Requests feel much faster to users because the expensive work was finished before they asked, even though the total amount of work is the same.
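Here's a minimal sketch of the idea in Python. It assumes a hypothetical `build_seat_index` job that a scheduler such as cron would run on Thursday night, and a hypothetical `lookup_seats` handler that runs on match day. The names and the sample data are made up for illustration.

```python
# Hypothetical raw data: one record per seat (illustrative only).
RAW_SEATS = [
    {"seat": "A1", "section": "North", "sold": False},
    {"seat": "A2", "section": "North", "sold": True},
    {"seat": "B1", "section": "South", "sold": False},
]

# Filled in ahead of time by the scheduled job, read at request time.
SEAT_INDEX = {}


def build_seat_index(raw_seats):
    """Off-peak job: group unsold seats by section (the 'pre-sorting')."""
    index = {}
    for record in raw_seats:
        if not record["sold"]:
            index.setdefault(record["section"], []).append(record["seat"])
    return index


def lookup_seats(section):
    """Match-day request: a dictionary lookup, no scanning of raw data."""
    return SEAT_INDEX.get(section, [])


# A scheduler would trigger this step; here we call it directly.
SEAT_INDEX.update(build_seat_index(RAW_SEATS))
print(lookup_seats("North"))  # ['A1']
```

If the job lived in a script (say a hypothetical `build_index.py`), a crontab line such as `0 23 * * 4 python build_index.py` would run it every Thursday at 11 PM.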
3. Backup Servers — Don't Rely on One Counter
What if your one counter person falls sick on match day? The queue freezes. No tickets get sold. That person is a single point of failure.
The fix is obvious — have a backup counter person trained and ready to step in.
In systems, this is a primary-replica architecture (historically called master-slave). The primary server handles the work while the replica keeps an up-to-date copy of its data. If the primary crashes, the replica is promoted and takes over, so users barely notice anything happened.
Any single component whose failure brings down your entire system is a single point of failure. Always have a backup.
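As a rough sketch, failover can be as simple as "try the primary, and if it's down, ask the replica." The `Server` class and `book_with_failover` function below are hypothetical stand-ins. A real setup would also need health checks and data replication, which this example leaves out.

```python
class Server:
    """Stand-in for a ticketing server (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} booked {request}"


def book_with_failover(request, primary, replica):
    """Try the primary first; fall back to the replica if it fails."""
    try:
        return primary.handle(request)
    except ConnectionError:
        return replica.handle(request)


primary = Server("primary")
replica = Server("replica")
print(book_with_failover("seat A1", primary, replica))  # primary booked seat A1

primary.healthy = False  # simulate a crash
print(book_with_failover("seat A2", primary, replica))  # replica booked seat A2
```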
4. Horizontal Scaling — Open More Counters
Your backup counter person is now a full-time employee. And you still need more. So you open 10 counters, each with a trained person.
This is horizontal scaling — instead of making one machine more powerful, you add more machines of the same type. More servers, more capacity, more throughput — all working in parallel.
This is the backbone of how companies like Amazon scale during sales events. Not one giant server, but thousands of regular ones working together.
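To put rough numbers on "how many counters do I need?", here's a back-of-the-envelope sketch. The function name, the 20% headroom, and the traffic figures are all assumptions chosen for illustration.

```python
def servers_needed(peak_per_min, per_server_per_min, headroom_percent=20):
    """Identical servers needed to cover the peak plus some spare headroom."""
    # Integer maths avoids floating-point rounding surprises.
    target = peak_per_min * (100 + headroom_percent)
    capacity = per_server_per_min * 100
    return (target + capacity - 1) // capacity  # ceiling division


# Hypothetical numbers: 10,000 bookings per minute at peak,
# and one server that handles 500 per minute.
print(servers_needed(10_000, 500))    # 24
# Vertical scaling for comparison: the same load on servers twice as fast.
print(servers_needed(10_000, 1_000))  # 12
```

Horizontal scaling means growing the first number by adding machines. Vertical scaling means growing the per-server capacity, which eventually hits a hardware ceiling.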
5. Microservices — Separate Counters for Separate Jobs
Now you have 10 counters. But think about this — some fans are buying tickets, some are picking up pre-booked tickets, some are asking about parking passes, and some just want to know where Gate 7 is.
If every counter handles everything, it's chaos. A better approach: dedicate counters to specific jobs.
- Counters 1–5 → New ticket bookings
- Counters 6–7 → Pre-booked ticket pickup
- Counter 8 → Parking passes
- Counter 9 → Fan queries and directions
Each counter has one clear responsibility. If there's a problem with parking passes, you fix Counter 8 — you don't touch the rest. If ticket demand spikes, you add more booking counters without disrupting anything else.
This is microservices architecture — small, independent services each doing one job well.
| Counter | Responsibility | Staff Needed |
|---|---|---|
| 1–5 | New ticket bookings | 5 |
| 6–7 | Pre-booked pickup | 2 |
| 8 | Parking passes | 1 |
| 9 | Fan queries | 1 |
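The counter split can be sketched in code as a small router that hands each request to the one service responsible for it. Everything here is hypothetical: the service names, the request fields, and the `route` function. In a real microservices setup each service would typically be its own deployable program reached over the network, not a function in the same file.

```python
def booking_service(request):
    return f"booked seat {request['seat']}"


def pickup_service(request):
    return f"handed over booking {request['booking_id']}"


def parking_service(request):
    return f"parking pass issued for {request['vehicle']}"


def query_service(request):
    return f"directions to {request['place']}"


# One entry per job, like the sign above each counter.
ROUTES = {
    "book": booking_service,
    "pickup": pickup_service,
    "parking": parking_service,
    "query": query_service,
}


def route(kind, request):
    """Send the request to the service that owns this kind of job."""
    handler = ROUTES.get(kind)
    if handler is None:
        raise ValueError(f"no service handles {kind!r}")
    return handler(request)


print(route("parking", {"vehicle": "KA-01-1234"}))
print(route("query", {"place": "Gate 7"}))
```

Fixing a parking bug means editing `parking_service` only. The other services never need to know.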
6. Distributed Systems — Don't Keep Everything in One Stadium
Your stadium is running perfectly. Then one day the power goes out. Or a pipe bursts. The match gets cancelled, and your ticketing systems go down with it.
What if, instead of one central ticketing office, you had regional offices across the city? Fans in Bangalore book from the Bangalore office. Fans in Mumbai book from the Mumbai office.
Now even if one office goes down, the others keep running. And fans get faster responses because they're connecting to something nearby.
This is a distributed system, and it's how large companies such as Facebook and Google run their services. They have servers all over the world, and your request is usually handled by a server close to you, giving you a faster experience.
Distributing your system makes it more fault-tolerant and faster. The tradeoff is complexity — the offices now need to stay in sync with each other.
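Here's a small sketch of "go to the nearest office, unless it's down." The latency numbers and office names are invented, and `pick_office` is a hypothetical helper. The syncing between offices, which is the hard part, is deliberately left out.

```python
# Hypothetical round-trip times in milliseconds from each city to each office.
LATENCY_MS = {
    "Bangalore": {"bangalore-office": 5, "mumbai-office": 40},
    "Mumbai": {"bangalore-office": 40, "mumbai-office": 5},
}


def pick_office(city, offices_up):
    """Return the closest office that is currently running."""
    reachable = {
        office: ms
        for office, ms in LATENCY_MS[city].items()
        if office in offices_up
    }
    if not reachable:
        raise RuntimeError("no ticketing office is available")
    return min(reachable, key=reachable.get)


all_up = {"bangalore-office", "mumbai-office"}
print(pick_office("Bangalore", all_up))             # bangalore-office
print(pick_office("Bangalore", {"mumbai-office"}))  # mumbai-office
```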
7. Load Balancing — The Smart Gate Controller
Match day. 100,000 fans trying to enter the stadium at the same time. You have 20 gates.
If every fan rushes to Gate 1, there's a stampede. Gate 20 sits empty. Total chaos.
You need someone — or something — at the entrance that looks at all 20 gates in real time and directs each fan to the one with the shortest queue. That's a load balancer.
              Fan Arrives at Stadium
                        |
          Gate Controller (Load Balancer)
          /       |       |       |       \
       Gate1   Gate2   Gate3   Gate4   Gate5
       (full)  (free)  (free)  (busy)  (empty)   <- fans are routed smartly
In systems, the load balancer sits in front of all your servers and spreads incoming requests across them, for example by sending each one to the server with the fewest active requests. It keeps things fast, and if one server crashes, it quietly stops sending traffic there.
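One common strategy is "least connections": send each request to the server with the fewest requests in progress. The sketch below is a toy version of that idea. The class and method names are hypothetical, and real load balancers also use other strategies such as round-robin.

```python
class LeastBusyBalancer:
    """Toy load balancer that picks the server with the fewest active requests."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}  # requests in progress
        self.down = set()                            # servers to skip

    def pick(self):
        candidates = [s for s in self.active if s not in self.down]
        if not candidates:
            raise RuntimeError("no servers available")
        choice = min(candidates, key=lambda s: self.active[s])
        self.active[choice] += 1
        return choice

    def finish(self, server):
        """Call when a request completes so the server's count drops."""
        self.active[server] -= 1

    def mark_down(self, server):
        """Stop routing traffic to a crashed server."""
        self.down.add(server)


gates = LeastBusyBalancer(["gate1", "gate2", "gate3"])
print([gates.pick() for _ in range(3)])  # ['gate1', 'gate2', 'gate3']

gates.mark_down("gate2")
gates.finish("gate3")
print(gates.pick())  # gate3
```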
8. Decoupling — The Commentators Don't Run the Scoreboard
Here's something obvious when you think about it: the commentary team and the scoreboard operators have completely different jobs. The commentators don't update the score — the scoreboard team does. And the scoreboard team doesn't care what the commentators are saying.
They're independent. A problem with the commentary feed doesn't affect the scoreboard. A scoreboard glitch doesn't interrupt the commentary.
This is decoupling — separating parts of your system that have different responsibilities so they can operate and evolve independently. Changes to one don't break the other.
In backend systems, your payment service shouldn't care how the notification service works. Your video streaming service shouldn't depend on your ticketing service. Decouple them, and your system becomes far easier to maintain and scale.
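One way to get this separation in code is a publish/subscribe pattern: whoever records the ball announces "this happened," and the scoreboard and commentary each react on their own. The `EventBus` class below is a hypothetical in-memory sketch. Production systems usually put a message queue in this spot so that a failure in one subscriber can't interrupt the others, which this toy version doesn't guarantee.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-memory publish/subscribe hub (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        for handler in self._subscribers[topic]:
            handler(payload)


scoreboard = []
commentary = []


def update_scoreboard(event):
    scoreboard.append(event["runs"])


def announce(event):
    commentary.append(f"{event['batter']} scores {event['runs']}!")


bus = EventBus()
bus.subscribe("ball_bowled", update_scoreboard)
bus.subscribe("ball_bowled", announce)

# The publisher knows nothing about who is listening.
bus.publish("ball_bowled", {"batter": "Batter A", "runs": 4})
print(scoreboard)   # [4]
print(commentary)   # ['Batter A scores 4!']
```

Neither handler calls the other, so you can rewrite or remove one without touching the other.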
9. Logging and Metrics — The Match Analytics Room
Imagine a ball-by-ball data analyst sitting in a control room, recording every event: 6:14 PM — boundary, 6:15 PM — wicket, 6:17 PM — wide ball. That's logging — a timestamped record of everything that happens.
Now imagine another analyst taking all those logs and producing a report: run rate this over, average wickets per session, peak fan activity periods. That's metrics.
In your system, if response times spike or orders start failing, your logs tell you exactly when it started and what happened. Your metrics help you spot the pattern before it becomes a full-blown outage.
Log every event. Derive metrics from those logs. Build dashboards. You can't debug what you didn't record.
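Here's a small sketch of the log-then-summarize flow. The event fields (`time`, `kind`, `latency_ms`) and the helper names are assumptions made for this example. A real system would use a logging library and a metrics tool in place of a Python list.

```python
from collections import Counter
from statistics import mean

LOG = []  # the ball-by-ball record


def log_event(timestamp, kind, latency_ms):
    """Logging: append one timestamped record per event."""
    LOG.append({"time": timestamp, "kind": kind, "latency_ms": latency_ms})


def summarize(log):
    """Metrics: boil the raw records down to a few numbers."""
    counts = Counter(entry["kind"] for entry in log)
    total = len(log)
    return {
        "requests": total,
        "error_rate": counts["error"] / total if total else 0.0,
        "avg_latency_ms": mean(e["latency_ms"] for e in log) if log else 0.0,
    }


log_event("18:14:05", "ok", 120)
log_event("18:15:10", "ok", 80)
log_event("18:17:42", "error", 400)
log_event("18:17:50", "ok", 200)

metrics = summarize(LOG)
print(metrics["requests"], metrics["error_rate"], metrics["avg_latency_ms"])
```

The log tells you the error happened at 18:17:42. The metrics tell you that one request in four is failing, which is the kind of number a dashboard would alert on.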
10. Extensibility — The Stadium That Hosts More Than Cricket
A well-built stadium doesn't just host cricket. It hosts concerts, football matches, award ceremonies. The infrastructure — seats, lights, sound systems, parking — works for all of them.
The same principle applies to software. Your ticketing system shouldn't be hard-coded for cricket. If you build it right, it should be able to sell tickets for any event — a concert tomorrow, a kabaddi match next week.
This is extensibility — designing your system so that adding new features or use cases doesn't require tearing everything down and starting over.
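In code, extensibility often looks like a registry: the core ticketing logic stays the same, and each new event type plugs in its own rule. The sketch below assumes a hypothetical `pricing_rule` decorator and made-up prices and surcharges.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    kind: str        # e.g. "cricket", "concert"
    base_price: int  # in rupees (hypothetical)


PRICING_RULES = {}


def pricing_rule(kind):
    """Register a pricing function for one kind of event."""
    def register(func):
        PRICING_RULES[kind] = func
        return func
    return register


@pricing_rule("cricket")
def cricket_price(event):
    return event.base_price


@pricing_rule("concert")
def concert_price(event):
    return event.base_price + 200  # hypothetical stage surcharge


def ticket_price(event):
    """Core logic: unchanged no matter how many event kinds exist."""
    return PRICING_RULES[event.kind](event)


print(ticket_price(Event("T20 final", "cricket", 1000)))   # 1000
print(ticket_price(Event("Live concert", "concert", 1000)))  # 1200
```

Supporting a kabaddi match next week means adding one more decorated function. `ticket_price` doesn't change.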
The Full Picture
Here's everything mapped out:
| Stadium Analogy | What It Does | Tech Term |
|---|---|---|
| Faster counter setup | Upgraded billing system | Vertical Scaling |
| Pre-printing tickets Thursday night | Off-peak prep work | Cron Jobs / Preprocessing |
| Backup counter person | Covers when primary is down | Backup Servers / Replicas |
| Opening 10 counters | More people, more capacity | Horizontal Scaling |
| Dedicated counters per job | Specialists per function | Microservices |
| Regional ticketing offices | Multiple locations | Distributed Systems |
| Smart gate controller | Routes fans to free gates | Load Balancer |
| Commentary vs scoreboard teams | Independent operations | Decoupling |
| Ball-by-ball data + analyst reports | Events and summaries | Logging & Metrics |
| Stadium hosts any sport or event | Plug-and-play design | Extensibility |
What Comes Next?
What we just walked through is High Level Design (HLD) — how big pieces of your system are structured, where they live, and how they talk to each other.
The counterpart is Low Level Design (LLD) — the actual code: classes, functions, interfaces, design patterns. That's where the rubber meets the road.
HLD tells you what to build and why. LLD tells you how to build it cleanly.
Get both right, and you're thinking like a senior engineer.