Why does starting with components go wrong?
When you start with components, you are answering a question you have not yet asked. You do not know yet whether you need a cache, because you do not know what the read patterns are. You do not know whether you need a queue, because you have not established whether the system needs to decouple producers from consumers. The components are answers. The mistake is reaching for answers before the questions are clear.
This pattern is common in both job interviews and actual system design sessions. Someone asks "design a URL shortener" and the response is "we will have a load balancer, a database, and maybe Redis for caching." That sentence contains at least four assumptions that have not been established: that traffic is high enough to need a load balancer, that the read pattern justifies a cache, that the data model suits the chosen database, and that the latency requirements are known.
The 7-step process I am working through is not a checklist for interviews. It is a structure for keeping the reasoning ahead of the architecture.
Step 1: What are the requirements, exactly?
Before anything else, clarify what the system needs to do and what it does not need to do. This splits into two parts.
Functional requirements define the system's behavior: what it does for users. For a URL shortener, that might be: accept a long URL and return a short one, redirect users who visit the short URL to the original, optionally support custom slugs, optionally show click analytics.
Non-functional requirements define the system's constraints: how well it does those things. How many reads per second? What is the acceptable latency for a redirect? Is data loss acceptable? Does the system need to be globally available or is a single region sufficient?
The non-functional requirements change the architecture more than the functional ones. A URL shortener serving 100 requests per second looks very different from one serving 10 million. Getting the scale and reliability requirements stated early prevents you from designing the wrong system.
Step 2: How big does this system actually need to be?
Back-of-the-envelope estimation is how you translate vague scale requirements into specific design constraints. The goal is not precision. It is finding the order of magnitude that tells you which design decisions are mandatory and which are premature.
The three numbers worth estimating:
- QPS. Take the daily active user count, estimate how many requests each user generates per session, divide by the number of seconds in a day (about 86,400), then multiply by the peak-to-average ratio. A system handling 500 QPS on average with a 5x peak needs to sustain 2,500 QPS. That single number determines whether a single database instance is sufficient or whether you need read replicas.
- Storage. Estimate average object size times objects created per day times retention period, then multiply by the replication factor. A URL shortener storing 100 million mappings at 500 bytes each needs 50 GB before replication. That tells you whether you are on a single database node or a distributed one.
- Bandwidth. QPS times average request size gives you inbound bandwidth; QPS times average response size gives you outbound. This determines whether serving assets from your application servers is feasible or whether a CDN is necessary.
What I find useful about this step is that it often resolves decisions that felt ambiguous. If the storage estimate is 200 GB, the database question looks different than if it is 200 TB.
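As a concrete sketch, here is the arithmetic from this step in Python. Every input number is an illustrative assumption, not a requirement from any real system:

```python
# Back-of-the-envelope estimation for a hypothetical URL shortener.
# All inputs are illustrative assumptions, not measured numbers.

SECONDS_PER_DAY = 86_400

# QPS: daily requests spread over the day, scaled by a peak factor.
daily_active_users = 5_000_000
requests_per_user_per_day = 10
peak_to_average = 5

avg_qps = daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY
peak_qps = avg_qps * peak_to_average

# Storage: object size x objects per day x retention, times replication.
bytes_per_mapping = 500
new_urls_per_day = 1_000_000
retention_days = 5 * 365
replication_factor = 3

raw_storage_gb = bytes_per_mapping * new_urls_per_day * retention_days / 1e9
replicated_gb = raw_storage_gb * replication_factor

# Bandwidth: peak QPS times average response size gives outbound rate.
avg_response_bytes = 600
outbound_mbps = peak_qps * avg_response_bytes * 8 / 1e6

print(f"avg QPS ~{avg_qps:,.0f}, peak QPS ~{peak_qps:,.0f}")
print(f"storage ~{raw_storage_gb:,.0f} GB raw, ~{replicated_gb:,.0f} GB replicated")
print(f"outbound ~{outbound_mbps:,.0f} Mbit/s at peak")
```

The point is not the specific outputs but that each result maps to a design decision: the peak QPS decides the replica question, the storage total decides the sharding question.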
Step 3: What does the API actually look like?
Defining the API before the internal architecture forces you to think about the system from the outside in. The API is the contract between your system and its clients. Getting it wrong means breaking changes later.
For each core user action identified in the requirements, define:
- The endpoint and HTTP method
- The request parameters and body
- The response shape and status codes
- The error cases
For a URL shortener, this might be:
POST /shorten
body: { url: "https://example.com/long-path", custom_slug?: "mylink" }
response: { short_url: "https://short.ly/abc123" }
GET /{slug}
response: 301 redirect to original URL (or 302 if click analytics require every request to reach the server, since browsers cache 301s)
errors: 404 if slug not found
Two things become visible at this step that are not visible from a component diagram. First, the API reveals the consistency requirements: if two users shorten the same URL, do they get the same short URL or different ones? That question drives a storage and deduplication decision. Second, the API reveals the idempotency requirements: if a client calls POST /shorten twice with the same URL, what should happen?
Step 4: What does the high-level design look like?
With requirements, scale estimates, and an API defined, you can now sketch a high-level architecture. At this stage, the goal is to identify the major components and the data flow between them, not to specify every detail.
For most systems, the high-level diagram includes:
- The client and what protocol it uses to communicate
- The load balancer or API gateway that fronts the service
- The application servers that handle requests
- The primary data store
- Any caching layer, message queue, or CDN that the requirements have already justified
The constraint at this step is that every component on the diagram should have been earned by a requirement or an estimation result. If a cache appears because "we might need performance," it has not been earned. If a cache appears because the estimation showed 50,000 reads per second against a single PostgreSQL instance, it has been earned.
Step 5: Where does the complexity actually live?
The high-level design is a skeleton. The deep dive is where the real design decisions happen. At this step, pick the one or two components where the interesting trade-offs are, and work through them in detail.
For a URL shortener, the interesting component is probably the slug generation and storage. The questions worth going deep on:
- How is the slug generated? Random characters risk collisions. A counter is sequential but requires a coordinated ID generator across distributed writes. A hash of the URL is deterministic but may collide and requires a collision resolution strategy.
- How is the mapping stored? Key-value lookup by slug is simple. But if you need analytics, you need a separate data model for click events. If you need custom slugs, you need a uniqueness check before writing.
- How do you serve redirects fast? A redirect that hits the database on every request will bottleneck under load. A cache in front of the database absorbs reads, but invalidation is simple here: short URLs rarely change, so TTLs can be long.
The deep dive is where the system design becomes specific. The high-level diagram answers "what are the components?" The deep dive answers "how do the critical components actually work?"
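As one concrete option from the slug-generation list above, a counter-based generator encodes a sequential ID in base62. This sketch assumes the counter itself is coordinated elsewhere (the hard part in a distributed deployment); seven base62 characters cover roughly 3.5 trillion IDs:

```python
# Base62 encoding of a sequential counter, one of the slug-generation
# options discussed above. The counter source (e.g. per-shard ID ranges)
# is assumed to exist and is not shown here.

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))
```

The trade-off stated in the list applies directly: the output is collision-free by construction, but the slugs are guessable and the counter is a coordination point.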
Step 6: What breaks first under load?
After designing the happy path, identify the bottlenecks: the components that will fail or degrade first as load increases.
Common bottleneck patterns:
- The database write path. A single primary database with synchronous writes will eventually saturate. Identify the write rate from your estimation, check it against the database's write throughput, and decide whether you need write sharding or async replication before you hit the limit.
- The cache invalidation boundary. If the cache TTL is too long, users see stale data. If it is too short, cache hit rate drops and every request hits the database. The right TTL depends on how frequently the underlying data changes and how much staleness is acceptable.
- The single points of failure. Any component that exists as a single instance is a single point of failure. For a system that needs high availability, identify every single instance and decide whether it needs redundancy or whether the failure mode is acceptable.
Identifying bottlenecks at design time is the difference between a system that degrades gracefully and one that collapses in ways you did not anticipate.
Step 7: What are the trade-offs you made, and why?
Every design choice forecloses other choices. The last step is to make those trade-offs explicit, because they are what a technical reviewer or a future engineer maintaining the system needs to understand.
The trade-offs worth articulating:
- What consistency model did you choose, and what does that mean for users? If you chose eventual consistency for the redirect lookup, what is the window during which a newly shortened URL might return a 404?
- What availability target are you designing for, and what does it cost? A system with three replicas has a different failure profile and operational cost than a system with one.
- What are you not solving? Scope is a design decision. A URL shortener that does not support analytics, custom domains, or expiring links is simpler to build and operate. Naming what is out of scope is as important as naming what is in scope.
The designs that hold up in review are the ones where the author can explain why each major choice was made and what they gave up to make it. The designs that do not hold up are the ones where components exist without justification, or where trade-offs were made implicitly without being examined.
What does internalizing this sequence actually look like?
The sequence is not difficult to memorize. The difficulty is slowing down enough to follow it when the instinct is to jump to the architecture. The failure mode in both interview settings and real design sessions is the same: premature specificity. The diagram fills up before the requirements are clear, and the trade-offs get invented post-hoc to justify decisions that were made on intuition.
What I am trying to build is the habit of staying in requirements longer than feels comfortable, getting the estimation done before touching the architecture, and treating the API as a forcing function that reveals assumptions before they get buried in implementation.
The components are the easy part. Knowing which ones belong in this system, at this scale, with these trade-offs, is the actual skill.