API Design Standards
Designing APIs that are intentional, consistent, and evolvable — the contracts that outlive their services.
Overview
An API is a promise. Once it has consumers, every shape of every endpoint is a contract the system agreed to honour. The cost of breaking that contract is not absorbed by the team that ships the change; it is absorbed by every team that has written code against the old shape. APIs that are designed carefully are a multiplier on the organisation's velocity. APIs that are designed carelessly are a tax on it — paid indefinitely.
This page is about the design of APIs: the surface a team decides to expose, the conventions it follows, and the contract it commits to. It is not about implementation detail (which framework, which serialisation format) except where those details are indistinguishable from design.
For documenting the decision to adopt a particular API style, see Architecture Decisions. For evaluating API designs before they ship, see Design Review Frameworks.
Why It Matters
APIs last longer than the teams that write them. A service gets rewritten, split, or replaced; its API often does not, because the consumers are still there. The team designing an API today is committing the organisation to a surface it will live with for years.
A bad API forces every consumer to invent a workaround. Inconsistent naming, confusing pagination, unclear error semantics — each consumer works around these in its own way. The cost is not on the API team; it is diffuse, invisible, and enormous.
Breaking changes compound. A single breaking change coordinated across three teams is expensive. A year of continuous breaking changes becomes unmanageable — teams start pinning to old versions, refusing to upgrade, or forking client logic. The API is still there but no longer evolves.
The cheapest time to get an API right is before it has consumers. After the first real integration, every change is a negotiation. Design review is disproportionately valuable for APIs because the design mistakes are disproportionately expensive to fix later.
Standards & Best Practices
Pick a style deliberately and commit to it
The organisation should have a default API style (REST, GraphQL, gRPC/RPC, or event-based), chosen on the merits of what the systems actually need — not chosen per-service based on who implemented it. Mixing styles across services that speak to each other multiplies the cognitive cost of integrating any two of them.
There are valid reasons to diverge from the default: a bulk data pipe may want a different protocol than an interactive API; a public-facing API may have different requirements than an internal one. But divergences should be deliberate and documented, not accidental.
Design the API before writing it
The implementation should follow from an explicit API design — not emerge as a side-effect of whatever shape is convenient for the current code. An API shape that mirrors the database schema, the internal model, or the framework idiom is often wrong for consumers. Design from the consumer's side first: what does a client need to do? What are the smallest, most useful operations that support those needs?
A design doc for a non-trivial API is one of the highest-leverage artifacts a team can produce. It forces the author to name the contract before committing to it, and gives reviewers the opportunity to catch problems while they are still cheap.
Names are the surface
Consumers read names before they read documentation. Names are also the part of the API least likely to change once it has consumers.
Standards:
- Names describe what a resource is or what an operation does, not how it is implemented
- Consistency beats cleverness — the same concept uses the same word across the whole API
- Plurals, casing, and ordering are standardised across the surface (e.g. /orders/{id}/line-items, not /Order/{id}/lineItem)
- Boolean fields are named in the affirmative (is_active, not disabled) to avoid negation confusion
- Timestamps have explicit units and timezones in the name (created_at_utc) — implicit "server time" causes bugs that are difficult to debug
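As a sketch only, the naming conventions above can be made mechanical rather than left to memory. The helper and payload names here are hypothetical, not part of any standard library:

```python
# Hypothetical helper enforcing the naming conventions above:
# kebab-case resource segments, affirmative booleans, explicit-UTC timestamps.
from datetime import datetime, timezone

def resource_path(*segments):
    """Join path segments, normalising resource names to lowercase kebab-case."""
    cleaned = []
    for seg in segments:
        if not seg.startswith("{"):          # leave {id} placeholders untouched
            seg = seg.replace("_", "-").lower()
        cleaned.append(seg)
    return "/" + "/".join(cleaned)

order_lines = resource_path("orders", "{id}", "line_items")
# yields "/orders/{id}/line-items"

payload = {
    "is_active": True,                                         # affirmative boolean
    "created_at_utc": datetime.now(timezone.utc).isoformat(),  # timezone in name and value
}
```

A helper like this is most useful in tests that lint the whole surface, so a divergent name fails the build instead of shipping.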
Idempotency is a first-class concern
An idempotent operation produces the same result whether it is called once or many times. Without idempotency, clients cannot safely retry — a dropped response becomes a distributed systems problem for the consumer.
Standards:
- Read operations are naturally idempotent; preserve that by not introducing hidden side effects
- Write operations that should be idempotent carry an explicit client-generated idempotency key, and the server deduplicates
- Operations that genuinely cannot be idempotent (e.g. "append to log") have their semantics documented prominently
A non-idempotent API forces every client to implement its own deduplication, usually badly.
Versioning is a commitment to compatibility, not a toggle
API versions exist to allow the API to evolve without breaking consumers. The policy is:
- Additions (new endpoints, new optional fields) are backward compatible and do not require a new version
- Removals, renames, or semantic changes are breaking and require a new version
- A new version does not delete the old version immediately — there is a sunset window during which both exist
- The sunset window is long enough for realistic client upgrades (often months, sometimes longer)
Versioning in the URL (/v1/...) is explicit and discoverable; header-based versioning is also valid but puts the version off-screen from the reader. Pick one and apply it consistently.
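A sunset window means both versions are routable at once. As a toy sketch (hypothetical routes and payloads), the deprecated version can advertise its own end date — the Sunset HTTP header is one conventional way to do so:

```python
# Both versions coexist during the sunset window; /v1 announces its sunset date.
ROUTES = {
    "/v1/orders": lambda: ({"orders": []}, {"Sunset": "2026-06-30"}),
    "/v2/orders": lambda: ({"orders": [], "next_cursor": None}, {}),
}

def handle(path):
    body, headers = ROUTES[path]()           # dispatch to whichever version was asked for
    return body, headers

_, v1_headers = handle("/v1/orders")
assert "Sunset" in v1_headers                # old version tells clients it is going away
```

Advertising the date in the response itself means clients that were never emailed still find out.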
Error responses are part of the API
Errors are as much part of the API as the happy path, and should be designed with the same care. Consumers branch on errors; the error contract determines what kinds of recovery are possible.
Standards:
- Errors use a consistent structure across the whole API (code, message, details, trace identifier)
- Error codes are stable machine-readable strings; error messages are human-readable and may change
- HTTP status codes are used for their defined meanings (4xx = client error, 5xx = server error) — not repurposed
- The error payload includes enough context for a user-facing system to construct a helpful message, without leaking internal details
- Validation errors include field-level detail so a UI can point to the wrong input
An API that returns 500 with {"error": "failed"} for every kind of failure has given the consumer nothing to act on.
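A consistent error envelope along the lines described above might look like this. The field names (code, message, details, trace_id) are illustrative, not a standard — what matters is that every endpoint uses the same ones:

```python
# One error shape for the whole API: stable machine-readable code,
# advisory human message, field-level detail, and a trace id for support.
import uuid

def error_response(code, message, field_errors=None):
    return {
        "error": {
            "code": code,                    # stable contract; clients branch on this
            "message": message,              # human-readable; may change freely
            "details": field_errors or [],   # field-level detail for UIs
            "trace_id": str(uuid.uuid4()),   # correlates the failure with server logs
        }
    }

resp = error_response(
    "validation_failed",
    "One or more fields are invalid.",
    [{"field": "email", "code": "invalid_format"}],
)
assert resp["error"]["code"] == "validation_failed"
```

With this shape, a UI can highlight the email field and a support engineer can look up the trace id — neither requires parsing the message.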
Pagination, filtering, and sorting are designed up front
Any endpoint that returns a collection eventually returns a large one. Adding pagination later is a breaking change; designing it in at the start is nearly free.
Standards:
- Every collection endpoint is paginated from day one, with a default page size
- Pagination uses cursors for large or changing datasets, offsets only for small stable ones
- Sort and filter parameters are named and typed consistently across endpoints
- The response includes a signal for "is there more?" that does not require the client to do arithmetic
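Cursor pagination with an explicit "is there more?" signal can be sketched like this. The cursor here is just a base64-encoded offset over a toy in-memory list — a real cursor would encode a stable position in the underlying dataset, not an index:

```python
# Cursor pagination over a toy dataset: opaque cursor, default page size,
# and an explicit has_more flag so the client never does arithmetic.
import base64

ITEMS = [{"id": i} for i in range(1, 8)]     # seven records

def list_items(cursor=None, page_size=3):
    start = int(base64.b64decode(cursor)) if cursor else 0
    page = ITEMS[start:start + page_size]
    has_more = start + page_size < len(ITEMS)
    next_cursor = (
        base64.b64encode(str(start + page_size).encode()).decode()
        if has_more else None
    )
    return {"items": page, "next_cursor": next_cursor, "has_more": has_more}

page1 = list_items()
page2 = list_items(page1["next_cursor"])     # client treats the cursor as opaque
```

Because the cursor is opaque, the server is free to change how it paginates internally without breaking any client.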
Design for backward compatibility by default
Evolving an API without breaking consumers is a discipline. The baseline rules:
- Never rename a field in-place — add a new field, deprecate the old one, remove it after the sunset window
- Never change the type of a field — add a new field of the new type
- Never change what a field means in a way existing clients would misinterpret — version the endpoint
- Never make an optional field required — it will break every client that relied on the default
- Removing a field is a breaking change, period
These rules feel restrictive the first time a team encounters them. They are what allows the API to keep evolving without coordinating every change with every consumer.
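The add-deprecate-remove sequence for a rename looks like this in a serialiser. The field names are invented for illustration (customer_name being replaced by display_name); during the sunset window both fields are served with the same value:

```python
# Renaming customer_name -> display_name without a breaking change:
# serve both during the sunset window, drop the old one only afterwards.
def serialize_customer(customer):
    return {
        "id": customer["id"],
        "customer_name": customer["name"],   # deprecated; removed after sunset
        "display_name": customer["name"],    # new field, identical value
    }

out = serialize_customer({"id": "c_1", "name": "Ada"})
assert out["customer_name"] == out["display_name"]   # old clients keep working
```

The duplication is deliberate and temporary; it is the price of never forcing every consumer to upgrade on the same day.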
Time, pagination, and enumerations — the common failure modes
Three categories of field cause disproportionate pain when they go wrong:
- Time — Always include the timezone explicitly. Always use a single, parsable format (ISO-8601 with timezone). Ambiguous timestamps are the source of bugs that span continents.
- Pagination — Never change the pagination contract without a new version. Consumers hardcode their expectations.
- Enumerations — Adding a new value to an enum can break clients that exhaustively switch on it. Document up front whether a field is a closed set (never changes) or an open set (may add values), and design consumers accordingly.
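On the consumer side, an open-set enum demands a fallback branch rather than an exhaustive switch. A minimal sketch, with a made-up status field:

```python
# Consumer-side handling of an open-set enum: unknown values map to a
# fallback instead of crashing when the server adds a new one.
KNOWN_STATUSES = {"pending", "shipped", "delivered"}

def classify(status):
    return status if status in KNOWN_STATUSES else "unknown"

assert classify("shipped") == "shipped"
assert classify("returned") == "unknown"     # new server value; old client survives
```

If the field is documented as a closed set instead, consumers may switch exhaustively — which is exactly why the open/closed decision must be written down before the first consumer exists.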
Deprecation is a process, not an announcement
An endpoint that is "deprecated" but still used by half the consumers is not actually deprecated — it is still in production. The real process:
- Deprecation is announced with a sunset date
- Usage of the deprecated endpoint is measurable (logged with a flag, instrumented for call counts)
- Consumers are actively notified and migrated, not passively expected to notice
- Before the sunset date, usage is verified to be zero
- Only then is the endpoint actually removed
Deprecations that are not instrumented and not driven are deprecations that never complete.
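"Usage is measurable" can be as simple as a per-consumer counter attached to the deprecated path. This sketch uses invented names and an in-memory Counter where a real system would emit metrics:

```python
# Instrumenting a deprecated endpoint: count calls per consumer so the
# migration can be driven, and removal can be verified rather than hoped.
from collections import Counter

deprecated_calls = Counter()

def record_deprecated_use(endpoint, consumer):
    deprecated_calls[(endpoint, consumer)] += 1

record_deprecated_use("/v1/orders", "billing-service")
record_deprecated_use("/v1/orders", "billing-service")

# Before the sunset date: who is still calling, and how often?
assert deprecated_calls[("/v1/orders", "billing-service")] == 2
remaining = sum(deprecated_calls.values())
assert remaining > 0                         # not yet safe to remove the endpoint
```

The removal criterion becomes a query, not a guess: the endpoint goes only when the remaining count has been zero for long enough to trust.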
Public APIs and internal APIs live by different rules — but not different principles
A public API has less tolerance for breaking changes, longer sunset windows, and more formal documentation. An internal API can evolve more freely with willing consumers. The principles are the same: design deliberately, version changes, design for compatibility. The budget for compatibility is what differs — public APIs have less room to change, internal ones have more.
The distinction fails when "internal" APIs acquire consumers outside the original team and become de facto public without being promoted. Treat this possibility seriously — an internal API that multiple teams depend on deserves the discipline of a public one.
How to Implement
Design checklist for a new endpoint
When adding an endpoint, the design should answer:
- What resource does this act on? What verb describes the action?
- What is the contract: inputs, outputs, errors?
- Is it idempotent? If yes, how is that enforced? If no, is that documented?
- What is the authentication/authorisation requirement?
- How is it paginated or bounded?
- What does failure look like — what specific errors can the client receive?
- What observability is in place — request ID, latency, error rates?
- Does it fit the conventions of the existing API, or does it diverge? If it diverges, why?
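One lightweight way to make the checklist above concrete is to capture the answers as a design record before any code exists. Everything in this sketch is hypothetical (the endpoint, the error codes, the record shape):

```python
# A design record for a new endpoint: the checklist answered as data,
# reviewable before a line of implementation is written.
ENDPOINT_DESIGN = {
    "path": "/orders/{id}/refunds",
    "method": "POST",
    "idempotent": True,                      # enforced via a client idempotency key
    "auth": "caller must own the order",
    "errors": ["order_not_found", "refund_window_closed", "already_refunded"],
    "pagination": None,                      # single-resource response; bounded
    "diverges_from_conventions": False,
}

required = {"path", "method", "idempotent", "auth", "errors"}
assert required <= ENDPOINT_DESIGN.keys()    # every checklist question has an answer
```

A record like this can live next to the code and seed the generated documentation, so the design and the docs start from the same source.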
Evolution checklist for an existing endpoint
Before shipping a change to an existing endpoint:
- Is this change backward-compatible? (Added optional field = yes; renamed field = no)
- If breaking, what is the new version and what is the sunset plan for the old one?
- Are known consumers informed?
- Is there an automated test that would have caught this as a break?
- Is the change documented before it ships, not after?
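The "automated test that would have caught this" can be a baseline comparison: record the expected response shape, and fail the build when a field disappears or changes type. A minimal sketch with an invented baseline:

```python
# Compatibility test sketch: a recorded baseline of field names and types;
# removed or retyped fields fail, added fields are allowed.
BASELINE = {"id": str, "amount": int, "status": str}

def is_backward_compatible(new_example):
    return all(
        field in new_example and isinstance(new_example[field], typ)
        for field, typ in BASELINE.items()
    )

assert is_backward_compatible({"id": "o_1", "amount": 5, "status": "paid", "note": ""})
assert not is_backward_compatible({"id": "o_1", "status": "paid"})   # dropped field
```

Note the asymmetry, matching the policy above: additions pass, removals and type changes fail.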
Documentation is part of the API
An API with excellent documentation is the one that gets adopted correctly; an API with thin documentation is the one every consumer integrates slightly wrong. Documentation standards:
- Every endpoint has a description, not just a signature
- Every field has a description explaining what it is (semantic), not just what its type is (syntactic)
- Every error has a description of what the client should do in response
- Examples show realistic request/response pairs, not trivial ones
- Changelog records every change, not just breaking ones
Generated documentation from a schema (OpenAPI, Protobuf, GraphQL SDL) is vastly preferable to hand-maintained documentation, because hand-maintained documentation drifts.
Common Pitfalls
The "let's add a field" trap. Adding one field to one endpoint seems cheap. Over two years, a hundred fields get added, each reasonable in isolation, and the endpoint becomes an incoherent collection of flags and options that no one designed. The cost is paid by every future consumer trying to understand what the endpoint is.
Chatty endpoints forcing N+1 calls. An endpoint that returns a list of resources without any of their related data, forcing the client to make one call per resource. This is almost always a design mistake — either include the common related data, or offer a field-selection mechanism.
Leaking internal structure. Field names taken directly from database columns. Error messages that contain stack traces. Enums whose values are internal codes. These are all signals that the API is a projection of the implementation, not a designed surface. Consumers should not need to understand the internal model to use the API.
The data envelope nobody asked for. Wrapping every response in { "data": ... } for no operational reason. It adds a layer of structure that every consumer has to unwrap, with no payoff. If the envelope carries metadata (pagination, rate-limit info), it earns its place; if it is empty ceremony, remove it.
Error messages as the API. Consumers parsing error messages to decide what to do. Once this happens, the error message becomes part of the contract and cannot be changed. Prevent it: expose a machine-readable error code, and make clear that the message is advisory.
Versioning by default. Creating a new version for every change, under the theory that this is safer. In practice it produces many near-identical versions, consumers stuck on old ones forever, and deprecation processes that never complete. Version when the change is breaking; evolve in place when it is not.
"Internal" APIs that aren't. An endpoint labelled internal, used by three other teams, and changed without notice. Labelling does not change the operational reality that changes to this endpoint break consumers. Either make it properly internal (one caller, same team) or treat it with the discipline the consumer graph requires.
Never designing the API before implementing. The team builds the feature, the API emerges from the shape of the code, and by the time anyone looks at the API surface, it is already committed. Design first, even for the internal case. It costs less than one review cycle and prevents an entire class of problem.