Documentation Standards
What to document, how to write it, and how to keep it accurate — applied to READMEs, docstrings, and inline comments.
Overview
Code documentation exists at three levels: the repository level (READMEs, architecture docs), the interface level (docstrings, API comments), and the inline level (comments inside function bodies). Each level serves a different reader with a different question. A repository README answers "what is this and how do I run it?" A docstring answers "what does this function do and how do I call it correctly?" An inline comment answers "why does this specific line of code exist?"
Conflating these levels produces documentation that is too verbose where brevity is needed and too sparse where context is crucial. The goal is not more documentation — it is the right documentation, in the right place, for the right question.
For review standards that include documentation quality gates, see Code Review. For inline comment conventions as applied to code smells, see Code Smells.
Why It Matters
Undocumented codebases produce knowledge monopolies. When critical context exists only in the heads of the engineers who built the system, every team transition, onboarding, or incident becomes slower because the context is not transferable. Documentation is how institutional knowledge survives individual departures.
Wrong documentation is worse than no documentation. A README that describes a setup process that no longer works, or a docstring that incorrectly describes the return type, actively misleads. A reader with no documentation will search for the answer; a reader with wrong documentation will follow it into a mistake. Documentation quality must be maintained with the same discipline as code correctness.
Comments that explain "what" add noise; comments that explain "why" add value. Well-named identifiers already describe what the code does. A comment that restates what is already clear from the code is redundant noise that must be maintained. A comment that explains why a non-obvious decision was made — a constraint from an external system, a workaround for a known bug, an invariant that is not obvious from types — is the kind of context that saves hours.
Documentation is part of the interface. A function without a docstring requires its caller to read the implementation to understand the contract. A function with a clear docstring communicates the contract at the call site. Documentation reduces the cost of using an interface correctly.
Standards & Best Practices
READMEs: answer the first four questions a new engineer will have
A repository README is the first document a new engineer reads. Its job is to make the next 30 minutes productive.
Required sections for any production service or library:
- What is this? — One paragraph. What does this service/library do, and who uses it?
- How do I run it? — The minimum commands to get a working development environment
- How do I run the tests? — One command that runs the test suite
- How do I deploy it? — Or a link to where deployment documentation lives
Additional sections when relevant:
- Architecture overview (or link to architecture doc)
- Key dependencies and why they were chosen
- Known limitations or gotchas
- How to contribute (for open or internal libraries)
A README that answers more than these questions without linking to appropriate sub-documents is a README that will drift. Keep the README focused on the new-engineer experience; deeper documentation goes in dedicated files.
Docstrings: document the contract, not the implementation
A docstring describes the interface of a function, class, or module — the contract that callers rely on. It does not describe how the function works internally; that is what the code is for.
What belongs in a docstring:
- What the function does (in terms of input → output, not implementation steps)
- The meaning, type, and expected range of each parameter
- What the function returns, and what values it can return
- What exceptions or errors it raises, and under what conditions
- Any important side effects (mutation, I/O, network calls)
- Usage examples for non-obvious interfaces
What does not belong in a docstring:
- Implementation details that are visible in the code
- History ("added in version 2.3 when we refactored X")
- Comments about the author
- Prose that restates the function name
Docstring requirements by context:
- Public APIs (functions, classes, and methods that are called from outside the module) — always documented
- Non-obvious internal functions — documented when the contract or usage is not inferrable from the name and types
- Trivial internal helpers — may be undocumented if the name and types are self-explanatory
When in doubt: if a new engineer reading this function for the first time would need to read the implementation to understand how to call it correctly, the function needs a docstring.
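As a minimal sketch of a docstring that documents the contract rather than the implementation, consider the lookup function below. The names get_user, User, UserNotFoundError, and the in-memory _USERS mapping are hypothetical and exist only for illustration; the Args/Returns/Raises layout is one common convention, not a requirement of this standard.

```python
from dataclasses import dataclass
from typing import Optional


class UserNotFoundError(LookupError):
    """Raised when no user matches the requested ID."""


@dataclass(frozen=True)
class User:
    user_id: int
    email: str


# Hypothetical in-memory store standing in for a real database.
_USERS = {1: User(user_id=1, email="ada@example.com")}


def get_user(user_id: int, *, allow_missing: bool = False) -> Optional[User]:
    """Return the user with the given ID.

    Args:
        user_id: Positive integer ID assigned at account creation.
        allow_missing: If True, return None for an unknown ID instead
            of raising.

    Returns:
        The matching User, or None when the ID is unknown and
        allow_missing is True.

    Raises:
        ValueError: If user_id is not positive.
        UserNotFoundError: If the ID is unknown and allow_missing is False.

    Side effects:
        None. This sketch only reads from an in-memory mapping.
    """
    if user_id <= 0:
        raise ValueError(f"user_id must be positive, got {user_id}")
    user = _USERS.get(user_id)
    if user is None and not allow_missing:
        raise UserNotFoundError(user_id)
    return user
```

A caller can tell from the docstring alone which user is returned, what happens when the user does not exist, and which errors to handle, without opening the function body.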
Inline comments: explain why, not what
An inline comment adds value only when it explains something that cannot be expressed in the code itself: a non-obvious constraint, a workaround for a known issue, a business rule that is not derivable from the code structure.
Good reasons for an inline comment:
- A workaround for a known external bug (# Rate limit on the external API is undocumented; empirically 100 req/s)
- A business rule that is not obvious from the code (# Free-tier users get 5 seats; contract establishes this at onboarding, not at creation time)
- A non-obvious performance or correctness constraint (# Must batch here; the upstream system rejects calls with > 500 items)
- A to-do with context (# TODO: remove after SDK v4 is stable in all regions)
Bad reasons for an inline comment:
- Restating what the code does (i += 1 # increment i)
- Describing what a well-named function does (# calculate the total price)
- Explaining standard language idioms to readers assumed to know the language
- Leaving commented-out code without explanation
The bar for an inline comment is: if this comment were removed, would a future engineer be likely to misunderstand, break, or delete this code? If yes, keep it. If no, remove it.
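The contrast can be sketched with the batching constraint used as an example earlier in this section. The function names and the MAX_ITEMS_PER_CALL constant are hypothetical; the 500-item limit is taken from that example and is not a property of any real system.

```python
from typing import Iterator, Sequence

# Hypothetical limit, taken from the batching example in this section.
MAX_ITEMS_PER_CALL = 500


def batched(items: Sequence[str], size: int = MAX_ITEMS_PER_CALL) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of items, each at most size long."""
    # Must batch here; the upstream system rejects calls with > 500 items.
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batched_noisy(items, size=500):
    # Loop over the items in steps of size.  (restates the code)
    for start in range(0, len(items), size):
        # Yield a slice of the items.  (restates the code)
        yield items[start:start + size]
```

Both functions behave identically. Only the first tells a future engineer why the loop must not be replaced with a single call, which is the comment that passes the removal test above.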
Naming as documentation
The most important documentation is often the name itself. A well-named function, variable, or class communicates its purpose without requiring any additional comment.
Standards:
- Function names describe what the function does in active terms (calculateTaxableAmount, not tax or processTax)
- Boolean names read as assertions (isEligibleForDiscount, not discount or checkEligibility)
- Collections use plural forms (users, orderIds)
- Variables that hold a computed or derived value encode the source (discountedPrice = originalPrice * (1 - discountRate))
- Generic names (data, result, temp, value) are almost always wrong at the scope of a function or module; they should be named for what they hold
If a good name makes a comment unnecessary, the name is better than the comment.
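A short sketch of these naming standards follows. It is written in Python, so the camelCase names above appear in snake_case, as Python convention expects. The tax and discount rules are invented purely for illustration.

```python
from decimal import Decimal


def calculate_taxable_amount(gross_amount: Decimal, exempt_amount: Decimal) -> Decimal:
    """Return the part of gross_amount that is subject to tax, never below zero."""
    return max(gross_amount - exempt_amount, Decimal("0"))


def is_eligible_for_discount(order_total: Decimal, minimum_total: Decimal) -> bool:
    """Return True when order_total meets the minimum for a discount."""
    return order_total >= minimum_total


# Plural name for a collection.
order_ids = [1001, 1002, 1003]

# Derived value named for its source, instead of a generic name like result.
original_price = Decimal("80.00")
discount_rate = Decimal("0.25")
discounted_price = original_price * (1 - discount_rate)
```

Naming the last value result or temp would force the reader back to the right-hand side of the assignment every time the variable is used.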
Documentation must be maintained
A document that does not accurately describe the current state of the system is a liability, not an asset. Maintenance standards:
- Documentation changes are part of the PR. A change to a public interface without a corresponding docstring update, or a change to a README setup step without verifying it still works, should be flagged in review.
- READMEs are verified on onboarding. The best time to catch a broken README is when a new engineer follows it and it does not work. New engineers should be asked to report README inaccuracies.
- Stale docstrings are bugs. A docstring that describes a parameter that was removed, or a return value that was changed, is incorrect code documentation — treat it as a bug.
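One way to make a stale docstring fail loudly, at least for its usage examples, is to run those examples as tests. The sketch below uses Python's standard doctest module; the apply_discount function is hypothetical, and this is one possible technique, not something this standard mandates.

```python
import doctest


def apply_discount(price_cents: int, discount_percent: int) -> int:
    """Return price_cents reduced by discount_percent, rounded down to a whole cent.

    >>> apply_discount(1000, 25)
    750
    >>> apply_discount(999, 10)
    899
    """
    return price_cents * (100 - discount_percent) // 100


if __name__ == "__main__":
    # Runs every docstring example in this module and reports mismatches.
    doctest.testmod()
```

If the rounding rule or the signature changes and the docstring is not updated, the examples stop matching and the check reports a failure. This covers only the examples, not the prose around them, so review is still needed.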
How to Implement
Documentation review checklist
At code review, for any PR that changes public interfaces or adds significant new functionality, confirm that:
- New public functions/classes/methods have docstrings
- Changed function signatures or return values are reflected in the docstring
- README is updated if setup, commands, or project structure changed
- Inline comments explain why, not what
- No commented-out code without explanation
Tooling for documentation quality
- Docstring linters (pydocstyle or Ruff's pydocstyle-derived rules for Python, eslint-plugin-jsdoc for JavaScript and TypeScript, the missing_docs lint for Rust) can enforce docstring presence on public APIs as part of the linting gate
- Spell check in CI catches typos in documentation before they reach the main branch
- Link checkers detect broken links in READMEs and docs on each build
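To illustrate the kind of check a docstring linter performs, the sketch below uses Python's standard ast module to list public functions and classes that have no docstring. The function name find_undocumented_public_api and the sample source are hypothetical, and "public" is simplified here to "name does not start with an underscore". Real linters apply many more rules; this is not a replacement for them.

```python
import ast


def find_undocumented_public_api(source: str) -> list[str]:
    """Return sorted names of public functions and classes lacking a docstring.

    A name counts as public when it does not start with an underscore.
    Methods and nested functions are included.
    """
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name.startswith("_"):
                continue  # private by convention; not part of the gate
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return sorted(missing)


SAMPLE = '''
def documented():
    """Return nothing useful."""

def undocumented():
    return 1

def _private_helper():
    return 2

class Widget:
    def render(self):
        return ""
'''

print(find_undocumented_public_api(SAMPLE))
```

In CI, a check of this kind would exit with a non-zero status when the returned list is not empty, so that missing docstrings block the merge in the same way a failing test does.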
Common Pitfalls
Docstrings that restate the function name. getUser() with a docstring that says "Gets the user" adds nothing. If the function name is clear, the docstring should explain what is not clear from the name: which user, under what conditions, what happens when the user does not exist.
READMEs written once and never updated. A README that was accurate at launch but describes a tool, command, or dependency that was removed two years ago is a trap for the next new engineer. Make README accuracy part of the PR review process for changes that affect the setup experience.
Comments as version history. A comment such as "# Added in sprint 42 to handle the case reported by customer X" carries information that belongs in git history and commit messages, not in the code. Code comments that reference tickets, sprints, or historical incidents rot as the codebase evolves.
Over-commenting trivial code. A codebase full of comments for every line is harder to read, not easier. Readers learn to ignore comments when they are always restating the obvious; the genuinely useful comments get ignored along with the noise. Write fewer, better comments.
Underdocumenting complex interfaces. The inverse failure: a function with eight parameters, several of which interact in non-obvious ways, with no docstring. The person who calls this function for the first time has to read the implementation in its entirety to use it correctly. Document the contract; spare every future caller the archaeology.