Velocity Tracking
How we measure and interpret delivery speed to forecast reliably and spot trends early.
Overview
Velocity measures how many story points a team completes per sprint. It is one of the most useful and most misused metrics in agile delivery. Used correctly, it is a forecasting tool that tells you how much work a team is likely to complete in a given time window. Used incorrectly, it becomes a pressure mechanism that incentivises gaming, inflated estimates, and shortcuts that accumulate as technical debt.
For using velocity to plan sprint capacity, see Sprint Planning Discipline. For managing the backlog that velocity helps forecast, see Backlog Refinement Practices.
Why It Matters
Velocity is the foundation of reliable forecasting. Without a consistent measure of delivery rate, every roadmap date is a guess. With velocity data from the last 6–8 sprints, you can make probabilistic commitments based on evidence.
Velocity trends surface problems before they become crises. A consistent downward trend in velocity — not explained by capacity changes — is a signal. It may indicate accumulating technical debt, increasing defect rate, growing team friction, or scope creep. Catching the trend early creates space to investigate; missing it creates a surprise.
Predictability matters more than speed. A team that consistently delivers 60 points per sprint is more valuable to plan around than a team that delivers 40 points one sprint and 90 the next. Predictability enables stakeholders to make reliable commitments to customers and other teams.
Velocity is a team metric, not an individual one. Individual developer velocity is not a useful or appropriate measure. Velocity reflects team capacity, process health, story quality, and environmental factors — none of which are individual.
Measuring velocity creates accountability to the commitment. Teams that do not measure velocity tend to drift: individual developers deprioritise sprint stories for other work, meetings expand, and the sprint goal quietly erodes. Tracking velocity makes the commitment visible.
Standards & Best Practices
What counts toward velocity
Only completed stories count toward velocity. A story is complete when:
- All acceptance criteria are met
- It has passed QA or testing
- It meets the team's Definition of Done
A story that is 90% complete at the end of a sprint counts as zero. This rule is the most frequently argued and the most important to hold. Partial credit creates invisible carry-over, inflates apparent velocity, and obscures how much work is actually in flight.
Rolling average velocity
Do not use a single sprint's velocity for planning. Use a rolling average of the last 3–4 completed sprints. A rolling average smooths the natural variation from sprint to sprint (holidays, incidents, unusually complex stories) and gives a stable baseline for forecasting.
Rolling velocity (3-sprint) = (Sprint N + Sprint N-1 + Sprint N-2) / 3
When capacity changes significantly (team size, extended leave), reset the rolling average from the new state.
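
The rolling average can be sketched in a few lines of Python. The function name `rolling_velocity` and its `window` parameter are illustrative assumptions, not part of any particular tool; the sample figures are the delivered points from the tracking table in Tools & Templates.

```python
def rolling_velocity(delivered: list[float], window: int = 3) -> float:
    """Average delivered points over the most recent `window` completed sprints.

    `delivered` is ordered oldest to newest. Hypothetical helper for illustration.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(delivered) < window:
        raise ValueError("not enough completed sprints for this window")
    recent = delivered[-window:]
    return sum(recent) / window


# Delivered points for four sprints, oldest first.
history = [58, 62, 55, 60]
print(rolling_velocity(history))  # averages 62, 55 and 60
```

One way to model a reset after a capacity change is to pass only the sprints completed since the change, and to hold off on forecasting until enough of them exist to fill the window.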
Interpreting velocity changes
| Change | Likely causes | Action |
|---|---|---|
| Gradual decline over 3+ sprints | Technical debt accumulation, increasing defect rate, team friction, scope creep on stories | Investigate; add a retrospective topic |
| Sharp single-sprint drop | Unplanned leave, incident response, vacation period | Note and exclude from rolling average if anomalous |
| Gradual increase | Team ramp-up, improved refinement quality, better story sizing | Adjust rolling average upward; validate with the team |
| High variance sprint-to-sprint | Inconsistent story sizing, mid-sprint scope changes, incomplete refinement | Address root causes, not the symptom |
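
As a rough aid for spotting the first row of the table, the sketch below counts consecutive sprint-over-sprint drops at the end of the velocity history. The helper names and the threshold of three drops are assumptions (one possible reading of "3+ sprints"), and the check knows nothing about capacity changes, so a flagged trend is a prompt to investigate rather than a conclusion.

```python
def trailing_declines(velocities: list[float]) -> int:
    """Count consecutive sprint-over-sprint drops at the end of the history.

    `velocities` is ordered oldest to newest. Hypothetical helper.
    """
    drops = 0
    # Walk backwards through adjacent (previous, current) pairs.
    pairs = zip(reversed(velocities[:-1]), reversed(velocities[1:]))
    for previous, current in pairs:
        if current < previous:
            drops += 1
        else:
            break
    return drops


def needs_investigation(velocities: list[float], threshold: int = 3) -> bool:
    """Flag a gradual decline; the threshold of 3 drops is an assumption."""
    return trailing_declines(velocities) >= threshold


print(needs_investigation([70, 66, 63, 60]))  # three drops in a row
print(needs_investigation([60, 64, 61, 58]))  # only two drops so far
```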
Predictability ratio
Velocity alone does not tell you about consistency. Track the predictability ratio alongside velocity:
Predictability ratio = Points delivered / Points committed
A ratio of 0.85–1.0 is healthy. Consistently above 1.0 suggests under-commitment. Consistently below 0.85 suggests overcommitment, scope change, or stories arriving under-prepared.
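
A minimal sketch of the ratio and its interpretation follows. The function names are hypothetical, and "consistently" is interpreted here as "every sprint in the run passed in", which is an assumption; a team may prefer a looser rule such as "most of the last N sprints".

```python
def predictability_ratio(delivered: float, committed: float) -> float:
    """Points delivered divided by points committed for one sprint."""
    if committed <= 0:
        raise ValueError("committed points must be positive")
    return delivered / committed


def assess(ratios: list[float]) -> str:
    """Label a run of sprint ratios using the 0.85 and 1.0 thresholds.

    Assumption: 'consistently' means every sprint in the run.
    """
    if not ratios:
        raise ValueError("need at least one sprint")
    if all(r > 1.0 for r in ratios):
        return "possible under-commitment"
    if all(r < 0.85 for r in ratios):
        return "possible overcommitment, scope change, or under-prepared stories"
    if all(0.85 <= r <= 1.0 for r in ratios):
        return "healthy"
    return "mixed"


recent = [predictability_ratio(d, c) for d, c in [(57, 60), (60, 60), (54, 62)]]
print(assess(recent))
```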
Signs velocity is being gamed
- Story point estimates increase without a corresponding change in story complexity
- Stories are consistently split at end-of-sprint to claim points for partial work
- "Done" criteria are informally relaxed to get stories across the line
- Velocity looks healthy but defect rates are rising
When you see these signs, the underlying pressure is usually the problem — not the people. Explicit pressure to hit velocity targets produces gaming. Velocity should be descriptive (what did we do?), not prescriptive (what must we do?).
How to Implement
Sprint-by-sprint tracking
After each sprint review:
- Record completed points (stories that fully met DoD)
- Record committed points (stories in the sprint at planning)
- Calculate predictability ratio
- Update rolling average
- Note any anomalies that should be excluded from trend analysis
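
The steps above could be captured in a small sprint log like the sketch below. `SprintRecord`, its fields, and `rolling_average` are hypothetical names chosen for illustration; the sample data mirrors the velocity tracking table in Tools & Templates. Whether a given sprint is marked anomalous remains a team judgment that the code only records.

```python
from dataclasses import dataclass


@dataclass
class SprintRecord:
    name: str
    committed: int           # points in the sprint at planning
    delivered: int           # points from stories that fully met DoD
    anomalous: bool = False  # True excludes the sprint from trend analysis
    note: str = ""

    @property
    def predictability(self) -> float:
        return self.delivered / self.committed


def rolling_average(records: list[SprintRecord], window: int = 3) -> float:
    """Average delivered points over the last `window` non-anomalous sprints."""
    usable = [r.delivered for r in records if not r.anomalous][-window:]
    if not usable:
        raise ValueError("no usable sprints recorded")
    return sum(usable) / len(usable)


log = [
    SprintRecord("Sprint 1", committed=65, delivered=58),
    SprintRecord("Sprint 2", committed=60, delivered=62),
    SprintRecord("Sprint 3", committed=65, delivered=55, note="1 developer on leave"),
    SprintRecord("Sprint 4", committed=60, delivered=60),
]

for record in log:
    print(f"{record.name}: {record.predictability:.0%} {record.note}")
print("Rolling average:", rolling_average(log))
```

If the team decided that Sprint 3 was anomalous, setting `anomalous=True` on that record would make the rolling average draw on Sprints 1, 2, and 4 instead.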
Using velocity for release forecasting
To estimate when a set of features will be complete:
- Total the estimated story points for all remaining features
- Divide by the rolling average velocity
- The result is the number of sprints needed
- Add a buffer of 10–20% for estimation error and unplanned work
Example:
- Remaining backlog: 240 points
- Rolling average velocity: 60 points/sprint
- Estimated sprints: 240 / 60 = 4 sprints
- With 15% buffer: ~4.6 sprints → plan for 5 sprints
This gives a date range, not a single date. Communicate it as such: "Based on current velocity, we expect to complete this in sprint 7–8."
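
The forecast arithmetic can be sketched as follows. `forecast_sprints` is a hypothetical helper, and rounding the buffered figure up to a whole sprint is an assumption that matches the "plan for 5 sprints" step in the example.

```python
import math


def forecast_sprints(
    remaining_points: float, velocity: float, buffer: float = 0.15
) -> tuple[float, float, int]:
    """Return (raw sprints, buffered sprints, whole sprints to plan for)."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    raw = remaining_points / velocity
    buffered = raw * (1 + buffer)
    # Round up: a partly used sprint still has to be planned as a sprint.
    return raw, buffered, math.ceil(buffered)


raw, buffered, planned = forecast_sprints(240, 60, buffer=0.15)
print(f"{raw:.1f} sprints unbuffered, {buffered:.1f} with buffer, plan for {planned}")
```

Reporting both the unbuffered and the planned figure keeps the forecast a range rather than a single number.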
Velocity review in retrospective
At the sprint retrospective, spend 5 minutes on velocity health:
- Did we deliver what we committed?
- Were there any stories we carried over? Why?
- Did any sprint events (incidents, scope changes) impact velocity in a way that distorts the trend?
- Is the rolling average still a reliable baseline for next sprint's planning?
Tools & Templates
Velocity tracking table
## Team Velocity — [Team name]
| Sprint | Committed | Delivered | Predictability | Notes |
| ------------------------ | --------- | --------- | -------------- | -------------------- |
| Sprint 1 | 65 | 58 | 89% | — |
| Sprint 2 | 60 | 62 | 103% | — |
| Sprint 3 | 65 | 55 | 85% | 1 developer on leave |
| Sprint 4 | 60 | 60 | 100% | — |
| **3-sprint rolling avg** | | **59** | **96%** | |
Forecast template
## Release Forecast — [Feature or milestone name]
**Remaining backlog:** [X] points
**Rolling average velocity:** [Y] points/sprint
**Estimated sprints remaining:** [X/Y] = [N] sprints
**Buffer (15%):** +[0.15N] sprints
**Forecasted completion:** Sprint [N+buffer] ([approximate date])
**Assumptions:**
- No significant team capacity changes
- Backlog estimate is within ±20% of actual scope
- Current velocity trend holds
Common Pitfalls
Using velocity as a performance target. "The team must hit 70 points this sprint" produces exactly the wrong outcomes: inflated estimates, gaming, and shortcuts. Velocity describes what happened; it does not define what must happen.
Planning from theoretical capacity instead of velocity. A team with 200 developer-hours available does not automatically have 200 developer-hours of output. Meetings, interruptions, environment setup, and code review time are real. Historical velocity already accounts for all of this; theoretical capacity does not.
Comparing velocity across teams. Story points are relative within a team, not absolute across teams. A team that averages 100 points per sprint is not necessarily doing more work than one that averages 50. The scales are different. Cross-team velocity comparisons are meaningless and create counterproductive pressure.
Including incomplete stories in velocity. The most corrosive habit in velocity tracking. Once teams learn that "almost done" counts, stories are routinely declared "almost done" and the unfinished work drops out of retrospective accountability. The rule is binary: done or not done.
Ignoring the trend, reporting the number. A team reporting "we delivered 62 points this sprint" without flagging that this is the fourth consecutive decline is using velocity as a reporting metric rather than a diagnostic one. The trend is the signal.
Not adjusting the baseline for team changes. A team that adds two developers should expect higher velocity after a ramp-up period. A team that loses a senior developer should expect lower velocity. Using the old rolling average for forecasting during a transition produces systematically wrong estimates.