Rollout Strategies

Feature flags, canary releases, and phased rollouts for safe, controlled deployments.

Overview

A rollout strategy determines how and how fast a change reaches its full intended audience. Full immediate releases — deploying to everyone at once — are simple but carry maximum risk. Controlled rollout strategies trade some deployment simplicity for the ability to observe, measure, and reverse a change while it is still affecting a limited population. For significant changes, the trade-off is almost always worth making.

For the checklist of conditions that must be met before any rollout begins, see Launch Checklists. For assessing which rollout strategy matches the risk level of a release, see Risk Assessment Frameworks.


Why It Matters

Rollouts limit the blast radius of failures. A bug that affects 2% of users is an incident. A bug that affects 100% of users is a crisis. Phased rollouts give the engineering team time to detect problems before they reach full scale.

Controlled rollouts create natural experiments. Comparing users in the rollout cohort with users not yet in it gives the team a natural A/B comparison. This allows the team to confirm that key metrics are moving as expected before committing to full release.

Feature flags decouple deployment from release. Code can be deployed to production — tested, stable, in the codebase — without being active for any users. The feature is activated separately, on a schedule that makes sense for the business, not the deployment pipeline.

Kill switches are the fastest rollback path. Disabling a feature flag is faster, safer, and less disruptive than reverting a code deployment. For user-facing features, a feature flag kill switch should always exist.

Phased rollouts build internal confidence. Releasing to 5% of users first gives the team direct evidence that the feature behaves correctly in production before the full audience is affected. This evidence replaces the anxiety of a full release with an informed decision.


Standards & Best Practices

Rollout strategy selection

| Strategy | Best for | Risk level it handles |
| --- | --- | --- |
| Full release | Internal tools, back-end changes with no user impact, hotfixes | Low |
| Feature flags | User-facing features of any size | Low to High |
| Canary release | Infrastructure changes, performance-sensitive changes | Medium to High |
| Phased rollout | Significant user-facing features, major flows | High |
| Dark launch | Performance testing, infrastructure validation | Any |
| Beta program | New major features requiring real user feedback | High |

Feature flags

A feature flag is a configuration value that enables or disables a feature for a specific set of users without a code deployment. Feature flags allow:

  • Deploying code before it is released to users
  • Targeting specific user segments (internal users, beta users, specific geographies)
  • Gradual percentage-based rollout
  • Immediate kill switch without a deployment

Feature flag rules:

  • Every significant user-facing feature should be behind a flag at launch
  • Flags should have an owner and an expiry plan (flags are not permanent)
  • Flag names should be descriptive: checkout_v2_redesign, not feature_flag_3
  • Document flag purpose and expected lifetime in the flag configuration
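
The owner and expiry rules above are easier to enforce when they are checked by a script rather than remembered. The following is a minimal sketch, assuming flags are stored as records with a name, an owner, and a retirement date; `FlagRecord`, `overdue_flags`, and the sample owners and dates are hypothetical and not part of any particular flag service.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass
class FlagRecord:
    """Hypothetical flag metadata: who owns the flag and when it should be removed."""
    name: str
    owner: str
    retire_by: date


def overdue_flags(flags: Iterable[FlagRecord], today: date) -> List[FlagRecord]:
    """Return flags whose retirement date has passed, oldest first."""
    expired = [f for f in flags if f.retire_by < today]
    return sorted(expired, key=lambda f: f.retire_by)


# Illustrative data only.
flags = [
    FlagRecord("checkout_v2_redesign", "dana", date(2024, 3, 1)),
    FlagRecord("search_ranking_v3", "lee", date(2024, 9, 1)),
    FlagRecord("legacy_banner", "sam", date(2024, 1, 15)),
]

for flag in overdue_flags(flags, today=date(2024, 6, 1)):
    print(f"{flag.name} is past its retirement date ({flag.retire_by}); owner: {flag.owner}")
```

A report like this could run in CI or on a schedule so that overdue flags reach their owners without a manual audit.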

Flag states:

  • off — Nobody sees the feature
  • internal — Visible only to employees or specified internal accounts
  • beta — Visible to opted-in beta users
  • gradual — Visible to X% of users, increasing over time
  • on — Visible to all users
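
The five states can be sketched as a single evaluation function. This is an illustrative sketch, not the API of any specific flag service: `Flag`, `bucket`, `is_enabled`, and the `company.com` domain are assumptions. It also assumes each state targets only the audience listed above, and that the gradual state assigns users to stable buckets by hashing the flag name together with the user ID, so a user who sees the feature at 5% still sees it at 25%.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Flag:
    name: str
    state: str = "off"                     # off | internal | beta | gradual | on
    rollout_percent: int = 0               # only used when state == "gradual"
    internal_domain: str = "company.com"   # assumption: internal users share an email domain
    beta_users: Set[str] = field(default_factory=set)


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) for this flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: Flag, user_id: str, email: str) -> bool:
    """Decide whether this user sees the feature, based on the flag state."""
    if flag.state == "off":
        return False
    if flag.state == "on":
        return True
    if flag.state == "internal":
        return email.lower().endswith("@" + flag.internal_domain)
    if flag.state == "beta":
        return user_id in flag.beta_users
    if flag.state == "gradual":
        return bucket(flag.name, user_id) < flag.rollout_percent
    raise ValueError(f"unknown flag state: {flag.state!r}")


flag = Flag(name="checkout_v2_redesign", state="gradual", rollout_percent=5)
print(is_enabled(flag, user_id="user-123", email="someone@example.com"))
```

Hashing on the flag name as well as the user ID keeps cohorts independent across flags, so the same users are not always the first to receive every new feature.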

Canary releases

A canary release deploys a new version of the service to a small subset of servers or instances, while the majority continues running the previous version. Traffic is split at the infrastructure level.

Canary releases are most useful for:

  • Infrastructure changes (new caching layer, new database version)
  • Performance-sensitive changes (changes to critical request paths)
  • Changes where feature flags are not sufficient (e.g., architectural changes)

Monitor canaries for at least 30–60 minutes before expanding. Canary rollback is a traffic routing change — faster than reverting code.
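
A canary check can be sketched as a comparison between the canary instances and the baseline over the same window. The thresholds, the `WindowStats` shape, and the `canary_healthy` function below are hypothetical values chosen for illustration; real limits should come from the service's own baseline.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class WindowStats:
    """Aggregated metrics for one group of instances over the observation window."""
    requests: int
    errors: int
    p95_latency_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_healthy(
    baseline: WindowStats,
    canary: WindowStats,
    observed_minutes: float,
    min_minutes: float = 30,              # matches the minimum window suggested above
    min_requests: int = 500,              # assumption: below this the sample is too small
    max_error_rate_delta: float = 0.005,  # assumption: allowed absolute increase
    max_latency_ratio: float = 1.2,       # assumption: allowed relative P95 increase
) -> Tuple[bool, str]:
    """Return (ok, reason). The canary may expand only when ok is True."""
    if observed_minutes < min_minutes:
        return False, "observation window too short"
    if canary.requests < min_requests:
        return False, "not enough canary traffic"
    if canary.error_rate - baseline.error_rate > max_error_rate_delta:
        return False, "error rate regression"
    if canary.p95_latency_ms > baseline.p95_latency_ms * max_latency_ratio:
        return False, "latency regression"
    return True, "ok"


baseline = WindowStats(requests=10_000, errors=20, p95_latency_ms=200.0)
canary = WindowStats(requests=1_000, errors=2, p95_latency_ms=210.0)
print(canary_healthy(baseline, canary, observed_minutes=45))
```

Checking the window length and traffic volume first prevents a quiet five-minute canary from being read as a healthy one.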

Phased rollout

A phased rollout releases to increasing percentages of users over time, monitoring at each stage before expanding:

| Phase | Audience | Duration | Condition to advance |
| --- | --- | --- | --- |
| Phase 1 | 5% | 24–48 hours | No increase in error rate; adoption curve healthy |
| Phase 2 | 25% | 24–48 hours | Phase 1 metrics stable; no reported issues |
| Phase 3 | 50% | 24 hours | Phase 2 metrics stable |
| Phase 4 | 100% | | Phase 3 metrics stable |

Condition to advance: the team must actively decide to expand, not just let time pass. This requires a named person responsible for reviewing metrics at each stage.
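
One way to make the "active decision" rule concrete is to gate each expansion on a recorded review. The sketch below is illustrative: `PhaseReview`, `next_percentage`, the 24-hour minimum, and the simple "error rate not above baseline" check are assumptions, and the phase percentages are taken from the table above.

```python
from dataclasses import dataclass
from typing import Optional

PHASES = [5, 25, 50, 100]  # rollout percentages from the phase table


@dataclass
class PhaseReview:
    """A named person's review of the metrics for the current phase."""
    reviewer: Optional[str]
    hours_in_phase: float
    error_rate: float
    baseline_error_rate: float
    approved: bool


def next_percentage(current: int, review: PhaseReview, min_hours: float = 24) -> int:
    """Return the percentage the rollout should be at after this review.

    The rollout expands only when the phase has run long enough, the error
    rate has not risen above baseline, AND a named reviewer has approved.
    Elapsed time alone never advances the phase.
    """
    if current not in PHASES:
        raise ValueError(f"unknown phase percentage: {current}")
    if current == PHASES[-1]:
        return current  # already at full release
    ran_long_enough = review.hours_in_phase >= min_hours
    metrics_ok = review.error_rate <= review.baseline_error_rate
    signed_off = bool(review.reviewer) and review.approved
    if ran_long_enough and metrics_ok and signed_off:
        return PHASES[PHASES.index(current) + 1]
    return current


review = PhaseReview(reviewer="dana", hours_in_phase=30,
                     error_rate=0.002, baseline_error_rate=0.002, approved=True)
print(next_percentage(5, review))
```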

Dark launches

A dark launch deploys a feature to production without exposing it to users: the code runs and processes real requests, but its output is not shown. Dark launches are used to:

  • Test performance and stability under real production load
  • Validate that a new service handles production traffic correctly
  • Pre-warm caches or indexes before the feature goes live

Dark launches are not a substitute for staging testing — they are a complement to it.
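
A common way to dark-launch a code path is to run it in the shadow of the existing one and compare results. The sketch below is a simplified, synchronous illustration; `with_shadow` and its `on_mismatch` callback are hypothetical names, and a production version would likely run the candidate asynchronously so it cannot add latency to user requests.

```python
import logging
from typing import Any, Callable, Optional

log = logging.getLogger("dark_launch")


def with_shadow(
    current: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    on_mismatch: Optional[Callable[[Any, Any, Any], None]] = None,
) -> Callable[[Any], Any]:
    """Wrap a handler so the candidate runs on every request but is never served."""

    def handler(request: Any) -> Any:
        result = current(request)  # users always receive the current behaviour
        try:
            shadow_result = candidate(request)
        except Exception:
            # A failure in the dark path must not affect the user response.
            log.exception("dark-launched path raised; user response unaffected")
            return result
        if shadow_result != result:
            log.warning("dark-launch mismatch for request %r", request)
            if on_mismatch is not None:
                on_mismatch(request, result, shadow_result)
        return result

    return handler


mismatches = []
handler = with_shadow(
    current=lambda x: x * 2,
    candidate=lambda x: x + 2,
    on_mismatch=lambda req, old, new: mismatches.append((req, old, new)),
)
print(handler(3), mismatches)
```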


How to Implement

Rollout plan document

Before any rollout, document:

## Rollout Plan — [Feature name]

**Strategy:** [Full release / Feature flag / Canary / Phased / Dark launch]

**Phases:**
| Phase | Audience | Start date | Advance condition | Owner |
|---|---|---|---|---|
| Phase 1 | 5% | [Date] | Error rate < [X]%, adoption > [Y]% | [Name] |
| Phase 2 | 25% | [Date] | Phase 1 stable for 48hr | [Name] |
| Phase 3 | 100% | [Date] | Phase 2 stable for 24hr | [Name] |

**Feature flag name:** [flag_name]
**Kill switch:** Yes. Disabling [flag_name] immediately reverts all users to the previous experience

**Rollback plan:** [Describe — feature flag disable / traffic routing change / code revert]
**Rollback trigger:** Error rate > [X]% / P95 latency > [Y]ms / [other]
**Rollback decision owner:** [Name]

Monitoring during rollout

At each rollout phase, actively monitor:

  • Error rate (should not increase vs baseline)
  • P95 latency (should not increase significantly)
  • Core flow completion rate (should not decrease)
  • Feature adoption rate (should follow expected curve)
  • Support ticket volume (should not spike)

Set up alerts for rollback triggers before Phase 1 begins.

When to pause a rollout

Pause and investigate (do not immediately roll back) when:

  • Metrics have moved away from baseline but remain within a plausible range (the change may be expected)
  • Small increase in support tickets but no system impact

Roll back immediately when:

  • Error rate exceeds defined trigger
  • Core flow success rate drops below defined trigger
  • Customer-impacting bug is reported and confirmed
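
The pause and rollback rules above can be encoded so that the decision is consistent regardless of who is on call. This is a sketch under stated assumptions: `Triggers`, `Snapshot`, and `decide` are hypothetical names, the trigger values come from the rollout plan, and "above baseline" is used as a simple stand-in for the pause conditions.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    PAUSE = "pause"
    ROLLBACK = "rollback"


@dataclass
class Triggers:
    """Rollback triggers defined in the rollout plan before Phase 1."""
    max_error_rate: float
    min_flow_success_rate: float


@dataclass
class Snapshot:
    """Current metrics for the rollout cohort, with baselines for comparison."""
    error_rate: float
    flow_success_rate: float
    baseline_error_rate: float
    support_tickets: int
    baseline_support_tickets: int
    confirmed_customer_bug: bool = False


def decide(s: Snapshot, t: Triggers) -> Action:
    # Hard triggers: roll back immediately.
    if (s.error_rate > t.max_error_rate
            or s.flow_success_rate < t.min_flow_success_rate
            or s.confirmed_customer_bug):
        return Action.ROLLBACK
    # Soft signals: hold the current percentage and investigate.
    if (s.error_rate > s.baseline_error_rate
            or s.support_tickets > s.baseline_support_tickets):
        return Action.PAUSE
    return Action.CONTINUE


triggers = Triggers(max_error_rate=0.01, min_flow_success_rate=0.95)
print(decide(Snapshot(0.004, 0.99, 0.002, 10, 10), triggers))
```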

Tools & Templates

Feature flag configuration (example)

## Feature Flag: [flag_name]

**Feature:** [Brief description]
**Owner:** [Product manager name]
**Created:** [Date]
**Expected retirement date:** [Date — when will this flag be removed?]

**State:** off / internal / beta / gradual (X%) / on

**Targeting rules:**

- Internal: all accounts with @company.com email
- Beta: accounts in segment [beta_users]
- Gradual: [X]% of all eligible users, increasing by [Y]% every [Z] days

**Kill switch:** Changing state to `off` immediately disables the feature for all users.

**Notes:** [Any context on why this flag exists or rollout history]

Phased rollout tracker

## Rollout Tracker — [Feature name]

| Phase   | %    | Start  | Advance condition           | Metrics at advance      | Decision   | Date advanced |
| ------- | ---- | ------ | --------------------------- | ----------------------- | ---------- | ------------- |
| Phase 1 | 5%   | [date] | Error < 1%, latency < 300ms | Error: 0.3%, P95: 210ms | ✅ Advance | [date]        |
| Phase 2 | 25%  | [date] | Phase 1 stable 48hr         | ...                     |            |               |
| Phase 3 | 100% | [date] | Phase 2 stable 24hr         |                         |            |               |

Common Pitfalls

No kill switch. Deploying a feature without the ability to disable it quickly means that rollback requires a code revert and deployment — a process that takes minutes at best, hours at worst. Every user-facing feature should have a flag-based kill switch.

Advancing phases on a schedule, not on evidence. Phased rollouts where phases advance automatically on a timer rather than after someone reviews the metrics provide false safety. The advance condition must include a human review of the metrics, not just the passage of time.

Feature flags that never retire. Flags accumulate. When a flag is permanently on, the branch behind the off state is dead code and every flag check is overhead with no benefit. Every flag should have an expected retirement date. Flags past their retirement date should be cleaned up in the next sprint.

Canary monitoring that is too short. A 5-minute canary window detects obvious failures but misses slow-developing issues (memory leaks, connection pool exhaustion) that appear after sustained traffic. Canary monitoring windows should be at least 30–60 minutes for significant changes.

Not communicating rollout phase changes. A rollout that expands from 5% to 25% without notifying customer-facing teams (support, account management) means those teams are unprepared for the increase in customer questions. Communicate phase expansions internally, even if they are not customer announcements.

Treating phased rollout as a substitute for pre-launch testing. Phased rollouts reduce the blast radius of failures; they do not replace the obligation to test the feature thoroughly before launching at all. A bug that affects 5% of users is still a bug.