Containerisation Standards

Principles for building, securing, and operating container images that are minimal, reproducible, and safe to run.

Overview

A container image is a deployment unit: everything the service needs to run is packaged inside it. That packaging decision is a set of choices that directly affects security posture (what attack surface is inside the image), operational predictability (whether the image runs the same way in every environment), and supply chain trustworthiness (whether the dependencies inside are known and audited).

This page is about those design choices — what to put in an image, how to build it reproducibly, how to scan and harden it, and how to operate images safely in production. The mechanics of container orchestration (how images are scheduled and scaled) belong in the infrastructure layer; this page is about the image itself.

For the infrastructure that runs containers, see Infrastructure as Code. For how container images move through the pipeline, see CI/CD Pipelines. For image vulnerability scanning as a quality gate, see Static Analysis.


Why It Matters

The image is the attack surface. A container image that includes a full operating system, a package manager, debugging tools, and a build toolchain has a large vulnerability surface. An image that includes only the application binary and its runtime dependencies has a dramatically smaller one. Every package in the image is a package that could contain a vulnerability.

Unpinned dependencies are a reproducibility risk. An image built from FROM node:latest and RUN apt-get install libssl-dev today may behave differently from one built tomorrow, because neither the image tag nor the package version is fixed: both resolve to whatever is current at build time. Reproducible builds require pinned base images and locked dependency versions.

Image size affects deployment speed. A 2 GB image takes significantly longer to pull than a 200 MB one, and this matters most during incident response, when the team is trying to scale up or roll back under pressure.

Images that run as root are a privilege escalation waiting for a vulnerability. A process that is compromised while running as root inside the container has far more options for escaping the container and reaching the host than one running as an unprivileged user. Running containers as non-root limits the blast radius of every vulnerability that reaches runtime.


Standards & Best Practices

Use minimal base images

The base image determines the default attack surface. The hierarchy:

Base image type | When to use
----------------|------------
Distroless (e.g. gcr.io/distroless/) | Default for production; contains only the runtime, with no shell, no package manager, and no utilities
Alpine (e.g. alpine:3.x) | When a shell or minimal utilities are genuinely needed at runtime
Slim variants (e.g. python:3.12-slim) | When distroless is not practical for the runtime; avoid full debian/ubuntu
Full OS image | Development and CI stages only; never in the final production image

Every layer added on top of the base image adds attack surface. The bar for adding a package is: "is this required for the application to run?" — not "is it useful to have?"

Pin base image versions precisely

# Wrong: a major-version tag moves with every new release
FROM node:20

# Wrong: an exact version tag without a digest can still be re-pushed with different content
FROM node:20.11.1-alpine3.19

# Correct: pinned by digest, so the base image content is identical on every build
FROM node:20.11.1-alpine3.19@sha256:<digest>

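Digest-pinned base images ensure that a build run three months from now starts from exactly the same base layers as the build run in CI today. The full image is only reproducible if the remaining build steps are deterministic as well (see "Image contents must be deterministic"). Tools like Dependabot and Renovate can automate digest updates so pinning does not mean falling behind on security patches.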

Multi-stage builds separate build toolchain from runtime image

A multi-stage build uses one stage to compile or build the application and a separate, minimal final stage to package the output. The build toolchain — compilers, package managers, test frameworks — never enters the production image.

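# Stage 1: build
# (tags shown for brevity; pin both base images by digest in real use)
FROM golang:1.22 AS builder
WORKDIR /app
COPY . .
# CGO_ENABLED=0 produces a statically linked binary, which the
# distroless "static" image requires because it ships no libc
RUN CGO_ENABLED=0 go build -o /app/service ./cmd/service

# Stage 2: minimal runtime image
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/service /service
ENTRYPOINT ["/service"]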

Benefits: the final image is minimal (often 10–30x smaller than the build stage), has no build toolchain vulnerabilities, and cannot be used to compile further code if compromised.

Run as a non-root user

Containers run as root by default. Explicitly specify a non-privileged user in the Dockerfile:

# Create a system user
RUN addgroup --system appgroup && adduser --system --ingroup appgroup appuser

# Switch to non-root before the entrypoint
USER appuser

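Confirm that the USER instruction appears in the final stage: each FROM starts again from the base image's default user, so a USER set in an earlier stage does not carry over. For distroless images, use the nonroot variant (gcr.io/distroless/static-debian12:nonroot).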

Running as root is never a valid default in a production image. If the application requires a capability that normally requires root (e.g. binding to port 80), use Linux capabilities to grant only that specific capability rather than full root access.

Image contents must be deterministic

Reproducibility standards:

  • Dependency versions are locked at build time (package-lock.json, poetry.lock, go.sum, requirements.txt with pinned versions)
  • No apt-get install without pinned package versions in the final stage
  • COPY instructions copy specific files, not the full context (COPY src/ ./src/ not COPY . .)
  • .dockerignore excludes build artifacts, development files, and sensitive files (.env, test fixtures, git history)
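The reproducibility standards above could look roughly like this for a Node.js service. This is a minimal sketch, not a prescribed template: the src/ layout and the src/index.js entry point are assumptions made for illustration, and <digest> is a placeholder to be filled in with the real base image digest.

```dockerfile
FROM node:20.11.1-alpine3.19@sha256:<digest>
WORKDIR /app

# Copy only the manifest and the lock file first, so the dependency
# layer is rebuilt only when one of them changes
COPY package.json package-lock.json ./

# npm ci installs exactly the versions recorded in package-lock.json
# and fails if the lock file and package.json disagree
RUN npm ci --omit=dev

# Copy only the application source, not the whole build context
# (src/ is a hypothetical layout)
COPY src/ ./src/

# The official node images ship an unprivileged "node" user
USER node
CMD ["node", "src/index.js"]
```

Using npm ci rather than npm install is the design choice that matters here: the lock file, not the registry state at build time, decides what ends up in the image.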

Every image is scanned for known vulnerabilities

Image scanning runs as a mandatory step in the CI pipeline before an image is pushed to the registry or deployed. Scanners compare the image's software bill of materials (SBOM) against known vulnerability databases.

Standards:

  • Scanning is automated and runs on every image build
  • Critical and high CVEs block the build; medium and low generate tickets
  • Base image updates are automated (Dependabot/Renovate for Dockerfiles) so vulnerability fixes are applied within a defined SLA
  • The SBOM is stored alongside the image as a provenance artefact

The goal is not zero vulnerabilities at any given moment — it is a known, tracked state with a defined remediation process.

Image registry hygiene

The container registry is part of the supply chain:

  • Images are tagged with immutable tags (commit SHA, build number) in addition to mutable semver tags; deployments reference immutable tags
  • Old and unused images are cleaned up with registry lifecycle policies — unbounded image accumulation inflates storage cost and makes audit harder
  • Registry access is restricted — not every engineer needs push access to production image repositories
  • Image signing (e.g. Sigstore/Cosign) provides provenance: a signed image can be verified to have been produced by the authorised CI pipeline

How to Implement

Dockerfile review checklist

Before merging a new or updated Dockerfile:

  • Base image is pinned by digest (an exact version tag on its own is not sufficient)
  • Multi-stage build separates build toolchain from runtime image
  • Final stage is minimal (distroless, alpine, or slim)
  • Application runs as a non-root user
  • No secrets in the image (credentials, tokens, .env files)
  • .dockerignore is present and excludes build artifacts and sensitive files
  • Dependencies are locked (lock file present and committed)
  • HEALTHCHECK instruction is present for long-running services
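As an illustration, a Dockerfile that would pass most of this checklist might look like the sketch below. It assumes a Go service with a cmd/ and internal/ layout and a "healthcheck" subcommand on the binary; both are hypothetical. The two <digest> placeholders stand for the real digests of the two different base images.

```dockerfile
# Stage 1: build (the full toolchain stays in this stage)
FROM golang:1.22@sha256:<digest> AS builder
WORKDIR /src

# Dependencies are locked: go.sum is committed and copied with go.mod
COPY go.mod go.sum ./
RUN go mod download

# Copy only the source directories the build needs (hypothetical layout)
COPY cmd/ ./cmd/
COPY internal/ ./internal/

# Static binary so it can run on the distroless "static" base
RUN CGO_ENABLED=0 go build -trimpath -o /service ./cmd/service

# Stage 2: minimal, digest-pinned runtime image
FROM gcr.io/distroless/static-debian12:nonroot@sha256:<digest>
COPY --from=builder /service /service

# Unprivileged user provided by the nonroot variant
USER nonroot:nonroot

# Distroless has no shell or curl, so the probe uses exec form and calls
# the binary itself; the "healthcheck" subcommand is an assumption
HEALTHCHECK --interval=30s --timeout=3s --retries=3 CMD ["/service", "healthcheck"]

ENTRYPOINT ["/service"]
```

The checklist items that live outside the Dockerfile (no secrets in the build context, a .dockerignore next to it) still need to be checked separately.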

Security properties that require orchestrator configuration

Some security properties cannot be set in the Dockerfile alone — they require orchestrator-level configuration (Kubernetes, ECS, etc.):

  • Read-only root filesystem: prevents runtime modification of the container's filesystem
  • No privilege escalation: setting allowPrivilegeEscalation: false prevents a process from gaining more privileges than its parent
  • Capability dropping: drop all capabilities (drop: ["ALL"]) and add back only those required
  • Resource limits: CPU and memory limits prevent a misbehaving container from starving neighbours

These belong in the Kubernetes Pod spec or equivalent, not in the Dockerfile. They are part of the security posture but outside the image artifact itself.


Common Pitfalls

FROM ubuntu:latest as a base image. Full OS base images include hundreds of packages, most of which the application never uses, all of which represent attack surface. Start from distroless or alpine.

Build tools in the final image. A Python application whose production image includes pip, gcc, and a development apt cache has a much larger attack surface than one that has only the application and its installed packages. Multi-stage builds eliminate this.

Secrets baked into image layers. An ENV or ARG instruction that carries a secret value, or a RUN or COPY step that writes a secret to a file, leaves that secret in the image history or in an earlier layer, where it can be extracted even if a later layer overrides or deletes it. Secrets must never enter the build context. Inject them at runtime via environment variables, mounted secrets, or a secrets manager.
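Where a build step itself needs a credential (for example, to install from a private package registry), BuildKit secret mounts are one way to supply it without writing it to a layer. The sketch below assumes a private npm registry, a secret id of npm_token, and an .npmrc that refers to the token only through the ${NPM_TOKEN} variable; all three are illustrative assumptions rather than part of this standard.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20.11.1-alpine3.19@sha256:<digest>
WORKDIR /app

# .npmrc is assumed to reference the token indirectly, e.g.
#   //registry.npmjs.org/:_authToken=${NPM_TOKEN}
# and to contain no secret value itself
COPY package.json package-lock.json .npmrc ./

# The secret is mounted at /run/secrets/npm_token for this RUN step only
# and is not written to any image layer
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" npm ci --omit=dev

COPY src/ ./src/
USER node
CMD ["node", "src/index.js"]
```

The secret would be supplied at build time with something like docker build --secret id=npm_token,src=<path-to-token-file> . so that the token file stays outside the build context.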

No .dockerignore. Building without a .dockerignore sends the entire directory to the builder as build context, and any COPY . . then copies all of it into the image, including .git directories (which contain the full commit history), .env files, test fixtures, and build artifacts. .dockerignore is not optional.

Mutable tags for deployments. Deploying image:latest in production means the image that runs in production is whichever image was tagged latest at the time of the last pull, so different nodes can end up running different builds. Rolling back does not reliably return to the previous version. Always deploy by immutable tag (commit SHA or build ID).

Ignoring scan results. Image scanning that is configured but whose results are never acted upon is scanner theatre. Scan results must feed into a defined remediation process, not a backlog that is never prioritised.