Security Architecture
Design-level security principles — how systems are structured to limit damage, enforce least privilege, and withstand compromise.
Overview
Security architecture is concerned with how systems are designed to resist attack and limit the consequences when something is compromised. It is distinct from code-level security (which tools like static analysis and dependency scanning address) and from operational security (which secrets management and access controls address). Design-level security asks: given that some component will eventually be breached, how do we ensure that breach cannot become a catastrophe?
The perspective is structural. Every authentication model, authorisation boundary, service-to-service trust decision, and data classification creates (or fails to create) a security property the system will rely on for its lifetime. These decisions are expensive to change after the system has users and data.
For code-level security tooling, see Static Analysis. For managing secrets and credentials, see Secrets Management. For capturing the security decisions as architecture records, see Architecture Decisions.
Why It Matters
Security that is bolted on after the fact is security theatre. A system designed without authentication boundaries, with implicit trust between components, and without a data classification scheme cannot have meaningful security added later without being substantially rebuilt. Security architecture is a first-class design concern, not a hardening checklist applied post-launch.
Breach is inevitable; blast radius is a design choice. No system with internet access can guarantee it will never be compromised. What can be designed is how much damage a breach of any one component can do. A well-segmented system limits breach impact to the compromised component. A flat, implicitly trusted system means a breach of any component is a breach of all data.
AuthN and AuthZ failures are the most common critical vulnerabilities. Broken authentication and broken access control consistently rank at the top of the OWASP Top 10 for a reason: they are common, catastrophic, and often introduced at the design level rather than the code level. Getting these right at design time prevents an entire class of vulnerability.
Compliance requirements are increasingly design-level. SOC 2, GDPR, HIPAA, and PCI DSS all impose design-level requirements (data isolation, audit logging, encryption at rest, access control) that cannot be retrofitted without significant rework. Understanding the compliance surface of a system before it is built saves months of remediation work.
Standards & Best Practices
Authentication: verify identity at every boundary
Authentication (AuthN) establishes who is making a request. The standards:
- Every entry point to the system has explicit authentication — there are no "internal-only" routes exposed externally without authentication gates
- Authentication uses established standards (OAuth 2.0, OIDC, SAML) rather than bespoke token schemes; bespoke schemes almost always have implementation flaws
- Credentials are never transmitted in URLs (query parameters are logged) — they belong in headers or cookies with appropriate flags
- Session tokens are short-lived, rotatable, and revocable — a stolen session token should remain valid for minutes, not months
- Multi-factor authentication is required for administrative access and for any operation with outsized blast radius
Service-to-service authentication (workload identity) uses short-lived certificates or platform-issued tokens (e.g. SPIFFE/SPIRE, AWS instance roles, GCP Workload Identity), not static API keys shared between services. A static shared key between services is a credential that rotates when someone remembers, which is never.
Authorisation: enforce least privilege at every layer
Authorisation (AuthZ) establishes what an authenticated identity is allowed to do. The principles:
Least privilege. Every identity (user, service, background job) has exactly the permissions needed for its function — no more. A background job that reads user profiles should not have write permissions on financial records. Least privilege limits the blast radius of a compromised credential.
Explicit allow, implicit deny. The default answer to "is this allowed?" is no. Permissions are granted; everything not explicitly granted is denied. Systems built on implicit allow (where access is assumed unless explicitly blocked) fail open — missing a block allows access it should not.
Defence in depth for authorisation. Authorisation is checked at the API gateway, at the service boundary, and at the data layer. A single authorisation check at the perimeter is a single point of failure; bypassing it (via internal routing, a different service, a direct database access) gives full access.
Authorisation logic lives in one place. Scattered authorisation checks across handlers, business logic, and data access layers diverge. A centralised authorisation layer (policy enforcement point) ensures changes to permission rules apply everywhere.
Audit log every access control decision. Not just failures — successes too. The audit log is how an investigation discovers what was accessed after a breach. Without it, the investigation starts from zero.
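The principles above can be combined into a single policy enforcement point. This is a deliberately tiny sketch: the identities, resources, and permission tuples are hypothetical, and a real system would back the table with a policy store rather than an in-memory set.

```python
import logging

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("audit")

# Hypothetical permission table: everything not listed here is denied.
PERMISSIONS = {
    ("profile-reader-job", "user_profiles", "read"),
    ("billing-service", "financial_records", "read"),
    ("billing-service", "financial_records", "write"),
}

def is_allowed(identity: str, resource: str, action: str) -> bool:
    """Single policy enforcement point: explicit allow, implicit deny.
    Every decision -- allow and deny alike -- is written to the audit log."""
    allowed = (identity, resource, action) in PERMISSIONS
    audit.info("decision=%s identity=%s resource=%s action=%s",
               "ALLOW" if allowed else "DENY", identity, resource, action)
    return allowed

assert is_allowed("profile-reader-job", "user_profiles", "read")
# Least privilege: the read-only job cannot write financial records.
assert not is_allowed("profile-reader-job", "financial_records", "write")
# Implicit deny: an unknown identity gets no access at all.
assert not is_allowed("unknown-service", "user_profiles", "read")
```

Because every caller routes through `is_allowed`, a change to the permission rules applies everywhere at once, and the audit log records successes as well as failures.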
Zero-trust: trust nothing implicitly
Zero-trust is an architecture principle: no request is trusted by virtue of its origin. A request from inside the private network is treated with the same scepticism as one from the public internet. The practical implications:
- Authenticate every service-to-service call. Being inside the VPC does not confer trust. Every service that receives a request verifies the caller's identity.
- Encrypt in transit everywhere. Mutual TLS (mTLS) between services ensures that both endpoints are who they claim to be, and that traffic is not readable in transit — even between services on the same internal network.
- No long-lived credentials. Credentials should expire and be rotated automatically. A credential that does not expire should be assumed to end up in an attacker's hands eventually.
- Access is per-session, not per-network. A user or service that is authenticated for one session does not carry that authentication permanently. Each session is independently verified.
Zero-trust is not a product; it is a posture. It does not require any specific technology — it requires consistent application of the principles above.
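As one concrete instance of "encrypt in transit everywhere", here is a sketch of a server-side TLS context that requires client certificates (mutual TLS), using Python's `ssl` module. The file paths are hypothetical placeholders; in a zero-trust setup the certificates would be short-lived and issued by a workload-identity system.

```python
import ssl

def make_mtls_server_context(certfile: str, keyfile: str,
                             ca_bundle: str) -> ssl.SSLContext:
    """Server context that rejects any caller without a valid client cert."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse legacy protocols
    ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)  # this service's identity
    ctx.load_verify_locations(cafile=ca_bundle)    # CA that signs workload certs
    ctx.verify_mode = ssl.CERT_REQUIRED            # mTLS: client cert mandatory
    return ctx

# Usage (paths are hypothetical):
#   ctx = make_mtls_server_context("svc.crt", "svc.key", "internal-ca.pem")
#   then wrap the listening socket with ctx.wrap_socket(..., server_side=True)
```

The essential line is `verify_mode = ssl.CERT_REQUIRED`: with it, a connection from "inside the network" that cannot present a valid certificate is refused, which is exactly the zero-trust posture.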
Threat modelling: design security before writing code
Threat modelling is the practice of systematically identifying what an attacker might try to do, and designing defences before the system is built. The STRIDE framework is a common starting point:
| Category | What the attacker does | Design response |
|---|---|---|
| Spoofing | Impersonates a user or service | Strong authentication |
| Tampering | Modifies data in transit or at rest | Integrity checks, encryption |
| Repudiation | Denies performing an action | Non-repudiable audit logging |
| Information disclosure | Accesses data they should not | Authorisation, least privilege, encryption at rest |
| Denial of service | Makes the service unavailable | Rate limiting, circuit breakers, capacity planning |
| Elevation of privilege | Gains more permissions than intended | Principle of least privilege, explicit AuthZ checks |
A threat model for a new service is a short document, not an extensive exercise. Identify the most significant threats, document the design decisions that address them, and revisit when the system changes substantially. A system with no threat model has implicitly assumed all threats are acceptable.
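A "short document" threat model can even be kept as a small structured artifact alongside the design. The entries below are hypothetical examples, not a recommended catalogue; the point is that each significant threat is tied to a STRIDE category and the design decision that addresses it.

```python
from dataclasses import dataclass

@dataclass
class Threat:
    stride: str       # STRIDE category
    scenario: str     # what the attacker tries
    mitigation: str   # the design decision that addresses it

# Hypothetical threat model for a new internal service.
THREAT_MODEL = [
    Threat("Spoofing", "Forged service-to-service calls inside the VPC",
           "mTLS with workload identity on every internal call"),
    Threat("Information disclosure", "Compromised job reads unrelated data",
           "Per-service least-privilege database credentials"),
    Threat("Elevation of privilege", "User calls an admin API directly",
           "AuthZ enforced at gateway, service, and data layer"),
]

STRIDE = {"Spoofing", "Tampering", "Repudiation", "Information disclosure",
          "Denial of service", "Elevation of privilege"}
assert all(t.stride in STRIDE for t in THREAT_MODEL)
```

Keeping the model in the repository makes "revisit when the system changes substantially" a review-time diff rather than an archaeology exercise.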
Data classification drives protection requirements
Not all data requires the same protection. Data classification establishes tiers that map to concrete handling requirements:
| Classification | Examples | Requirements |
|---|---|---|
| Public | Marketing content, public docs | No special requirements |
| Internal | Business documents, internal metrics | Access restricted to employees |
| Confidential | Customer PII, financial data, credentials | Encrypted at rest and in transit; access audited; minimal retention |
| Restricted | Health records, payment card data, cryptographic keys | Strongest controls; regulatory-specific requirements |
Classification should happen at data design time, not retroactively. The classification of data determines where it can be stored, who can access it, how long it is retained, and what happens when it is deleted.
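One way to make classification actionable is to map each tier to concrete handling requirements in code, so storage and retention decisions can be checked mechanically. The tiers follow the table above; the specific retention numbers are illustrative assumptions, not recommendations.

```python
# Hypothetical mapping from classification tier to handling requirements,
# assigned at data design time. Retention figures are placeholders.
REQUIREMENTS = {
    "public":       {"encrypt_at_rest": False, "audit_access": False, "max_retention_days": None},
    "internal":     {"encrypt_at_rest": False, "audit_access": False, "max_retention_days": None},
    "confidential": {"encrypt_at_rest": True,  "audit_access": True,  "max_retention_days": 365},
    "restricted":   {"encrypt_at_rest": True,  "audit_access": True,  "max_retention_days": 90},
}

def handling_for(classification: str) -> dict:
    """Unknown or missing classifications fail closed to the strictest tier."""
    return REQUIREMENTS.get(classification, REQUIREMENTS["restricted"])

assert handling_for("confidential")["encrypt_at_rest"] is True
assert handling_for("made-up-tier") == REQUIREMENTS["restricted"]  # fail closed
```

The fail-closed default mirrors the implicit-deny principle: data whose classification is unknown gets the strongest protection until someone classifies it.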
Security by design for common vulnerabilities
Several vulnerability classes are almost entirely preventable by design:
- SQL injection — Use parameterised queries or an ORM that parameterises by default. Never construct SQL by string concatenation.
- Cross-site scripting (XSS) — Use a framework that HTML-escapes output by default. Mark trusted HTML explicitly.
- CSRF — Use SameSite cookies, CSRF tokens, or request-signing for state-changing operations.
- Insecure direct object reference (IDOR) — Never expose internal database IDs directly in URLs or responses. Use opaque identifiers and verify authorisation on every resource access.
- Mass assignment — Explicitly allowlist the fields that can be set via API input; never pass request payloads directly to a data model.
These are not edge cases. They are the most common vulnerability classes in production systems. They are prevented by designing the data flow correctly, not by hoping developers remember rules.
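Two of the designs above can be demonstrated in a few lines: parameterised queries (against SQL injection) and field allowlisting (against mass assignment). This sketch uses an in-memory SQLite database; the schema and payload are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, "
             "is_admin INTEGER DEFAULT 0)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

# SQL injection: the driver binds the value; it is never spliced into the SQL.
malicious = "alice' OR '1'='1"
rows = conn.execute("SELECT id FROM users WHERE name = ?", (malicious,)).fetchall()
assert rows == []  # the payload is treated as a literal string and matches nothing

# Mass assignment: only allowlisted fields from the request payload are applied.
ALLOWED_FIELDS = {"name"}

def apply_update(user_id: int, payload: dict) -> None:
    updates = {k: v for k, v in payload.items() if k in ALLOWED_FIELDS}
    for column, value in updates.items():
        # `column` is safe to interpolate because it comes from the allowlist,
        # never from the request; values are still bound as parameters.
        conn.execute(f"UPDATE users SET {column} = ? WHERE id = ?",
                     (value, user_id))

apply_update(1, {"name": "alice2", "is_admin": 1})  # attacker smuggles is_admin
assert conn.execute("SELECT name, is_admin FROM users WHERE id = 1") \
           .fetchone() == ("alice2", 0)  # privilege escalation silently dropped
```

Both defences are structural: the data flow is shaped so the dangerous input path simply does not exist, rather than relying on every handler to sanitise correctly.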
How to Implement
Security review at design time
For any new service or significant feature, the design review (see Design Review Frameworks) should include:
- What data does this system store or process, and how is it classified?
- What authentication is required at each entry point?
- What authorisation model is used — who can do what, and where is it enforced?
- What is the blast radius of a breach of any one component?
- Has a threat model (even a brief one) been produced?
- What are the compliance requirements, and are they met by the design?
Security review is not a gate applied after implementation. It is part of design.
Minimum security properties for any internet-facing service
- Authentication on every endpoint (no unauthenticated endpoints unless explicitly public)
- HTTPS everywhere with modern TLS configuration
- Input validation at every entry point
- Rate limiting on authentication and high-cost endpoints
- Audit logging for authentication events and access control decisions
- Dependencies scanned for known vulnerabilities (see Dependency Management)
Common Pitfalls
"Internal-only" as a security model. A VPN or private network is not a security boundary. Internal services that assume intra-network requests are trusted fail the moment any one service in the network is compromised.
AuthZ in the UI only. Authorisation checks in the UI that are not enforced at the API layer are not authorisation — they are decoration. An attacker who calls the API directly bypasses all of them.
Logging failures but not successes. An audit log that only records failures tells you what the attacker tried; it does not tell you what they succeeded at. Log access control decisions for sensitive data, not just failures.
Over-privileged service accounts. A service account with admin-level database access, attached to a service that only reads two tables, is a common anti-pattern. The account is a high-value target; it should be scoped to the minimum the service needs to function.
Threat modelling as a one-time exercise. A threat model produced at initial design becomes stale as the system grows. Revisit it when new services, data types, or user roles are introduced.
Security as a checklist at the end. Treating security as a review step applied after the system is designed and built means security findings become expensive rework rather than cheap design changes. Security considerations must be present from the first design conversation.