Zero Trust Architecture for Developers 2026: mTLS, Short-Lived JWTs, Vault, and Network Policies
97% of organizations are now adopting Zero Trust Architecture (ZTA). Most tutorials on the subject are written for enterprise architects with dedicated security teams and six-figure tool budgets. This guide is for the five-person dev team running services on Kubernetes who wants real Zero Trust — implemented this week, with open-source tools, and without hiring a consultant.
You will implement mTLS with cert-manager, short-lived tokens with HashiCorp Vault, least-privilege networking with Kubernetes NetworkPolicy, and workload identity with SPIFFE/SPIRE. No proprietary software required.
What Zero Trust Actually Means
The phrase "never trust, always verify" sounds like a vendor slogan, but it captures something precise: every request must be authenticated, authorized, and encrypted — regardless of where it originates. A request from a pod on the same Kubernetes node gets the same scrutiny as one arriving from the internet.
ZTA is not a product. It is a set of principles applied to your architecture. You can implement it with tools you already have.
Three core principles, distilled from the seven tenets in NIST SP 800-207:
- Verify identity — every service, user, and device proves who it is on every request.
- Enforce least privilege — access is scoped to exactly what is needed, nothing more.
- Assume breach — design as if the perimeter is already compromised; limit blast radius.
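As a rough illustration of how the three principles combine on a single request, the sketch below checks a verified caller identity against an explicit allow-list and denies everything else. The `ALLOWED` table, the identity strings, and the `authorize` function are hypothetical names for illustration; a real service would take the caller identity from its mTLS certificate or token, not from a plain string.

```python
from typing import Optional

# Hypothetical allow-list: (caller identity, action) pairs that are explicitly permitted.
# Anything not listed is denied, which is the least-privilege default.
ALLOWED = {
    ("payments", "inventory:read"),
    ("payments", "inventory:reserve"),
}


def authorize(caller: Optional[str], action: str) -> bool:
    """Return True only when a verified caller is explicitly allowed to perform the action."""
    if caller is None:
        # Verify identity: an unauthenticated request is rejected regardless of network origin.
        return False
    # Assume breach: being a known service is not enough; the specific action must be listed.
    return (caller, action) in ALLOWED


print(authorize("payments", "inventory:read"))    # True
print(authorize("payments", "inventory:delete"))  # False
print(authorize(None, "inventory:read"))          # False
```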
VPN vs. Zero Trust
A VPN creates a trusted internal network. Once a workload is inside the VPN, lateral movement is largely unconstrained. Zero Trust eliminates the concept of a trusted network entirely. Being on the same subnet grants no implicit access.
| | VPN model | Zero Trust model |
|---|---|---|
| Trust boundary | Network perimeter | Per-request identity |
| Lateral movement | Easy once inside | Blocked by default |
| Credential theft impact | Full network access | Limited to scoped token |
| Service-to-service auth | Usually none | Mandatory (mTLS/JWT) |
The Six ZTA Pillars (adapted from the CISA Zero Trust Maturity Model)
| Pillar | What it means | Open-source tool |
|---|---|---|
| Identity | Every workload has a cryptographic identity | SPIFFE/SPIRE, cert-manager |
| Device | Workloads run on attested, known infrastructure | SPIRE node attestation |
| Network | Traffic is encrypted and flows are restricted | Kubernetes NetworkPolicy, mTLS |
| Application | Auth enforced at the application layer, not just the network | JWT validation, OPA |
| Data | Secrets are short-lived and dynamically issued | HashiCorp Vault |
| Visibility | Every access is logged and auditable | Audit logging, Loki, Falco |
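This tutorial covers five of these areas with concrete implementation steps. The sections below are labeled Pillar 1 through Pillar 5 in implementation order, which differs from the row order of the table.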
Pillar 1: mTLS Between Microservices
Mutual TLS means both sides of a connection present certificates. Your payments service does not just verify the server certificate of inventory — inventory also verifies that payments is who it claims to be.
Install cert-manager
cert-manager automates certificate issuance and rotation inside Kubernetes.
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.15.0/cert-manager.yaml
kubectl wait --for=condition=Available deployment --all -n cert-manager --timeout=90s
Create a Self-Signed Cluster CA
# cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: internal-ca-cert
  namespace: cert-manager
spec:
  isCA: true
  commonName: internal-ca
  secretName: internal-ca-tls
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: internal-ca
    kind: ClusterIssuer
    group: cert-manager.io
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-issuer
spec:
  ca:
    secretName: internal-ca-tls
kubectl apply -f cluster-issuer.yaml
Issue a Certificate for Each Service
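# payments-cert.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: payments-tls
  namespace: production
spec:
  secretName: payments-tls-secret
  duration: 24h      # certificate lifetime
  renewBefore: 1h    # renew one hour before expiry
  dnsNames:
    - payments.production.svc.cluster.local
  usages:            # the same certificate is presented as server and as client
    - server auth
    - client auth
  issuerRef:
    name: internal-issuer
    kind: ClusterIssuer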
kubectl apply -f payments-cert.yaml
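To make the timing concrete: with `duration: 24h` and `renewBefore: 1h`, renewal is scheduled one hour before expiry, which is 23 hours after issuance. The small sketch below works through that arithmetic; the issuance timestamp is an arbitrary example value.

```python
from datetime import datetime, timedelta, timezone

duration = timedelta(hours=24)      # spec.duration
renew_before = timedelta(hours=1)   # spec.renewBefore

# Arbitrary example issuance time; a real value comes from the certificate's notBefore field.
issued_at = datetime(2026, 1, 1, 0, 0, tzinfo=timezone.utc)

expires_at = issued_at + duration
renew_at = expires_at - renew_before

print(expires_at.isoformat())  # 2026-01-02T00:00:00+00:00
print(renew_at.isoformat())    # 2026-01-01T23:00:00+00:00
```

A service that caches the certificate in memory needs to reload it from the mounted secret at least this often, otherwise it keeps presenting the old certificate after renewal.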
Enforce mTLS in Application Code (Python)
Mount the certificate secret into your pod and configure ssl.SSLContext to require client certificates.
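import http.server
import ssl

# Server-side TLS context that rejects clients without a valid certificate.
context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
context.verify_mode = ssl.CERT_REQUIRED  # client certificate is mandatory
context.load_verify_locations("/etc/tls/ca.crt")  # trust only the internal CA
context.load_cert_chain("/etc/tls/tls.crt", "/etc/tls/tls.key")  # this service's own identity

# BaseHTTPRequestHandler is a placeholder; substitute your own handler class.
server = http.server.HTTPServer(("0.0.0.0", 8443), http.server.BaseHTTPRequestHandler)
server.socket = context.wrap_socket(server.socket, server_side=True)
server.serve_forever()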
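Client-side, build an ssl.SSLContext with ssl.PROTOCOL_TLS_CLIENT, call load_verify_locations() with the CA certificate, and call load_cert_chain() with the client certificate and key. Both ends then verify each other's identity during every TLS handshake.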
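A minimal sketch of the client side, using only the standard library. The file paths are assumed to match the server example, and the inventory hostname in the commented-out call is an assumption that follows the naming pattern of the payments certificate.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side context: verifies the server certificate and hostname by default."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def load_identity(ctx: ssl.SSLContext, ca: str, cert: str, key: str) -> None:
    """Trust the internal CA and present this service's own certificate to the server."""
    ctx.load_verify_locations(ca)
    ctx.load_cert_chain(cert, key)


ctx = make_client_context()
# Inside a pod with the certificate secret mounted (paths and hostname are assumptions):
# load_identity(ctx, "/etc/tls/ca.crt", "/etc/tls/tls.crt", "/etc/tls/tls.key")
# import urllib.request
# urllib.request.urlopen("https://inventory.production.svc.cluster.local:8443/", context=ctx)
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)  # True True
```

Splitting context creation from file loading keeps the security-relevant defaults checkable without any certificate files present.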
Pillar 2: Short-Lived JWTs with HashiCorp Vault
Long-lived credentials (API keys that never expire, database passwords in .env files) are among the most common root causes of breaches. HashiCorp Vault replaces them with short-lived, dynamically issued ones.
Install Vault with Helm
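helm repo add hashicorp https://helm.releases.hashicorp.com
# Dev mode keeps all data in memory and starts unsealed: use it for evaluation only, never in production.
helm install vault hashicorp/vault \
  --set "server.dev.enabled=true" \
  --namespace vault --create-namespace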
Enable the JWT Auth Method
Services authenticate to Vault using their Kubernetes ServiceAccount token, then receive a Vault token valid for 15 minutes.
vault auth enable jwt
vault write auth/jwt/config \
  oidc_discovery_url="https://kubernetes.default.svc.cluster.local" \
  oidc_discovery_ca_pem=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
# token_policies and token_ttl are the current names for the deprecated policies and ttl parameters.
vault write auth/jwt/role/payments \
  role_type="jwt" \
  bound_audiences="vault" \
  user_claim="sub" \
  bound_subject="system:serviceaccount:production:payments" \
  token_policies="payments-policy" \
  token_ttl="15m"
A Vault token with a 15-minute TTL means a stolen token is useless within one coffee break.
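From the service's side, the exchange is a single POST to the JWT auth method's login endpoint. The sketch below builds that request and parses a response without contacting a server. The Vault address is an assumed in-cluster name for the Helm release above, and the sample response is hand-written with placeholder values.

```python
import json
import urllib.request
from typing import Tuple

# Assumed in-cluster Vault address; adjust to your Helm release and namespace.
VAULT_ADDR = "http://vault.vault.svc.cluster.local:8200"


def build_login_request(role: str, sa_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST that trades a ServiceAccount JWT for a Vault token."""
    body = json.dumps({"role": role, "jwt": sa_token}).encode()
    return urllib.request.Request(
        f"{VAULT_ADDR}/v1/auth/jwt/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_token(response_body: bytes) -> Tuple[str, int]:
    """Pull the Vault token and its TTL in seconds out of a login response."""
    auth = json.loads(response_body)["auth"]
    return auth["client_token"], auth["lease_duration"]


req = build_login_request("payments", "<service-account-jwt>")
print(req.get_method(), req.full_url)

# Hand-written sample response with placeholder values, not real Vault output.
sample = b'{"auth": {"client_token": "hvs.example", "lease_duration": 900}}'
print(extract_token(sample))  # ('hvs.example', 900)
```

The ServiceAccount token sent here has to carry the audience the role expects (`vault` in the role above), which usually means mounting a projected token with that audience.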
Dynamic Database Secrets
Instead of a static Postgres password, Vault generates a fresh credential each time a service requests one and revokes it when its lease expires.
vault secrets enable database
vault write database/config/postgres \
plugin_name=postgresql-database-plugin \
allowed_roles="payments-db" \
connection_url="postgresql://{{username}}:{{password}}@postgres:5432/app" \
username="vault-root" \
password="<root-password>"
vault write database/roles/payments-db \
db_name=postgres \
creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN ENCRYPTED PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT, INSERT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
default_ttl="1h" \
max_ttl="4h"
Your service calls vault read database/creds/payments-db on startup, gets a unique credential, and Vault revokes it automatically when the TTL expires.
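Because the credential expires with its lease, the service has to plan a refresh before the TTL runs out. The sketch below parses a credentials response and picks a refresh point at two thirds of the lease. The `DbLease` class and the two-thirds rule are illustrative choices, and the sample JSON is hand-written with placeholder values.

```python
import json
from dataclasses import dataclass


@dataclass
class DbLease:
    username: str
    password: str
    lease_id: str
    ttl_seconds: int

    def refresh_after(self) -> int:
        """Seconds to wait before fetching a replacement credential (two thirds of the TTL)."""
        return self.ttl_seconds * 2 // 3


def parse_creds(response_body: str) -> DbLease:
    """Parse the JSON returned by reading database/creds/<role>."""
    doc = json.loads(response_body)
    return DbLease(
        username=doc["data"]["username"],
        password=doc["data"]["password"],
        lease_id=doc["lease_id"],
        ttl_seconds=doc["lease_duration"],
    )


# Hand-written sample with placeholder values, shaped like `vault read -format=json` output.
sample = json.dumps({
    "lease_id": "database/creds/payments-db/example",
    "lease_duration": 3600,
    "renewable": True,
    "data": {"username": "v-payments-example", "password": "example-password"},
})

lease = parse_creds(sample)
print(lease.username, lease.refresh_after())  # v-payments-example 2400
```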
Pillar 3: Kubernetes NetworkPolicy (Least-Privilege Networking)
By default, every pod in Kubernetes can reach every other pod. NetworkPolicy changes that.
Default Deny All
Apply this to every namespace before adding any allow rules.
# default-deny.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
kubectl apply -f default-deny.yaml
Every pod is now isolated. Restore only the connections you need.
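One connection nearly every pod needs back is DNS, because the egress half of default-deny blocks lookups too. The sketch below builds an allow rule for port 53 as a Python dictionary and prints it as JSON, which `kubectl apply -f -` accepts alongside YAML. The `kube-system` namespace and the `k8s-app: kube-dns` label are common defaults for cluster DNS but are assumptions about your cluster, and the policy name is arbitrary.

```python
import json


def allow_dns_egress(namespace: str) -> dict:
    """NetworkPolicy letting every pod in `namespace` reach cluster DNS on port 53."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-dns-egress", "namespace": namespace},
        "spec": {
            "podSelector": {},  # all pods in the namespace
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{
                    # Assumed defaults: DNS pods labelled k8s-app=kube-dns in kube-system.
                    "namespaceSelector": {
                        "matchLabels": {"kubernetes.io/metadata.name": "kube-system"}
                    },
                    "podSelector": {"matchLabels": {"k8s-app": "kube-dns"}},
                }],
                "ports": [
                    {"protocol": "UDP", "port": 53},
                    {"protocol": "TCP", "port": 53},
                ],
            }],
        },
    }


# Pipe the output into: kubectl apply -f -
print(json.dumps(allow_dns_egress("production"), indent=2))
```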
Allow Specific Service-to-Service Communication
# allow-payments-to-inventory.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payments-to-inventory
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: inventory
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: payments
      ports:
        - protocol: TCP
          port: 8443
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
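Unexpected egress, such as a compromised pod calling out to an attacker's server, is now blocked at the kernel level by your CNI plugin (Calico, Cilium, or another plugin that enforces NetworkPolicy; plain Flannel does not). Because the default-deny policy also covers egress, the payments pods need a matching egress rule to inventory on port 8443 before this connection works end to end.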
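The decision the policy encodes can be modelled in a few lines: traffic passes only if an explicit rule matches the source labels, destination labels, and port. This is a deliberately simplified model with hypothetical names (`AllowRule`, `is_allowed`); it ignores namespaces and the separate ingress and egress halves that a real CNI plugin evaluates.

```python
from typing import Dict, List, NamedTuple


class AllowRule(NamedTuple):
    src: Dict[str, str]   # labels the source pod must carry
    dst: Dict[str, str]   # labels the destination pod must carry
    port: int


# The two flows from the policy above, expressed as data.
RULES: List[AllowRule] = [
    AllowRule({"app": "payments"}, {"app": "inventory"}, 8443),
    AllowRule({"app": "inventory"}, {"app": "postgres"}, 5432),
]


def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """A matchLabels selector matches when every key/value pair is present on the pod."""
    return all(labels.get(k) == v for k, v in selector.items())


def is_allowed(src: Dict[str, str], dst: Dict[str, str], port: int) -> bool:
    """Default deny: traffic passes only if some rule explicitly allows it."""
    return any(
        matches(r.src, src) and matches(r.dst, dst) and r.port == port
        for r in RULES
    )


print(is_allowed({"app": "payments"}, {"app": "inventory"}, 8443))  # True
print(is_allowed({"app": "payments"}, {"app": "postgres"}, 5432))   # False
print(is_allowed({"app": "inventory"}, {"app": "postgres"}, 5432))  # True
```

The second call is the interesting one: payments cannot reach the database directly, so a compromise of payments does not automatically expose Postgres.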
Pillar 4: SPIFFE/SPIRE (Workload Identity for VMs and Non-Kubernetes Services)
Not every service runs in Kubernetes. SPIFFE (Secure Production Identity Framework for Everyone) provides a standard workload identity that works on VMs, containers, and bare metal.
SPIRE (the reference implementation) issues SPIFFE Verifiable Identity Documents (SVIDs) — short-lived X.509 certificates or JWTs — to workloads based on platform attestation.
# Deploy SPIRE server
kubectl apply -f https://spiffe.io/docs/latest/try/getting-started-k8s/spire-server.yaml
# Register a workload entry
spire-server entry create \
-spiffeID spiffe://example.org/payments \
-parentID spiffe://example.org/node/worker-1 \
-selector k8s:pod-label:app:payments
The payments workload now has a cryptographic identity (spiffe://example.org/payments) that other services can verify without a shared secret. SVIDs are short-lived (one hour by default for X.509-SVIDs), and SPIRE rotates them automatically before they expire.
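Once a peer's SVID has been cryptographically verified, the receiving service still has to decide whether that particular identity is acceptable. The sketch below shows only that last step: parsing a SPIFFE ID and comparing trust domain and path. The function names are hypothetical, and signature verification of the SVID itself is assumed to have happened already in the TLS layer.

```python
from typing import Tuple
from urllib.parse import urlsplit

TRUSTED_DOMAIN = "example.org"  # trust domain used in the registration entry above


def parse_spiffe_id(spiffe_id: str) -> Tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, workload_path), rejecting malformed IDs."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE IDs must not carry a query or fragment")
    return parts.netloc, parts.path


def is_trusted_peer(spiffe_id: str, expected_path: str) -> bool:
    """Accept a peer only if it is in our trust domain and has the expected workload path."""
    domain, path = parse_spiffe_id(spiffe_id)
    return domain == TRUSTED_DOMAIN and path == expected_path


print(parse_spiffe_id("spiffe://example.org/payments"))  # ('example.org', '/payments')
print(is_trusted_peer("spiffe://example.org/payments", "/payments"))  # True
print(is_trusted_peer("spiffe://evil.example/payments", "/payments"))  # False
```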
Pillar 5: Audit Logging (Who Accessed What, When)
Zero Trust without observability is incomplete. You need a record of every access decision.
Enable Kubernetes audit logging in your API server configuration:
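# audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Metadata only for secrets and configmaps: RequestResponse would write secret values into the audit log.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Catch-all: record metadata for every other request.
  - level: Metadata
    omitStages: ["RequestReceived"]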
Ship logs to Loki or Elasticsearch. Add Falco for runtime threat detection — it fires an alert when a container opens an unexpected network connection or reads a sensitive file path.
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco \
  --set falco.grpc.enabled=true \
  --namespace falco --create-namespace
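Once Kubernetes audit events are being written, a useful first question is who touched which Secret. The sketch below counts Secret accesses from JSON-lines audit events. The function name is hypothetical and the sample events are hand-written with placeholder values; in practice the lines would come from the API server's audit log file or your log aggregator.

```python
import json
from collections import Counter
from typing import Iterable


def secret_access_counts(lines: Iterable[str]) -> Counter:
    """Count (user, verb, namespace/name) for audit events that touch Secrets."""
    counts: Counter = Counter()
    for line in lines:
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        if ref.get("resource") != "secrets":
            continue
        if event.get("stage") != "ResponseComplete":
            continue  # count each request once, at its final stage
        target = f'{ref.get("namespace", "")}/{ref.get("name", "")}'
        counts[(event["user"]["username"], event["verb"], target)] += 1
    return counts


# Hand-written sample events with placeholder values, not real cluster output.
sample = [
    json.dumps({"stage": "ResponseComplete", "verb": "get",
                "user": {"username": "system:serviceaccount:production:payments"},
                "objectRef": {"resource": "secrets", "namespace": "production",
                              "name": "payments-tls-secret"}}),
    json.dumps({"stage": "ResponseComplete", "verb": "list",
                "user": {"username": "system:serviceaccount:production:payments"},
                "objectRef": {"resource": "pods", "namespace": "production"}}),
]

for key, n in secret_access_counts(sample).items():
    print(key, n)
```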
Zero Trust vs. VPN
| Concern | VPN | Zero Trust (this stack) |
|---|---|---|
| Authentication | User login at perimeter | Per-request mTLS + JWT |
| Service-to-service | Implicit trust inside network | cert-manager mTLS required |
| Secret management | Static credentials in config | Vault dynamic secrets (TTL 1h) |
| Credential lifespan | Months to never | 15 min (JWT), 24h (cert) |
| Lateral movement | Unrestricted | Blocked by NetworkPolicy |
| Tooling cost | VPN license | Open-source (free) |
| Operational overhead | Low initially, high after breach | Medium (automation helps) |
ZTA Maturity Model
You do not have to implement everything at once. Move through stages.
Stage 0: No Zero Trust
Static secrets in environment variables. Services trust each other by IP. No audit log. You are probably here if you are reading this guide for the first time.
Stage 1: Foundational (start here)
- Deploy cert-manager and issue TLS certificates for all services.
- Apply default-deny NetworkPolicy to production namespaces.
- Move database credentials into Vault with a 4-hour TTL.
Stage 2: Intermediate
- Enable mTLS with client certificate validation between all internal services.
- Reduce Vault TTLs to 1 hour. Enable Vault audit log.
- Add Kubernetes audit logging. Ship to a log aggregator.
Stage 3: Advanced
- Deploy SPIRE for cross-cluster and VM workload identity.
- Enforce JWT validation at every service boundary using an OPA sidecar.
- Integrate Falco alerts into your incident response runbook.
- Reduce JWT TTL to 15 minutes. Auto-rotate all certificates every 24 hours.
Teams that skip Stage 1 and try to implement Stage 3 all at once typically abandon the project. Ship Stage 1 in a single sprint.
FAQ
Do I need a service mesh (Istio, Linkerd) for mTLS?
No. cert-manager plus application-level TLS gives you mTLS without the overhead of a sidecar proxy on every pod. A service mesh adds convenience and observability, but it is not required to start.
What CNI plugin supports NetworkPolicy?
Calico, Cilium, and Weave Net all support NetworkPolicy. Flannel does not natively; pair it with Calico for policy enforcement. Most managed Kubernetes offerings (GKE, EKS, AKS) include a compatible CNI.
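Is the open-source version of Vault production-ready?
Yes, with one caveat about the label. The community edition of Vault is used in production at thousands of organizations, but since 2023 HashiCorp has published it under the Business Source License, not an open-source license; OpenBao is the community fork that remains open source. The enterprise version adds HSM support and automated DR replication, which are useful at scale but not required to start.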
How do short-lived tokens affect service restarts?
Your service should be configured to fetch a new token on startup and to refresh before expiry. The Vault Agent sidecar handles this automatically, writing a fresh token to a shared volume before the old one expires.
Can I apply ZTA principles to a monolith, not just microservices?
Yes. Start with the data pillar: move secrets into Vault. Then add audit logging. NetworkPolicy still applies if the monolith runs in Kubernetes. mTLS applies to any external API calls the monolith makes.
Implementing Zero Trust is not a one-day project, but Stage 1 — cert-manager, default-deny NetworkPolicy, and Vault for secrets — can be running in production within a week. Start there. The principles are sound; the tooling is free; the risk of not starting is real.