eBPF and Cilium Tutorial 2026: Zero-Overhead Kubernetes Networking, Hubble, and Tetragon
Modern Kubernetes clusters carry thousands of pods, dozens of services, and compliance requirements that did not exist when iptables was designed. This tutorial walks you through replacing the iptables-based kube-proxy datapath with eBPF-powered Cilium, enabling real-time flow visibility with Hubble, and adding kernel-level runtime security with Tetragon, all on a production-grade cluster.
What Is eBPF?
eBPF (Extended Berkeley Packet Filter) lets you run sandboxed programs inside the Linux kernel without writing a kernel module or patching kernel source code. Think of it as a safe, programmable hook that attaches to kernel events — network packets, system calls, file operations — and reacts to them at near-native speed.
Before a program runs in the kernel, the eBPF verifier checks it exhaustively: no unbounded loops, no out-of-bounds memory access, no ability to crash the kernel. Once verified, the JIT compiler turns the bytecode into native machine instructions. The result is code that runs in the same address space as the kernel itself, with none of the overhead of context switching to user space.
Key eBPF primitives you will encounter with Cilium:
- Maps: typed key/value stores shared between kernel and user space (hash maps, LRU maps, per-CPU arrays).
- Programs: small functions attached to hooks such as XDP (eXpress Data Path, which runs in the network driver before the kernel allocates a socket buffer), tc (traffic control), and kprobe/tracepoint for kernel functions and system calls.
- Tail calls: let one eBPF program jump into another, keeping each program within the verifier's complexity limit.
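To make the map primitive more concrete, here is a small user-space Python model of how an LRU map behaves: a bounded key/value store that evicts the least recently used entry once it is full. This is only a sketch for intuition. Real BPF LRU maps live in the kernel, are created through the bpf() system call, and evict approximately rather than in strict LRU order. The class name `LRUMapModel` and the flow-tuple keys are hypothetical.

```python
from collections import OrderedDict


class LRUMapModel:
    """User-space model of a bounded LRU key/value map (not a real BPF map)."""

    def __init__(self, max_entries: int) -> None:
        if max_entries <= 0:
            raise ValueError("max_entries must be positive")
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def update(self, key, value) -> None:
        """Insert or overwrite a key, evicting the oldest entry if the map is full."""
        if key in self._entries:
            self._entries.move_to_end(key)
        elif len(self._entries) >= self.max_entries:
            self._entries.popitem(last=False)  # drop the least recently used entry
        self._entries[key] = value

    def lookup(self, key):
        """Return the value for key (marking it recently used), or None on a miss."""
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)
        return self._entries[key]

    def __len__(self) -> int:
        return len(self._entries)
```

A connection-tracking table is a natural fit for this shape: stale flows fall out on their own when new ones arrive, so the table never grows past its configured size.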
Why eBPF Beats iptables
iptables was designed for host firewalls, not for clusters with 10,000 pods and services that change every few seconds. The problems are structural:
- O(n) rule traversal. Every packet walks a linear chain of rules. A cluster with 5,000 services generates tens of thousands of iptables rules, so lookup latency grows linearly and connection setup time spikes.
- No connection-level visibility. iptables can match packets but cannot tell you which pod initiated a TCP flow, what HTTP method was used, or how long a request took.
- Conntrack contention. The kernel connection-tracking table (conntrack) is shared by every CPU, and lock contention on it becomes a bottleneck under heavy traffic that adding more CPU does not fix.
eBPF solves each of these:
- O(1) hash-map lookups. Cilium encodes policy decisions into BPF hash maps. A policy lookup is a single map read regardless of how many rules exist in the cluster.
- Full network visibility. eBPF programs run at the socket layer, the TC layer, and the XDP layer. They see L3, L4, and — via user-space proxying — L7 metadata simultaneously.
- Lock-free per-CPU maps. Per-CPU BPF maps eliminate global lock contention and scale linearly with core count.
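The difference between chain traversal and a keyed lookup can be sketched in a few lines of Python. This is a toy model, not how kube-proxy or Cilium are implemented: the service table, IP ranges, and function names are made up for illustration, and a Python dict merely stands in for a BPF hash map.

```python
from typing import Optional, Tuple

# Hypothetical service table: (virtual IP, port) -> backend address.
SERVICES = {
    (f"10.96.{i // 256}.{i % 256}", 80): f"10.244.0.{i % 250 + 1}:8080"
    for i in range(5000)
}

# iptables-style view of the same data: an ordered list of rules.
RULE_CHAIN = list(SERVICES.items())


def lookup_linear(vip: str, port: int) -> Tuple[Optional[str], int]:
    """Walk the chain rule by rule; return (backend, rules_examined)."""
    for examined, ((rule_ip, rule_port), backend) in enumerate(RULE_CHAIN, start=1):
        if rule_ip == vip and rule_port == port:
            return backend, examined
    return None, len(RULE_CHAIN)


def lookup_hashed(vip: str, port: int) -> Optional[str]:
    """Single keyed lookup whose average cost does not depend on table size."""
    return SERVICES.get((vip, port))
```

In this model, resolving the last service in the chain examines all 5,000 rules, while the keyed lookup does the same amount of work for the first and the last entry.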
Cilium: The eBPF-Native CNI
Cilium is a Container Network Interface (CNI) plugin that translates Kubernetes networking semantics into eBPF programs. It powers GKE Dataplane V2 (the default on GKE Autopilot), is the default CNI in Amazon EKS Anywhere, and is available on AKS as Azure CNI Powered by Cilium. Cilium became a CNCF graduated project in 2023, a signal of maturity and broad production adoption.
Cilium handles:
- Pod-to-pod routing via eBPF instead of kube-proxy DNAT rules.
- NetworkPolicy and CiliumNetworkPolicy enforcement at L3, L4, and L7.
- Transparent encryption with WireGuard or IPsec.
- Multi-cluster mesh through Cluster Mesh.
- Observability through Hubble (built-in) and Tetragon (security enforcement).
Step 1 — Install Cilium with Helm (kube-proxy Replacement)
Before installing, ensure your cluster was bootstrapped without kube-proxy. With kubeadm, pass --skip-phases=addon/kube-proxy to kubeadm init. For managed clusters, consult your provider's documentation.
Add the Cilium Helm repository and install:
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium \
  --version 1.16.0 \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=<API_SERVER_IP> \
  --set k8sServicePort=6443 \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true
Replace <API_SERVER_IP> with the address your nodes use to reach the Kubernetes API server (often the control-plane node IP or a load-balancer IP).
The kubeProxyReplacement=true flag tells Cilium to handle all ClusterIP, NodePort, and LoadBalancer service translation via eBPF — eliminating kube-proxy entirely.
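If you generate the install command from a script, it helps to see how the dotted --set keys map onto a nested values structure. The following Python sketch flattens a nested dictionary into --set arguments. The helper name `to_set_flags` and the `CILIUM_VALUES` dictionary are hypothetical, the IP address is a documentation placeholder, and the sketch ignores Helm's escaping rules for commas and dots inside values.

```python
def to_set_flags(values: dict, prefix: str = "") -> list:
    """Flatten a nested values mapping into Helm '--set key=value' arguments."""
    flags = []
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flags.extend(to_set_flags(value, path))
        else:
            # Render booleans in lowercase, matching the flags in the command above.
            rendered = str(value).lower() if isinstance(value, bool) else str(value)
            flags.extend(["--set", f"{path}={rendered}"])
    return flags


# Same settings as the install command above; the IP is a placeholder.
CILIUM_VALUES = {
    "kubeProxyReplacement": True,
    "k8sServiceHost": "192.0.2.10",
    "k8sServicePort": 6443,
    "hubble": {"relay": {"enabled": True}, "ui": {"enabled": True}},
}
```

Calling `to_set_flags(CILIUM_VALUES)` yields the five --set pairs used in the install command. For anything beyond a handful of settings, a values file passed with -f is usually easier to review than a long list of flags.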
Step 2 — Verify Cilium Status
Install the Cilium CLI:
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
curl -L --remote-name-all \
  "https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-amd64.tar.gz"
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
Check overall health:
cilium status --wait
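You should see Cilium, Operator, and Hubble Relay reported as OK. To confirm that kube-proxy replacement is active, run kubectl -n kube-system exec ds/cilium -- cilium-dbg status and look for a KubeProxyReplacement line reporting True (older releases reported Strict).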
Run the built-in connectivity test (creates temporary namespaces with echo servers and clients):
cilium connectivity test
A passing test suite confirms that pod-to-pod, pod-to-service, and egress traffic all work correctly through the eBPF datapath.
Step 3 — NetworkPolicy with Cilium (L3/L4/L7)
Block All Ingress by Default
Apply a default-deny policy to a namespace before adding allow rules:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
Allow Specific Pods on Specific Ports
Permit the frontend pods to reach backend pods on port 8080:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
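To see how the default-deny policy and the allow rule combine, here is a simplified Python model of ingress evaluation. It is a sketch of the Kubernetes NetworkPolicy semantics, not of Cilium's datapath: it assumes a single namespace, TCP only, and matchLabels selectors only, and the dictionary layout and function names are invented for this example.

```python
def selects(selector: dict, labels: dict) -> bool:
    """An empty selector matches every pod; otherwise all pairs must match."""
    return all(labels.get(k) == v for k, v in selector.items())


def ingress_allowed(policies: list, src: dict, dst: dict, port: int) -> bool:
    """Evaluate ingress to a pod labelled `dst` from a pod labelled `src`."""
    applicable = [p for p in policies if selects(p["pod_selector"], dst)]
    if not applicable:
        return True  # no policy selects the pod: the Kubernetes default is allow
    for policy in applicable:
        for rule in policy.get("ingress", []):
            if selects(rule["from"], src) and port in rule["ports"]:
                return True  # allow rules are additive across policies
    return False


POLICIES = [
    {"pod_selector": {}},  # models default-deny-ingress (selects all, allows nothing)
    {   # models allow-frontend-to-backend
        "pod_selector": {"app": "backend"},
        "ingress": [{"from": {"app": "frontend"}, "ports": [8080]}],
    },
]
```

With both policies loaded, `ingress_allowed(POLICIES, {"app": "frontend"}, {"app": "backend"}, 8080)` is True, while any other source, destination, or port in the namespace is denied. The order in which the policies are applied does not matter, because allow rules only ever add permissions.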
L7 HTTP Policy with CiliumNetworkPolicy
Standard NetworkPolicy is limited to L3/L4. CiliumNetworkPolicy extends enforcement to L7. Allow only HTTP GET requests to /api/v1/health from the monitoring namespace:
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-get-health-only
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    - fromEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: monitoring
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: /api/v1/health
Cilium enforces this rule with an Envoy-based L7 proxy that is inserted transparently; no sidecar injection is required.
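The matching logic of the http rule can be approximated in Python as follows. This is a sketch under one stated assumption: that the method and path fields are treated as regular expressions that must match the whole value. Python's `re` module is not the same regex dialect the proxy uses, and the names `HTTP_RULES` and `http_allowed` are hypothetical.

```python
import re

# Mirrors the http rule in allow-get-health-only.
HTTP_RULES = [{"method": "GET", "path": "/api/v1/health"}]


def http_allowed(method: str, path: str, rules=HTTP_RULES) -> bool:
    """Return True if any rule matches both the method and the path."""
    for rule in rules:
        # Assumption: both fields are regexes matched against the whole value.
        if re.fullmatch(rule["method"], method) and re.fullmatch(rule["path"], path):
            return True
    return False
```

Under this model a GET to /api/v1/health is allowed, while a POST to the same path or a GET to /api/v1/users is rejected. Because the fields behave like patterns, a rule written as /api/v1/.* would be needed to cover a whole subtree.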
Step 4 — Hubble: Real-Time Network Flow Visibility
Hubble is Cilium's built-in observability layer. It records the network flows processed by the eBPF datapath in a bounded in-memory buffer and exposes them via gRPC, a CLI, and a web UI.
Hubble was already enabled in the Helm install above. Verify the relay is running:
kubectl -n kube-system get pods -l app.kubernetes.io/name=hubble-relay
Watch Live Traffic with hubble observe
Port-forward the Hubble relay and use the CLI:
kubectl port-forward -n kube-system svc/hubble-relay 4245:80 &
hubble observe --follow
Filter to a specific namespace and only dropped flows:
hubble observe --namespace production --verdict DROPPED --follow
Each line shows the source pod, destination pod, L4 protocol, L7 method/path (when available), and the policy verdict (FORWARDED, DROPPED, REDIRECTED).
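For anything beyond eyeballing the stream, you can ask the CLI for JSON (hubble observe -o json) and post-process it. The Python sketch below counts dropped flows per source and destination pod. It assumes each output line is a JSON object with the flow either at the top level or under a "flow" key, and that pods appear as `pod_name` under `source` and `destination`; verify these field names against your Hubble version. The function name `count_drops` is hypothetical.

```python
import json
from collections import Counter


def count_drops(lines) -> Counter:
    """Count DROPPED flows per (source pod, destination pod) from JSON lines."""
    drops = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        flow = record.get("flow", record)  # tolerate wrapped and bare flow objects
        if flow.get("verdict") != "DROPPED":
            continue
        src = flow.get("source", {}).get("pod_name", "unknown")
        dst = flow.get("destination", {}).get("pod_name", "unknown")
        drops[(src, dst)] += 1
    return drops
```

Passing `sys.stdin` as `lines` lets you pipe the CLI output straight into the function, and `drops.most_common(10)` then lists the pod pairs that hit policy denials most often.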
Hubble UI
Port-forward the UI service and open it in a browser:
kubectl port-forward -n kube-system svc/hubble-ui 12000:80
Navigate to http://localhost:12000. Select a namespace to see a live service-dependency graph with per-flow drill-down. This is the fastest way to identify which microservices are talking to each other and which flows are being silently dropped by policy.
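Conceptually, the dependency graph is just an aggregation over flows. The Python sketch below builds a comparable adjacency map from already-parsed flow dictionaries. It is an illustration rather than the UI's actual algorithm: it groups by pod instead of by service, and the field names (`verdict`, `source`, `destination`, `namespace`, `pod_name`) are assumed to match Hubble's JSON output.

```python
from collections import defaultdict


def build_service_graph(flows) -> dict:
    """Map each source workload to the set of destinations it reached."""
    graph = defaultdict(set)
    for flow in flows:
        if flow.get("verdict") != "FORWARDED":
            continue  # only count traffic that was actually delivered
        src = flow.get("source", {})
        dst = flow.get("destination", {})
        src_name = f'{src.get("namespace", "?")}/{src.get("pod_name", "?")}'
        dst_name = f'{dst.get("namespace", "?")}/{dst.get("pod_name", "?")}'
        graph[src_name].add(dst_name)
    return dict(graph)
```

Comparing this graph between two points in time is a cheap way to spot a new, unexpected dependency before writing a policy for it.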
Step 5 — Tetragon: Runtime Security Enforcement
Tetragon is a Cilium sub-project that uses eBPF to observe and, critically, block malicious activity at the system-call level. Unlike user-space admission controllers, Tetragon enforces inside the kernel, so an unprivileged container process cannot simply sidestep it.
Install Tetragon:
helm install tetragon cilium/tetragon \
  --namespace kube-system \
  --set tetragon.grpc.address=localhost:54321
TracingPolicy: Detect and Kill /etc/shadow Reads
A TracingPolicy defines which kernel hooks to instrument and what action to take on a match (for example Post to only report, Override to fail the call, or Sigkill to terminate the process). The following policy generates a Tetragon event and kills the offending process whenever any process attempts to open a path that begins with /etc/shadow:
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: detect-shadow-read
spec:
  kprobes:
    - call: security_file_open
      syscall: false
      args:
        - index: 0
          type: file
      selectors:
        - matchArgs:
            - index: 0
              operator: Prefix
              values:
                - /etc/shadow
          matchActions:
            - action: Sigkill
The Sigkill action sends SIGKILL to the offending process from inside the kernel, with no user-space round-trip. Stream Tetragon events with:
kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon \
  -c export-stdout -f | tetra getevents -o compact
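If you want to feed these events into your own alerting rather than read them in compact form, you can consume the raw JSON lines from the export-stdout container instead of piping them through tetra. The Python sketch below picks out the security_file_open events. The field names (`process_kprobe`, `function_name`, `args`, `file_arg`, `path`, `action`, `process.binary`) are assumptions about the event schema and should be checked against the events your Tetragon version emits; `shadow_events` is a hypothetical helper.

```python
import json


def shadow_events(lines, prefix: str = "/etc/shadow"):
    """Yield (binary, path, action) for kprobe events on paths starting with prefix."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line).get("process_kprobe")
        if not event or event.get("function_name") != "security_file_open":
            continue
        for arg in event.get("args", []):
            path = arg.get("file_arg", {}).get("path", "")
            if path.startswith(prefix):  # prefix match, like the Prefix operator
                binary = event.get("process", {}).get("binary", "unknown")
                yield binary, path, event.get("action", "unknown")
```

Note that a prefix match also covers paths such as /etc/shadow-, the backup copy some distributions keep, which is usually what you want for this policy.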
eBPF vs iptables Performance Comparison
| Metric | iptables | eBPF (Cilium) |
|---|---|---|
| Service lookup complexity | O(n) chain traversal | O(1) hash-map lookup |
| Rules for 5,000 services | ~50,000 iptables rules | ~5,000 BPF map entries |
| Connection setup latency (p99) | 2–10 ms at scale | < 0.5 ms |
| Connection tracking | Global conntrack table (lock) | Per-CPU BPF maps (lock-free) |
| L7 visibility | None (requires sidecar) | Native via eBPF + Envoy |
| Policy enforcement layers | L3/L4 only | L3 / L4 / L7 |
| Kernel module required | No | No (eBPF bytecode only) |
| kube-proxy dependency | Required | Optional (full replacement) |
FAQ
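Do I need a custom kernel? Usually not. Cilium 1.16 requires Linux 5.4 or newer (earlier Cilium releases supported 4.19), so check the system requirements page for the release you install; 5.10+ is recommended for Tetragon TracingPolicy with blocking actions. Current node images from the major cloud providers generally ship a compatible kernel.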
Can I migrate an existing cluster from kube-proxy to Cilium? Yes, but it requires draining nodes one at a time. Delete the kube-proxy DaemonSet only after Cilium is healthy on every node. Cilium's documentation provides a node-by-node migration guide that keeps the cluster available throughout.
Does Cilium replace a service mesh like Istio? For many use cases, yes. Cilium provides transparent encryption (via WireGuard or IPsec, which is node-level encryption rather than per-workload mTLS), L7 policy, and observability without sidecars. Istio adds richer traffic management (retries, circuit breaking, weighted routing) that Cilium does not yet fully replicate.
What happens to NetworkPolicy objects I already have? Cilium is fully compatible with the standard networking.k8s.io/v1 NetworkPolicy API. Existing policies continue to work. CiliumNetworkPolicy is an additive extension — you adopt it only for L7 features.
Is Tetragon production-ready? Tetragon reached v1.0 in late 2023 and is used in production by multiple large enterprises. The Sigkill action should be tested in staging first; start with action: Post (observe only) and promote to blocking after validating false-positive rates.