eBPF and Cilium Tutorial 2026: Zero-Overhead Kubernetes Networking, Hubble, and Tetragon

Modern Kubernetes clusters carry thousands of pods, dozens of services, and compliance requirements that simply did not exist when iptables was designed. This tutorial walks you through replacing the traditional Linux networking stack with eBPF-powered Cilium, enabling real-time flow visibility with Hubble, and adding kernel-level runtime security with Tetragon — all on a production-grade cluster.

What Is eBPF?

eBPF (Extended Berkeley Packet Filter) lets you run sandboxed programs inside the Linux kernel without writing a kernel module or patching kernel source code. Think of it as a safe, programmable hook that attaches to kernel events — network packets, system calls, file operations — and reacts to them at near-native speed.

Before a program runs in the kernel, the eBPF verifier checks it exhaustively: no unbounded loops, no out-of-bounds memory access, no ability to crash the kernel. Once verified, the JIT compiler turns the bytecode into native machine instructions. The result is code that runs in the same address space as the kernel itself, with none of the overhead of context switching to user space.

Key eBPF primitives you will encounter with Cilium:

  • Maps: typed key/value stores shared between kernel and user space (hash maps, LRU maps, per-CPU arrays).
  • Programs: small functions attached to hooks such as XDP (eXpress Data Path, which runs in the network driver before the kernel allocates a socket buffer), tc (traffic control), and kprobes or tracepoints on kernel functions and system calls.
  • Tail calls: allow a program to jump into another eBPF program, keeping each one within the verifier's complexity limit.
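To make the map and program primitives above more concrete, here is a minimal Python sketch of how a per-CPU map behaves. It is a toy model, not real eBPF: the class name PerCpuMap, the function count_packet, and the four-CPU setup are all hypothetical. It only illustrates the idea that each CPU writes to its own slot while a user-space reader sums the slots.

```python
from collections import defaultdict

NUM_CPUS = 4  # hypothetical core count for this sketch


class PerCpuMap:
    """Toy stand-in for a per-CPU map: one private slot per CPU per key."""

    def __init__(self, num_cpus):
        self.slots = [defaultdict(int) for _ in range(num_cpus)]

    def increment(self, cpu, key, amount=1):
        # "Kernel side": a program running on `cpu` touches only its own
        # slot, so CPUs never contend for the same counter.
        self.slots[cpu][key] += amount

    def read(self, key):
        # "User-space side": the reader sees one value per CPU and sums them.
        return sum(slot[key] for slot in self.slots)


def count_packet(pkt_map, cpu, packet):
    """Toy 'program' attached to a hook: count packets per destination port."""
    pkt_map.increment(cpu, packet["dport"])


packets = [{"dport": 80}, {"dport": 443}, {"dport": 80}, {"dport": 80}]
pkt_map = PerCpuMap(NUM_CPUS)
for i, packet in enumerate(packets):
    # Spread packets across CPUs round-robin, as a rough stand-in for RSS.
    count_packet(pkt_map, cpu=i % NUM_CPUS, packet=packet)

print(pkt_map.read(80), pkt_map.read(443))
```

The design point is the split of responsibilities: the write path stays cheap because it is CPU-local, and the cost of aggregation moves to the much less frequent read path.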

Why eBPF Beats iptables

iptables was designed for host firewalls, not for clusters with 10,000 pods and services that change every few seconds. The problems are structural:

  • O(n) rule traversal. Every packet walks a linear chain of rules. A cluster with 5,000 services generates tens of thousands of iptables rules. Latency grows linearly; connection setup time spikes.
  • No connection-level visibility. iptables can match packets but cannot tell you which pod initiated a TCP flow, what HTTP method was used, or how long a request took.
  • Conntrack contention. The kernel connection-tracking (conntrack) table is shared by every CPU, and its locks become contended under heavy traffic. At scale, this becomes a bottleneck that adding CPU cores does not fix.

eBPF solves each of these:

  • O(1) hash-map lookups. Cilium encodes policy decisions into BPF hash maps. A policy lookup is a single map read regardless of how many rules exist in the cluster.
  • Full network visibility. eBPF programs run at the socket layer, the TC layer, and the XDP layer. They see L3, L4, and — via user-space proxying — L7 metadata simultaneously.
  • Lock-free per-CPU maps. Per-CPU BPF maps eliminate global lock contention and scale linearly with core count.
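The difference between walking a rule chain and reading a hash map can be sketched in a few lines of Python. This is only a toy model of the complexity argument above, not how iptables or Cilium store their state; the function names, the 10.96.x.x addresses, and the backend names are invented for illustration.

```python
def build_rule_chain(num_services):
    """Ordered list of (cluster_ip, port, backend), like a linear rule chain."""
    return [
        (f"10.96.{i // 256}.{i % 256}", 80, f"backend-{i}")
        for i in range(num_services)
    ]


def linear_lookup(chain, ip, port):
    """Walk the chain rule by rule; return (backend, comparisons made)."""
    for comparisons, (rule_ip, rule_port, backend) in enumerate(chain, start=1):
        if rule_ip == ip and rule_port == port:
            return backend, comparisons
    return None, len(chain)


def build_service_map(chain):
    """The same data keyed by (ip, port), like a hash map."""
    return {(ip, port): backend for ip, port, backend in chain}


chain = build_rule_chain(5000)
service_map = build_service_map(chain)

# Worst case for the chain: the service whose rule was added last.
last_ip = chain[-1][0]
backend_linear, comparisons = linear_lookup(chain, last_ip, 80)
backend_hashed = service_map[(last_ip, 80)]  # one keyed read, no traversal

print(backend_linear == backend_hashed, comparisons)
```

Both lookups return the same backend, but the chain walk needs one comparison per rule ahead of the match, while the keyed read does not depend on how many services exist.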

Cilium: The eBPF-Native CNI

Cilium is the Container Network Interface (CNI) plugin that translates Kubernetes networking semantics into eBPF programs. It underpins GKE Dataplane V2 (the default on GKE Autopilot), is the default CNI in Amazon EKS Anywhere, and is offered on AKS as Azure CNI Powered by Cilium. The CNCF graduated Cilium in 2023, its signal that a project is mature and widely adopted in production.

Cilium handles:

  • Pod-to-pod routing via eBPF instead of kube-proxy DNAT rules.
  • NetworkPolicy and CiliumNetworkPolicy enforcement at L3, L4, and L7.
  • Transparent encryption with WireGuard or IPsec.
  • Multi-cluster mesh through Cluster Mesh.
  • Observability through Hubble (built in), with runtime security available from the companion Tetragon project.

Step 1 — Install Cilium with Helm (kube-proxy Replacement)

Before installing, ensure your cluster was bootstrapped without kube-proxy. For kubeadm, pass --skip-phases=addon/kube-proxy. For managed clusters, consult provider documentation.

Add the Cilium Helm repository and install:

helm repo add cilium https://helm.cilium.io/
helm repo update

helm install cilium cilium/cilium \
  --version 1.16.0 \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=<API_SERVER_IP> \
  --set k8sServicePort=6443 \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true

Replace <API_SERVER_IP> with the address your nodes use to reach the Kubernetes API server (often the control-plane node IP or a load-balancer IP).

The kubeProxyReplacement=true flag tells Cilium to handle all ClusterIP, NodePort, and LoadBalancer service translation via eBPF — eliminating kube-proxy entirely.

Step 2 — Verify Cilium Status

Install the Cilium CLI:

CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
curl -L --remote-name-all \
  https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-amd64.tar.gz
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin

Check overall health:

cilium status --wait

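You should see Cilium, the Operator, and Hubble Relay all reported as OK. With Cilium 1.16, kube-proxy replacement is a true/false setting (older releases reported it as Strict); the agent's own status output, for example from kubectl -n kube-system exec ds/cilium -- cilium-dbg status, should list KubeProxyReplacement as True.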

Run the built-in connectivity test (creates temporary namespaces with echo servers and clients):

cilium connectivity test

A passing test suite confirms that pod-to-pod, pod-to-service, and egress traffic all work correctly through the eBPF datapath.

Step 3 — NetworkPolicy with Cilium (L3/L4/L7)

Block All Ingress by Default

Apply a default-deny policy to a namespace before adding allow rules:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress

Allow Specific Pods on Specific Ports

Permit the frontend pods to reach backend pods on port 8080:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
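How the default-deny policy and the allow rule combine can be sketched as a small Python model. This is a deliberately simplified reading of NetworkPolicy semantics, assuming a single namespace, matchLabels selectors only, and TCP ports only; the function names and the dictionary layout are hypothetical and do not mirror any real API.

```python
def selects(selector, labels):
    """An empty selector matches every pod; otherwise all pairs must match."""
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(policies, src_labels, dst_labels, port):
    """Toy evaluation of ingress policies inside one namespace."""
    applicable = [p for p in policies if selects(p["pod_selector"], dst_labels)]
    if not applicable:
        return True  # no policy selects the pod, so it is not isolated
    for policy in applicable:
        for rule in policy.get("ingress", []):
            from_ok = any(selects(sel, src_labels) for sel in rule["from"])
            if from_ok and port in rule["ports"]:
                return True  # policies are additive: one match is enough
    return False  # isolated, and no rule allowed this traffic


policies = [
    {"name": "default-deny-ingress", "pod_selector": {}, "ingress": []},
    {
        "name": "allow-frontend-to-backend",
        "pod_selector": {"app": "backend"},
        "ingress": [{"from": [{"app": "frontend"}], "ports": [8080]}],
    },
]

print(ingress_allowed(policies, {"app": "frontend"}, {"app": "backend"}, 8080))
print(ingress_allowed(policies, {"app": "batch"}, {"app": "backend"}, 8080))
```

The sketch shows why the order of applying the two manifests does not matter: the deny policy only isolates pods, and any allow rule that matches reopens exactly the traffic it names.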

L7 HTTP Policy with CiliumNetworkPolicy

Standard NetworkPolicy is limited to L3/L4. CiliumNetworkPolicy extends enforcement to L7. Allow only HTTP GET requests to /api/v1/health from the monitoring namespace:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-get-health-only
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    - fromEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: monitoring
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: /api/v1/health

Cilium enforces this rule with an Envoy-based L7 proxy that is inserted transparently; no sidecar injection is required.
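The matching logic of the HTTP rule above can be approximated in Python. This sketch assumes that both method and path are treated as regular expressions that must match the whole value; Cilium documents them as extended POSIX regexes, so Python's re module is only an approximation, and the function name http_allowed is hypothetical.

```python
import re


def http_allowed(rules, method, path):
    """Allow a request if any rule matches both its method and its path."""
    for rule in rules:
        # A missing field is treated as "match anything" in this sketch.
        method_ok = re.fullmatch(rule.get("method", ".*"), method) is not None
        path_ok = re.fullmatch(rule.get("path", ".*"), path) is not None
        if method_ok and path_ok:
            return True
    return False


rules = [{"method": "GET", "path": "/api/v1/health"}]

for method, path in [
    ("GET", "/api/v1/health"),
    ("POST", "/api/v1/health"),
    ("GET", "/api/v1/health/live"),
]:
    print(method, path, http_allowed(rules, method, path))
```

Under that whole-value assumption, a request to /api/v1/health/live is rejected by this rule; allowing sub-paths would need a pattern such as /api/v1/health(/.*)? instead.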

Step 4 — Hubble: Real-Time Network Flow Visibility

Hubble is Cilium's built-in observability layer. It records every network flow processed by eBPF and exposes them via gRPC, a CLI, and a web UI.

Hubble was already enabled in the Helm install above. Verify the relay is running:

kubectl -n kube-system get pods -l app.kubernetes.io/name=hubble-relay

Watch Live Traffic with hubble observe

Port-forward the Hubble relay and use the hubble CLI (a separate binary from the Cilium CLI, so install it first):

kubectl port-forward -n kube-system svc/hubble-relay 4245:80 &
hubble observe --follow

Filter to a specific namespace and only dropped flows:

hubble observe --namespace production --verdict DROPPED --follow

Each line shows the source pod, destination pod, L4 protocol, L7 method/path (when available), and the policy verdict (FORWARDED, DROPPED, REDIRECTED).
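For quick triage it can help to aggregate drops per source pod. The Python sketch below assumes flows arrive as JSON lines, one object per line, with a flow object holding verdict, source, and destination fields; that layout and the sample records are assumptions for illustration, so check them against the JSON your Hubble version actually emits.

```python
import json
from collections import Counter

# Hypothetical sample lines; real output carries many more fields.
SAMPLE = """\
{"flow": {"verdict": "DROPPED", "source": {"namespace": "production", "pod_name": "batch-1"}, "destination": {"namespace": "production", "pod_name": "backend-1"}}}
{"flow": {"verdict": "FORWARDED", "source": {"namespace": "production", "pod_name": "frontend-1"}, "destination": {"namespace": "production", "pod_name": "backend-1"}}}
{"flow": {"verdict": "DROPPED", "source": {"namespace": "production", "pod_name": "batch-1"}, "destination": {"namespace": "production", "pod_name": "backend-2"}}}
"""


def count_drops(lines):
    """Count DROPPED flows per source pod from JSON-lines flow records."""
    drops = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        flow = record.get("flow", record)  # tolerate wrapped or bare flows
        if flow.get("verdict") != "DROPPED":
            continue
        source = flow.get("source", {})
        name = f'{source.get("namespace", "?")}/{source.get("pod_name", "?")}'
        drops[name] += 1
    return drops


drops = count_drops(SAMPLE.splitlines())
print(drops.most_common())
```

In practice the same function could read sys.stdin with the output of hubble observe -o json piped into it, provided the field names match.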

Hubble UI

Port-forward the UI service and open it in a browser:

kubectl port-forward -n kube-system svc/hubble-ui 12000:80

Navigate to http://localhost:12000. Select a namespace to see a live service-dependency graph with per-flow drill-down. This is the fastest way to identify which microservices are talking to each other and which flows are being silently dropped by policy.

Step 5 — Tetragon: Runtime Security Enforcement

Tetragon is a Cilium sub-project that uses eBPF to observe and, critically, block malicious activity at the system-call level. Unlike user-space admission controllers, Tetragon enforces inside the kernel, so a container process cannot sidestep it the way it might evade a user-space agent.

Install Tetragon:

helm install tetragon cilium/tetragon \
  --namespace kube-system \
  --set tetragon.grpc.address=localhost:54321

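TracingPolicy: Detect and Kill /etc/shadow Reads

A TracingPolicy defines which kernel hooks to instrument and what action to take, such as Post (report only), Override (make the call return an error), or Sigkill (terminate the process). The following policy generates a Tetragon event and kills the process whenever any process on the node, whether in a pod or on the host, opens a path that begins with /etc/shadow: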

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: detect-shadow-read
spec:
  kprobes:
    - call: security_file_open
      syscall: false
      args:
        - index: 0
          type: file
      selectors:
        - matchArgs:
            - index: 0
              operator: Prefix
              values:
                - /etc/shadow
          matchActions:
            - action: Sigkill

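The Sigkill action sends SIGKILL to the offending process from inside the kernel, with no user-space round-trip. Because it also kills legitimate readers of /etc/shadow, such as login or passwd tooling, scope it carefully before enabling it. Stream Tetragon events with:

kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon \
  -c export-stdout -f | tetra getevents -o compact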
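One consequence of the Prefix operator in the policy above is easy to miss: it matches more than the exact file. The Python sketch below models a prefix match with plain string comparison; it is an illustration of the semantics under that assumption, not Tetragon's implementation, and matches_prefix is a hypothetical name.

```python
def matches_prefix(path, prefixes):
    """Toy model of a Prefix match: true if the path starts with any value."""
    return any(path.startswith(prefix) for prefix in prefixes)


POLICY_VALUES = ["/etc/shadow"]  # the value used in the policy above

for path in ["/etc/shadow", "/etc/shadow-", "/etc/shadow.bak", "/etc/passwd"]:
    print(path, matches_prefix(path, POLICY_VALUES))
```

Backup files such as /etc/shadow- are caught as well, which is usually desirable here. If only the exact path should match, an equality operator would be the closer fit; confirm the available operators in the Tetragon selector documentation.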

eBPF vs iptables Performance Comparison

Metric | iptables | eBPF (Cilium)
--- | --- | ---
Service lookup complexity | O(n) chain traversal | O(1) hash-map lookup
Rules for 5,000 services | ~50,000 iptables rules | ~5,000 BPF map entries
Connection setup latency (p99) | 2–10 ms at scale | < 0.5 ms
Connection tracking | Global conntrack table (lock) | Per-CPU BPF maps (lock-free)
L7 visibility | None (requires sidecar) | Native via eBPF + Envoy
Policy enforcement layers | L3/L4 only | L3 / L4 / L7
Kernel module required | No | No (eBPF bytecode only)
kube-proxy dependency | Required | Optional (full replacement)

FAQ

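Do I need a custom kernel? Usually not. Cilium 1.16, the version installed above, lists Linux 5.4 as its minimum kernel (older Cilium releases ran on 4.19), and 5.10+ is recommended for Tetragon TracingPolicy with blocking actions. Current node images from the major cloud providers generally ship a compatible kernel, but check the system requirements for your exact versions.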
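A quick way to compare a node's kernel against a documented minimum is to parse the release string. The Python sketch below is a minimal version of that check; the function names are hypothetical, and the minimum you pass in should come from the system requirements of the Cilium and Tetragon versions you run.

```python
import re


def parse_kernel_release(release):
    """Extract (major, minor) from a string such as '5.15.0-105-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release, minimum):
    """Compare a kernel release string against a (major, minor) minimum."""
    return parse_kernel_release(release) >= minimum


# Example release strings; (5, 10) is the Tetragon recommendation quoted above.
for release in ["5.15.0-105-generic", "4.19.0-25-amd64", "6.1.0-18-amd64"]:
    print(release, meets_minimum(release, (5, 10)))
```

On a node, platform.release() from the standard library returns the string to feed in. Treat the result as a first pass only: distribution kernels with backported features can satisfy a requirement despite a lower version number.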

Can I migrate an existing cluster from kube-proxy to Cilium? Yes, but it requires draining nodes one at a time. Delete the kube-proxy DaemonSet only after Cilium is healthy on every node. Cilium's documentation provides a node-by-node migration guide that keeps the cluster available throughout.

Does Cilium replace a service mesh like Istio? For many use cases, yes. Cilium handles transparent encryption (via WireGuard or IPsec), L7 policy, and observability without sidecars. Note that WireGuard encrypts traffic between nodes; it is not mTLS with per-workload identities. Istio also adds richer traffic management (retries, circuit breaking, weighted routing) that Cilium does not yet fully replicate.

What happens to NetworkPolicy objects I already have? Cilium is fully compatible with the standard networking.k8s.io/v1 NetworkPolicy API. Existing policies continue to work. CiliumNetworkPolicy is an additive extension — you adopt it only for L7 features.

Is Tetragon production-ready? Tetragon reached v1.0 in late 2023 and is used in production by multiple large enterprises. The Sigkill action should be tested in staging first; start with action: Post (observe only) and promote to blocking after validating false-positive rates.

Leonardo Lazzaro

Software engineer and technical writer. 10+ years experience in DevOps, Python, and Linux systems.
