OpenTelemetry Python Tutorial 2026: Distributed Tracing, Custom Spans, and Context Propagation

OpenTelemetry is the second most active project in the Cloud Native Computing Foundation (CNCF), trailing only Kubernetes. Its Python SDK is mature, vendor-neutral, and exports to any OTLP-compatible backend — Jaeger, Grafana Tempo, SigNoz, Datadog, or your own collector — without touching instrumentation code. This tutorial focuses on the SDK itself: you will auto-instrument a FastAPI service, add manual spans for business-logic visibility, propagate trace context across two services using W3C headers, pass cross-cutting data with Baggage, and set up the OTel Collector for production fan-out.

1. Why Distributed Tracing?

Logs tell you what happened on a single machine. They cannot tell you what happened to a specific user request that crossed five services. Consider an e-commerce checkout that calls: API Gateway → Order Service → Inventory Service → Payment Service → Notification Service. A user reports "checkout is slow." Your logs show nothing obviously wrong on any individual service. The bottleneck could be a slow Postgres query in Order Service, a retry storm in Payment Service, or network latency between services in different availability zones.

Distributed tracing answers the question logs cannot: for this specific request, where did the time go? A trace is a tree of timed operations — called spans — that all share a single trace_id. The waterfall view in any tracing backend makes bottlenecks immediately visible: a span that occupies 900 ms of a 1,000 ms total trace is the culprit, even when it lives in service three of five.
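To make the "tree of timed spans" idea concrete, here is a minimal, stdlib-only sketch. The SpanRecord class, the service names, and the timings are all hypothetical; real OTel spans carry many more fields. It computes each span's self time (its duration minus its direct children) and picks the largest, which is roughly what your eye does on a waterfall view.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SpanRecord:
    """Hypothetical, simplified span used only for this illustration."""
    name: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_time_ms(span: SpanRecord, spans: List[SpanRecord]) -> int:
    """Duration of a span minus its direct children (assumes children do not overlap)."""
    children = sum(s.duration_ms for s in spans if s.parent_id == span.span_id)
    return span.duration_ms - children


def slowest_span(spans: List[SpanRecord]) -> SpanRecord:
    """Return the span where the most time was actually spent."""
    return max(spans, key=lambda s: self_time_ms(s, spans))


# One made-up checkout trace: every span shares the same (implicit) trace_id.
checkout_trace = [
    SpanRecord("api-gateway: GET /checkout", "a1", None, 0, 1000),
    SpanRecord("order-service: create order", "b2", "a1", 10, 990),
    SpanRecord("inventory-service: reserve", "c3", "b2", 20, 60),
    SpanRecord("payment-service: charge card", "d4", "b2", 60, 960),
    SpanRecord("notification-service: send email", "e5", "b2", 960, 985),
]

culprit = slowest_span(checkout_trace)
print(culprit.name, self_time_ms(culprit, checkout_trace), "ms")
```

The root span lasts the full 1,000 ms, but almost all of that is time spent waiting on its descendants, which is why self time is the more useful number.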

Metrics give you aggregate rates and percentiles. Logs give you point-in-time event detail. Traces give you causality and duration across process boundaries. The three signals are complementary; traces supply the cross-service view that logs and metrics cannot.

2. OpenTelemetry Core Concepts

Before installing anything, it helps to understand the OTel model:

Concept	Description
Trace	The complete end-to-end journey for one request, represented as a tree of spans that share a `trace_id`.
Span	A single timed operation. Has a name, start/end timestamps, attributes (indexed key-value pairs), events (timestamped log-like messages), and a status.
TraceContext	The W3C standard defining how `trace_id` and `span_id` travel between processes — primarily via the `traceparent` HTTP header.
Propagator	Serialises and deserialises trace context into/out of a carrier (HTTP headers, message-queue metadata).
Exporter	Ships completed spans to a backend over OTLP (OpenTelemetry Protocol), Zipkin, or vendor-specific protocols. Jaeger now ingests OTLP natively, so the older Jaeger Thrift exporters are no longer needed.
Collector	An optional proxy that receives spans from your application, applies processing (batching, sampling, attribute enrichment), and fans them out to one or more backends.

The Python SDK separates opentelemetry-api from opentelemetry-sdk by design. Libraries instrument against the API; only the application configures the SDK. This means switching backends requires only changing the exporter configuration — no library code changes.

3. Install the OTel SDK

Create a virtual environment and install the necessary packages:

python -m venv .venv
source .venv/bin/activate

# Core SDK, API, and OTLP exporter
pip install opentelemetry-sdk opentelemetry-api opentelemetry-exporter-otlp

# Auto-instrumentation for FastAPI and the requests library
pip install opentelemetry-instrumentation-fastapi opentelemetry-instrumentation-requests

# FastAPI runtime
# Quote the extras specifier so shells such as zsh do not treat [] as a glob
pip install fastapi "uvicorn[standard]" requests

Pin versions in requirements.txt. At the time of writing, the current stable release is opentelemetry-sdk==1.25.0.

The opentelemetry-exporter-otlp meta-package installs both gRPC (opentelemetry-exporter-otlp-proto-grpc) and HTTP (opentelemetry-exporter-otlp-proto-http) variants. gRPC over port 4317 is the standard for internal services; HTTP over port 4318 is available when gRPC is blocked by a proxy.
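The choice between the two transports can be sketched as a small factory. This is one possible arrangement, not the SDK's own mechanism: the helper names otlp_traces_endpoint and build_span_exporter are hypothetical, and reading OTEL_EXPORTER_OTLP_PROTOCOL by hand here is a simplification. The detail worth noticing is that an endpoint passed in code to the HTTP exporter must include the /v1/traces path, whereas the gRPC exporter takes only host and port.

```python
import os


def otlp_traces_endpoint(host: str, protocol: str) -> str:
    """Build the traces endpoint for a transport (hypothetical helper)."""
    if protocol == "grpc":
        return f"http://{host}:4317"
    if protocol == "http/protobuf":
        # The HTTP exporter uses an endpoint given in code as-is,
        # so the signal path has to be spelled out.
        return f"http://{host}:4318/v1/traces"
    raise ValueError(f"unsupported OTLP protocol: {protocol!r}")


def build_span_exporter(host: str = "localhost", protocol: str = ""):
    """Return a gRPC or HTTP OTLP span exporter (sketch; imports are lazy)."""
    protocol = protocol or os.environ.get("OTEL_EXPORTER_OTLP_PROTOCOL", "grpc")
    endpoint = otlp_traces_endpoint(host, protocol)
    if protocol == "grpc":
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
            OTLPSpanExporter,
        )
        return OTLPSpanExporter(endpoint=endpoint, insecure=True)
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
        OTLPSpanExporter,
    )
    return OTLPSpanExporter(endpoint=endpoint)


print(otlp_traces_endpoint("localhost", "grpc"))
print(otlp_traces_endpoint("localhost", "http/protobuf"))
```

Both exporter classes share the name OTLPSpanExporter and differ only in module path, so the rest of the provider setup stays identical.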

4. Auto-Instrumentation of a FastAPI App

Create service_a.py:

import requests
from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource, SERVICE_NAME
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.instrumentation.requests import RequestsInstrumentor

# Configure TracerProvider with OTLP exporter
# Resource.create() also merges in the SDK defaults and OTEL_RESOURCE_ATTRIBUTES
resource = Resource.create({SERVICE_NAME: "order-service"})
exporter = OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

app = FastAPI()

# Patch FastAPI ASGI middleware — every request becomes a span automatically
FastAPIInstrumentor.instrument_app(app)

# Patch the requests library — every outbound HTTP call becomes a child span
# and the traceparent header is injected automatically
RequestsInstrumentor().instrument()


@app.get("/checkout/{order_id}")
def checkout(order_id: int):
    # This HTTP call is automatically traced and carries traceparent
    response = requests.get(f"http://localhost:8001/inventory/{order_id}")
    return {"order": order_id, "inventory": response.json()}

Run with uvicorn service_a:app --port 8000. Every incoming request now creates a server span recording the HTTP method, route, status code, and timing; it becomes the root of a new trace unless the caller sent a traceparent header. Every outgoing requests call creates a child client span and injects the W3C traceparent header automatically.

BatchSpanProcessor is the correct choice for production: it buffers spans in memory and flushes them in batches off the request critical path. Use SimpleSpanProcessor only during development or tests.

The resource object tags every span from this process with service.name: order-service. Without this tag, data from different services is indistinguishable in the backend.

5. Custom Spans (Manual Instrumentation)

Auto-instrumentation covers framework-level boundaries. For business logic — a complex database query, a third-party API call outside the instrumented libraries, a multi-step pricing calculation — you need manual spans:

from opentelemetry import trace
from opentelemetry.trace import Status, StatusCode

tracer = trace.get_tracer(__name__)


def fetch_user_orders(user_id: int) -> list:
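    # The except block below owns error reporting, so switch off the context
    # manager's built-in exception handling to avoid recording it twice.
    with tracer.start_as_current_span(
        "db.fetch_user_orders",
        record_exception=False,
        set_status_on_exception=False,
    ) as span: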
        # Attributes are indexed key-value pairs for filtering and grouping
        span.set_attribute("db.system", "postgresql")
        span.set_attribute("db.table", "orders")
        span.set_attribute("user.id", user_id)

        try:
            results = run_heavy_query(user_id)  # your actual DB call

            # Events are timestamped log-like messages within a span
            span.add_event("query_complete", {"row_count": len(results)})
            return results

        except Exception as exc:
            # Mark the span as errored — appears red in the waterfall UI
            span.set_status(Status(StatusCode.ERROR, str(exc)))
            span.record_exception(exc)   # attaches the full stack trace
            raise

start_as_current_span automatically parents the span to whatever span is currently active in the context. If this function is called from inside a FastAPI request handler, this span appears nested under the auto-instrumented HTTP span in the trace waterfall.

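Attribute names should follow the OTel Semantic Conventions wherever one exists: db.system, http.method, user.id, and so on. The conventions are versioned and some names have changed over time (http.method became http.request.method in the stable HTTP conventions), so check which version your instrumentation libraries and backend target. Using shared names means your dashboards and queries work across services without per-team naming negotiation.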

6. Context Propagation Across Services

Context propagation is what turns isolated spans into a coherent distributed trace. The W3C traceparent header carries the trace ID and parent span ID across process boundaries:

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
             ^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^ ^^
             version       trace-id (128-bit)     span-id (64-bit) flags
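The four fields are easy to pull apart by hand, which is useful when debugging a proxy that drops or mangles the header. The parser below is a stdlib-only sketch for version 00 of the format; the function and class names are hypothetical, and in application code the SDK's propagators do this for you.

```python
import re
from typing import NamedTuple

# version-traceid-parentid-flags, lowercase hex, fixed widths (version 00 layout)
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> TraceParent:
    """Split a traceparent header into its fields, rejecting invalid values."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, parent_id, flags = match.groups()
    if version == "ff":
        raise ValueError("version ff is not allowed")
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError("all-zero trace-id or parent-id is invalid")
    # Bit 0 of the flags byte is the 'sampled' flag.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


parsed = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
print(parsed)
```

The span-id field is the ID of the caller's span, which becomes the parent of the first span created on the receiving side.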

When RequestsInstrumentor is active, propagation is automatic. For code paths that bypass the instrumented libraries — background workers, message-queue consumers, manual HTTP sessions — do it explicitly:

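import requests
from fastapi import FastAPI, Request
from opentelemetry import trace
from opentelemetry.propagate import inject, extract

# Both halves are shown in one listing for brevity; in practice each
# service runs in its own process with its own FastAPI app.
app = FastAPI()
tracer = trace.get_tracer(__name__)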

# SERVICE A: inject context into outgoing headers
def call_service_b(order_id: int) -> dict:
    headers = {}
    inject(headers)          # writes traceparent (and tracestate) into the dict
    response = requests.get(
        f"http://service-b/process/{order_id}",
        headers=headers,
    )
    return response.json()


# SERVICE B: extract context from incoming headers
@app.get("/process/{order_id}")
def process_order(order_id: int, request: Request):
    ctx = extract(dict(request.headers))
    with tracer.start_as_current_span("process_order", context=ctx) as span:
        span.set_attribute("order.id", order_id)
        # ... business logic ...
        return {"status": "processed"}

With FastAPIInstrumentor active in Service B, the extraction happens automatically for all HTTP endpoints — the manual form above is shown for clarity and for non-HTTP entry points.

When you trigger GET /checkout/42 on Service A and then search for that trace in Jaeger, you will see one continuous waterfall: the root span from Service A, the outgoing HTTP child span, and beneath it the spans from Service B — all sharing the same 32-hex-character trace_id.

The tracestate header carries vendor-specific metadata alongside the W3C context. Compliant systems propagate it with the request, changing only their own entries.

7. Baggage: Passing Data Through the Trace

Baggage lets you attach key-value pairs to the trace context that automatically propagate to all downstream services without appearing as span attributes unless you explicitly read and set them.

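from opentelemetry import trace
from opentelemetry.baggage import set_baggage, get_baggage
from opentelemetry.propagate import inject

# At the entry point (e.g., API Gateway or authentication middleware).
# set_baggage returns a new Context; it does not modify the current one.
ctx = set_baggage("tenant.id", "acme-corp")
ctx = set_baggage("feature.flags", "new-checkout-v2", context=ctx)

# Inject both traceparent and baggage headers into the outgoing request
headers = {}
inject(headers, context=ctx)

# In any downstream service: once the incoming context is current (server
# instrumentation such as FastAPIInstrumentor normally extracts and attaches
# it), read baggage without passing anything explicitly
tenant = get_baggage("tenant.id")
if tenant:
    # Copy the value onto the active span so it is searchable in the backend
    trace.get_current_span().set_attribute("tenant.id", tenant)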

Use baggage sparingly. It travels in the baggage header of every outbound request made under that context, including calls to third parties, so large values add measurable overhead at scale and anything in it can leave your network. Good candidates: tenant IDs, A/B test cohort names, feature-flag states. Keep secrets and PII out of baggage entirely.

8. OTel Collector: Why Use It

The Collector is an optional but strongly recommended proxy for production deployments. It decouples your application from the backend: every service exports to a nearby Collector endpoint (localhost:4317 when the Collector runs as a sidecar or host agent), and the Collector handles the rest.

Benefits:

- Fan-out: send the same telemetry to Jaeger, Grafana Tempo, and a cloud vendor simultaneously with no application changes.
- Batching: coalesces small span payloads into larger network flushes, reducing backend ingest pressure.
- Tail-based sampling: keep 100 % of error traces and drop 99 % of healthy traces, a decision that can only be made after a trace completes.
- Attribute redaction: strip credit card numbers and PII before they leave your network.
- Retry: automatically retry failed exports with backoff.
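The tail-based sampling rule in the list above can be expressed in a few lines. This is a stdlib-only sketch of the decision logic under stated assumptions, not the Collector's implementation; in the Collector the policy is configured rather than coded. FinishedSpan and keep_trace are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FinishedSpan:
    """Hypothetical, minimal view of a completed span."""
    trace_id: str  # 32 lowercase hex characters
    is_error: bool


def keep_trace(spans: Sequence[FinishedSpan], healthy_keep_ratio: float = 0.01) -> bool:
    """Decide whether to keep a whole trace once all of its spans have arrived."""
    # Rule 1: any error anywhere in the trace means the trace is kept.
    if any(span.is_error for span in spans):
        return True
    # Rule 2: keep a fixed fraction of healthy traces. Deriving the bucket from
    # the trace_id makes the decision deterministic for a given trace.
    bucket = int(spans[0].trace_id[-8:], 16) / 2**32
    return bucket < healthy_keep_ratio


failed = [
    FinishedSpan("4bf92f3577b34da6a3ce929dffffffff", False),
    FinishedSpan("4bf92f3577b34da6a3ce929dffffffff", True),
]
healthy_high = [FinishedSpan("4bf92f3577b34da6a3ce929dffffffff", False)]
healthy_low = [FinishedSpan("4bf92f3577b34da6a3ce929d00000000", False)]

print(keep_trace(failed), keep_trace(healthy_high), keep_trace(healthy_low))
```

The decision needs every span of the trace, which is why it has to happen in a component that sees completed traces rather than in the SDK at span start.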

Minimal otel-collector-config.yaml routing spans to both Jaeger and Grafana Tempo:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 5s
    send_batch_size: 512
  memory_limiter:
    check_interval: 1s
    limit_mib: 512

exporters:
  otlp/jaeger:
    # Jaeger's OTLP gRPC port; 14250 speaks the legacy Jaeger protocol, not OTLP
    endpoint: jaeger:4317
    tls:
      insecure: true
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp/jaeger, otlp/tempo]

The batch processor is one of the most impactful performance settings in the Collector: it groups many small export requests into fewer, larger ones. Listing memory_limiter first in the pipeline lets the Collector shed load before data is buffered for batching.

9. View Traces in Jaeger UI

Start Jaeger all-in-one for local development:

docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4317:4317 \
  jaegertracing/all-in-one:latest

Port 4317 accepts OTLP gRPC directly (no Collector needed in development). Port 16686 serves the web UI.

Set your exporter endpoint to http://localhost:4317 and send a few requests to your FastAPI service. Open http://localhost:16686, select your service from the dropdown, and click "Find Traces." The waterfall view renders each span as a horizontal bar — width equals duration. Hover a span to inspect its attributes and events.

For the two-service demo (assuming the checkout handler calls call_service_b from section 6): trigger GET /checkout/42. The trace should show a server span from Service A (GET /checkout/{order_id}), a child HTTP client span for the outbound request to http://service-b/process/42, and Service B's spans (GET /process/{order_id} plus any manual spans inside it), all connected under one trace ID.

10. OTel Backends Comparison

One of OTel's core promises is that switching backends requires only changing the exporter configuration. Here is how the common options compare:

Backend	Deployment	Storage	Key Strengths
Jaeger	Self-hosted	Elasticsearch, Cassandra, in-memory	CNCF graduated; native OTLP; clean UI; easiest local setup
Grafana Tempo	Self-hosted or Grafana Cloud	Object storage (S3, GCS, Azure Blob)	Cost-effective at scale; native Grafana integration with Loki and Prometheus
SigNoz	Self-hosted or cloud	ClickHouse	Full-stack observability (traces + metrics + logs) in a single open-source product
Zipkin	Self-hosted	In-memory, Elasticsearch, MySQL	Lightweight; older standard; smaller ecosystem than Jaeger
Datadog / Honeycomb / Lightstep	SaaS	Managed	Rich UI, ML-assisted analysis, alerting; vendor lock-in at the platform level (not the SDK level)

The key insight: because you are using the OTel SDK, choosing a backend is an operational decision, not a code change. You can start with Jaeger locally, graduate to SigNoz for self-hosted production, and add a SaaS vendor as a secondary destination via the Collector — all without modifying a single line of instrumentation code.

11. FAQ

Do I need the Collector in development? No. Point the OTLP exporter directly at Jaeger's port 4317. The Collector adds operational value in production (batching, fan-out, tail sampling) but is unnecessary for local work.

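What is the overhead of OTel instrumentation in production? With BatchSpanProcessor, creating a span is an in-memory operation and export runs on a background thread, so network I/O stays off the request path. The processor's queue is bounded and drops spans when full rather than growing without limit. The actual cost depends on span volume and attribute count, so measure it under your own load. The memory_limiter processor caps the Collector's memory consumption, not your application's. Overall, OTel is widely run in high-throughput production services.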

Does context propagation work with asyncio? Yes. OTel stores the active context in Python's contextvars, which asyncio preserves across await points and copies into each new task. The with tracer.start_as_current_span(...) context manager works correctly inside both sync and async functions.
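The underlying mechanism can be seen with the standard library alone. The sketch below uses a plain ContextVar as a stand-in for OTel's current-span slot (an analogy, not OTel's actual internals); the variable and function names are hypothetical. Each task sets its own value, yields to the event loop, and still reads back its own value afterwards.

```python
import asyncio
import contextvars

# Stand-in for the "current span" slot that a tracing library would keep.
current_request = contextvars.ContextVar("current_request", default=None)


async def handle(request_id: str):
    current_request.set(request_id)
    await asyncio.sleep(0)  # yield so the other tasks run in between
    # Still this task's value: each task works on its own copy of the context.
    return request_id, current_request.get()


async def main():
    # gather() wraps each coroutine in a task, and each task copies the context.
    return await asyncio.gather(*(handle(f"req-{i}") for i in range(3)))


results = asyncio.run(main())
for expected, seen in results:
    print(expected, seen)

# Values set inside the tasks do not leak back to the caller's context.
print(current_request.get())
```

This isolation is what keeps spans from concurrent requests from being parented to each other in an async server.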

How do I instrument a Celery background task? Install opentelemetry-instrumentation-celery and call CeleryInstrumentor().instrument() inside a worker_process_init signal handler, so that instrumentation is set up in each worker process after it forks. It instruments task dispatch and execution and propagates trace context through the task headers, so the worker's span joins the trace started by the producer (e.g., a FastAPI request), preserving end-to-end traceability across the queue boundary.

What is the difference between opentelemetry-api and opentelemetry-sdk? opentelemetry-api is the stable interface your application and libraries import. opentelemetry-sdk is the implementation. If the SDK is not configured, all API calls are no-ops with negligible overhead, so libraries can depend on the API without forcing the SDK on their users.

How do I verify instrumentation is working without a backend? Add a ConsoleSpanExporter during development:

from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))

Every completed span is printed to stdout in JSON format — no Collector or Jaeger required.

OpenTelemetry has become the default instrumentation layer for cloud-native Python applications. The SDK is production-stable, auto-instrumentation covers the common libraries, and the vendor-neutral OTLP protocol ensures you are never locked into a single observability backend. Start with auto-instrumentation and the Jaeger all-in-one container, add manual spans for business-critical code paths, then introduce the Collector when you need batching, sampling, or multi-backend routing.