Grafana Loki Tutorial 2026: Log Aggregation Without the ELK Complexity
The Elastic Stack (Elasticsearch, Logstash, Kibana) has been the default answer to log aggregation for over a decade. It works. It also demands a JVM cluster with 16 GB or more of RAM on day one, a dedicated operations team to tune heap sizes, shard counts, and index lifecycle policies, and an Elastic license that grows expensive as your data does. Grafana Loki offers a different trade-off: it indexes only the metadata labels attached to each log stream, stores the raw log lines compressed in object storage, and offloads all the expensive full-text indexing to query time. The result is a system that runs comfortably on a single node for most small-to-medium workloads and scales horizontally when you need it.
Loki has seen roughly 80% year-over-year growth in adoption through 2025 and into 2026, driven primarily by teams already running Prometheus and Grafana who want a unified observability stack without adding a second complex technology. This tutorial covers everything from first install to production-grade Kubernetes log collection.
TL;DR
- Loki stores logs as compressed chunks in object storage (S3, GCS, local filesystem). It indexes only stream labels, not log content, making storage 10x–100x cheaper than Elasticsearch for the same volume.
- The PLG stack is Promtail (log shipper) + Loki (storage and query engine) + Grafana (visualization). Docker Compose gets you running in under five minutes.
- LogQL is the query language. It looks like PromQL with a log selector prepended: `{app="nginx"} |= "error"`.
- Promtail runs as a DaemonSet on Kubernetes, reading from `/var/log/pods` and auto-discovering pods via the Kubernetes API.
- Loki supports multi-tenancy via the `X-Scope-OrgID` HTTP header, structured log parsing via pipeline stages, and alerting via Grafana alert rules or the built-in ruler component.
- For full-text search at write time, regex-heavy analytics, or long-retention compliance data, Elasticsearch is still the better tool. Loki wins on cost, operational simplicity, and Prometheus-native workflows.
What Is Loki? Loki vs ELK Stack vs Splunk
Loki is an open-source log aggregation system developed by Grafana Labs and released in 2018. Its design philosophy is deliberately borrowed from Prometheus: logs are identified by a set of key-value labels, and the storage backend holds only those labels in an index. Everything else — the raw log line — is stored compressed and unindexed in chunks.
When you query, Loki fetches the chunks that match your label selectors, decompresses them, and then applies filter expressions to find matching lines. This means reads are slightly more expensive than Elasticsearch for highly selective full-text queries, but writes and storage are far cheaper because there is no per-token index to maintain.
Cost and complexity comparison (2026 estimates for 50 GB/day ingest):
| Dimension | Loki | ELK Stack | Splunk Cloud |
|---|---|---|---|
| Minimum RAM (single node) | 2 GB | 16 GB | Managed |
| Storage per GB of logs | ~0.10 GB (compressed, S3) | ~1.5 GB (index + data) | ~0.5 GB |
| License | Apache 2.0 | SSPL (self-managed) | Proprietary |
| Estimated monthly cost (cloud) | $15–50 | $300–800 | $1,500+ |
| Full-text index at write time | No | Yes | Yes |
| Native Prometheus integration | Yes | Via exporter | Via exporter |
| Kubernetes auto-discovery | Via Promtail | Via Filebeat | Via Splunk Connect |
The ELK stack numbers assume a self-managed deployment on EC2 or equivalent with adequate heap memory. Splunk costs are based on their ingest-volume licensing model. Loki costs assume S3 storage at standard pricing and minimal compute for the Loki process itself.
The headline trade-off is this: ELK and Splunk index every token of every log line at ingest time. Queries are fast regardless of log volume because the index narrows results instantly. Loki defers that work to query time. If you need to search for a specific string across all logs without any label filter, Loki will scan every chunk, which is slow and expensive. If you structure your logs with good labels and use label selectors on every query, Loki is extremely fast and uses a fraction of the resources.
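To make that trade-off concrete, here are two ways to hunt for the same error line. Both are illustrative (the labels and search string are invented for this sketch); the first is the anti-pattern, the second is how Loki queries should be written:

```logql
# Anti-pattern: no meaningful label filter, so Loki must fetch and
# decompress every chunk in the time range across all streams.
{job=~".+"} |= "connection refused"

# Idiomatic: label selectors narrow the chunk set first; the line
# filter then scans only the matching streams.
{app="api-server", env="production"} |= "connection refused"
```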
Loki Architecture
Loki can run as a single binary (the default for development) or as individual microservices (recommended for production scale). The key components are:
Distributor — the write path entry point. It receives log streams from shippers like Promtail, validates them, and fans them out to multiple ingesters using consistent hashing on the stream's label fingerprint. It also applies rate limiting and validation rules (max label count, max line size).
Ingester — holds log chunks in memory, grouped by stream. It accumulates lines until the chunk reaches a configured size or age, then flushes compressed chunks to the storage backend. Ingesters maintain a write-ahead log (WAL) so in-flight data survives a restart.
Querier — handles read requests. It queries both the ingesters (for recent data still in memory) and the long-term storage backend (for older chunks). It executes the actual LogQL filter logic after fetching chunks.
Query Frontend — an optional component that sits in front of queriers. It splits long-range queries into smaller shards, caches results, and retries failed sub-queries. It is the main reason large time-range queries are practical in production.
Compactor — runs as a background process to merge many small index files produced by separate ingester flushes into fewer, larger files. It also enforces retention by deleting chunks older than the configured retention period.
Ruler — evaluates recording rules and alerting rules against log data on a schedule, similar to the Prometheus ruler. Required if you want Loki-native alerts rather than Grafana alert rules.
Storage backend — Loki supports local filesystem (monolithic mode only), S3-compatible object stores (Amazon S3, MinIO, DigitalOcean Spaces), Google Cloud Storage, and Azure Blob Storage. The index can use the TSDB index format (recommended since Loki 2.8), BoltDB Shipper (legacy), or Cassandra/Bigtable for very large deployments.
For teams starting out, running Loki in single-binary mode with local filesystem storage is perfectly fine. When you outgrow a single node, switching to object storage requires only a configuration change — the data format is the same.
Install the PLG Stack with Docker Compose
The following Docker Compose file brings up Promtail, Loki, and Grafana on a single host. It creates a named volume for Loki's local storage and mounts /var/log into Promtail for log scraping.
Create a project directory and the required config files:
mkdir -p ~/plg-stack/config && cd ~/plg-stack
config/loki-config.yaml — Loki server configuration:
auth_enabled: false
server:
http_listen_port: 3100
grpc_listen_port: 9096
common:
instance_addr: 127.0.0.1
path_prefix: /loki
storage:
filesystem:
chunks_directory: /loki/chunks
rules_directory: /loki/rules
replication_factor: 1
ring:
kvstore:
store: inmemory
query_range:
results_cache:
cache:
embedded_cache:
enabled: true
max_size_mb: 100
schema_config:
configs:
- from: 2024-01-01
store: tsdb
object_store: filesystem
schema: v13
index:
prefix: index_
period: 24h
ruler:
alertmanager_url: http://localhost:9093
limits_config:
reject_old_samples: true
reject_old_samples_max_age: 168h
ingestion_rate_mb: 16
ingestion_burst_size_mb: 32
analytics:
reporting_enabled: false
config/promtail-config.yaml — Promtail configuration:
server:
http_listen_port: 9080
grpc_listen_port: 0
positions:
filename: /tmp/positions.yaml
clients:
- url: http://loki:3100/loki/api/v1/push
scrape_configs:
- job_name: system
static_configs:
- targets:
- localhost
labels:
job: varlogs
host: docker-host
__path__: /var/log/*.log
- job_name: containers
static_configs:
- targets:
- localhost
labels:
job: containerlogs
__path__: /var/lib/docker/containers/*/*-json.log
pipeline_stages:
- json:
expressions:
log: log
stream: stream
time: time
- labels:
stream:
- output:
source: log
docker-compose.yaml:
version: "3.8"
networks:
plg:
volumes:
loki-data:
grafana-data:
services:
loki:
image: grafana/loki:3.4.2
container_name: loki
ports:
- "3100:3100"
volumes:
- ./config/loki-config.yaml:/etc/loki/local-config.yaml
- loki-data:/loki
command: -config.file=/etc/loki/local-config.yaml
networks:
- plg
restart: unless-stopped
healthcheck:
test: ["CMD", "wget", "--quiet", "--tries=1", "--output-document=-", "http://localhost:3100/ready"]
interval: 10s
timeout: 5s
retries: 5
promtail:
image: grafana/promtail:3.4.2
container_name: promtail
volumes:
- ./config/promtail-config.yaml:/etc/promtail/config.yaml
- /var/log:/var/log:ro
- /var/lib/docker/containers:/var/lib/docker/containers:ro
command: -config.file=/etc/promtail/config.yaml
networks:
- plg
depends_on:
loki:
condition: service_healthy
restart: unless-stopped
grafana:
image: grafana/grafana:11.6.0
container_name: grafana
ports:
- "3000:3000"
volumes:
- grafana-data:/var/lib/grafana
environment:
- GF_SECURITY_ADMIN_USER=admin
- GF_SECURITY_ADMIN_PASSWORD=changeme
- GF_USERS_ALLOW_SIGN_UP=false
networks:
- plg
depends_on:
- loki
restart: unless-stopped
Start the stack:
docker compose up -d
docker compose ps
Loki should report healthy (the other two containers show as running, since only Loki defines a healthcheck) within about 30 seconds. Open Grafana at http://localhost:3000, log in with admin / changeme, navigate to Connections > Data Sources > Add data source, select Loki, and set the URL to http://loki:3100. Save and test. You should see "Data source connected and labels found."
Configure Promtail: Scraping Log Files with Labels
Promtail is the recommended log shipper for Loki, though Loki also accepts logs from Fluentd, Fluentbit, the OpenTelemetry Collector, and the Loki Docker logging driver.
Labels are the most important design decision in a Loki deployment. Because Loki indexes only labels, your ability to narrow queries efficiently depends entirely on choosing labels that are selective enough to be useful but low-cardinality enough that they do not explode the number of streams.
Good label candidates: app, env (production/staging), host, namespace, pod, level (info/warn/error).
Bad label candidates: user_id, request_id, trace_id — these have millions of distinct values and create millions of separate streams, which degrades ingester performance and increases memory usage.
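The arithmetic behind that warning is simple: each unique combination of label values is its own stream, so the stream count is the product of per-label cardinalities. A quick sketch in plain Python (the cardinality numbers are illustrative, not measured):

```python
def stream_count(cardinalities):
    """Stream count = product of distinct values across all labels."""
    total = 1
    for label, distinct_values in cardinalities.items():
        total *= distinct_values
    return total

# Sensible label set: a handful of apps, environments, and levels.
good = {"app": 20, "env": 3, "level": 4}

# Same set plus a per-request ID: every request mints a new stream.
bad = dict(good, request_id=1_000_000)

print(stream_count(good))  # 240 streams, trivial to index
print(stream_count(bad))   # 240000000 streams, a cardinality explosion
```

Keep high-cardinality values like `request_id` inside the log line instead; they remain searchable with filter expressions or the `json` parser at query time.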
A more complete Promtail scrape config for a server running multiple applications:
scrape_configs:
- job_name: nginx
static_configs:
- targets:
- localhost
labels:
app: nginx
env: production
host: web-01
__path__: /var/log/nginx/access.log
pipeline_stages:
- regex:
expression: '^(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] "(?P<request>[^"]+)" (?P<status>\d+) (?P<body_bytes_sent>\d+)'
- labels:
status:
- metrics:
http_requests_total:
type: Counter
description: "Total HTTP requests seen in nginx access log"
source: status
config:
action: inc
- job_name: app-json
static_configs:
- targets:
- localhost
labels:
app: myservice
env: production
__path__: /var/log/myservice/*.log
pipeline_stages:
- json:
expressions:
level: level
msg: message
ts: timestamp
- labels:
level:
- timestamp:
source: ts
format: RFC3339Nano
The pipeline_stages block processes each log line before it is sent to Loki. You can extract fields from JSON or regex, promote extracted fields to labels, rewrite the log line, and even drop lines that match certain conditions.
LogQL Basics: Stream Selectors and Filter Expressions
LogQL is Loki's query language. Every query starts with a log stream selector — a set of label matchers enclosed in curly braces — followed by optional pipeline expressions.
Label matchers use four operators:
# Exact match
{app="nginx"}
# Regex match
{app=~"nginx|apache"}
# Negative exact match
{app!="nginx"}
# Negative regex
{app!~"nginx|apache"}
Filter expressions narrow results after the stream is selected:
# Lines containing the string "error"
{app="nginx"} |= "error"
# Lines NOT containing "healthcheck"
{app="nginx"} != "healthcheck"
# Lines matching a regex
{app="nginx"} |~ "5[0-9]{2}"
# Lines NOT matching a regex
{app="nginx"} !~ "2[0-9]{2}"
Pattern matching — Loki's | pattern expression uses a simplified pattern syntax that is faster than regex for common log formats:
{app="nginx"} | pattern `<ip> - - [<_>] "<method> <path> <_>" <status> <size>`
| status >= 500
Parser expressions — extract structured fields and make them available for filtering and formatting:
# Parse JSON log lines
{app="myservice"} | json | level="error"
# Parse logfmt (key=value format)
{app="myservice"} | logfmt | duration > 500ms
# Regex extraction
{app="nginx"} | regexp `(?P<status>\d{3}) (?P<bytes>\d+)$`
Line format — rewrite the displayed log line using extracted fields:
{app="myservice"} | json | line_format "{{.level}} {{.message}}"
Multiple pipeline stages chain with the pipe character. The query is evaluated left to right, so place the most selective filters first to minimize the data processed by later stages.
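For example, the cheap line filter should run before the relatively expensive parser stage whenever the ordering does not change the result. Both queries below return the same lines (labels and field names are illustrative), but the first does far less work:

```logql
# Faster: |= discards non-matching lines before JSON parsing runs.
{app="myservice"} |= "card_declined" | json | level="error"

# Slower: every line is parsed as JSON, then most are thrown away.
{app="myservice"} | json | level="error" |= "card_declined"
```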
LogQL Metric Queries
Metric queries wrap a log pipeline in an aggregation function to produce time-series data, exactly like PromQL. This is what lets you build graphs and alerts from log data.
count_over_time — count log lines in a time range:
# Request rate per second, averaged over the last 5 minutes
rate({app="nginx"}[5m])
# Total errors in the last hour
count_over_time({app="nginx"} |= "error" [1h])
# Error rate as a fraction of total requests
sum(rate({app="nginx"} |= "error" [5m]))
/
sum(rate({app="nginx"}[5m]))
bytes_over_time and bytes_rate — measure log volume:
# Log volume in bytes per second
bytes_rate({app="nginx"}[5m])
# Total log volume per app over the last day
sum by (app) (bytes_over_time({app=~".+"}[24h]))
avg_over_time on extracted numeric fields:
# Average response time from JSON logs
avg_over_time(
{app="myservice"} | json | unwrap duration_ms [5m]
)
# 99th percentile latency
quantile_over_time(0.99,
{app="myservice"} | json | unwrap duration_ms [5m]
)
The unwrap expression extracts a numeric value from a parsed label, converting it to a metric. Without unwrap, metric aggregations count or size log lines; with unwrap, they aggregate numeric values extracted from the log content.
Aggregations mirror PromQL:
# Error count by application and environment
sum by (app, env) (
count_over_time({env="production"} |= "error" [5m])
)
# Top 10 apps by log volume
topk(10, sum by (app) (bytes_rate({env="production"}[5m])))
Grafana Dashboards: Log Panels and Derived Fields
Once Loki is configured as a data source in Grafana, you can build dashboards that mix log panels with metric graphs.
Adding a Logs panel:
- Create a new dashboard, add a panel, and select Logs as the visualization type.
- Set the data source to Loki.
- Enter a LogQL query such as `{app="nginx", env="production"} |= "error"`.
- Enable Deduplication if your logs contain repeated identical lines.
- Enable Show time and Wrap lines in the panel options.
Variables for dynamic dashboards — add a dashboard variable to let users filter by application:
- Variable type: Query
- Data source: Loki
- Query: `label_values(app)` — returns all distinct values of the `app` label.
- Use `$app` in your panel queries: `{app="$app", env="$env"}`.
Derived fields link log lines to traces in Tempo or Jaeger. In the Loki data source settings, add a derived field:
- Name: `TraceID`
- Regex: `traceID=(\w+)` (or `"trace_id":"(\w+)"` for JSON logs)
- URL: `http://tempo:3200/jaeger/api/traces/${__value.raw}` (for Tempo)
- Internal link: enabled, pointing to your Tempo data source
When a log line contains a trace ID matching the regex, Grafana renders it as a clickable link that jumps directly to the corresponding trace in Tempo. This is the core of the Grafana observability loop: logs link to traces, traces link back to metrics.
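If you provision Grafana from files rather than the UI, the same derived field can be declared in a data source provisioning file. A minimal sketch, assuming a Tempo data source with UID `tempo` and the plain-text `traceID=` log format from above (note the `$$` escape, since Grafana expands `$VAR` in provisioning files):

```yaml
# /etc/grafana/provisioning/datasources/loki.yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    jsonData:
      derivedFields:
        # Internal link: the captured trace ID becomes the Tempo query
        - name: TraceID
          matcherRegex: "traceID=(\\w+)"
          url: "$${__value.raw}"
          datasourceUid: tempo
```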
Kubernetes Log Collection: Promtail DaemonSet with Helm
On Kubernetes, every pod writes its logs to /var/log/pods/<namespace>_<pod>_<uid>/<container>/<n>.log on the node's filesystem. Promtail runs as a DaemonSet — one pod per node — to read those files and ship them to Loki with automatically discovered labels.
Install with Helm:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Install Loki in simple-scalable mode
helm install loki grafana/loki \
--namespace monitoring \
--create-namespace \
--set loki.auth_enabled=false \
--set loki.commonConfig.replication_factor=1 \
--set loki.storage.type=filesystem \
--set singleBinary.replicas=1
# Install Promtail pointing at the Loki service
helm install promtail grafana/promtail \
--namespace monitoring \
--set config.clients[0].url=http://loki-gateway.monitoring.svc.cluster.local/loki/api/v1/push
Custom values.yaml for Promtail to add extra labels and drop noisy system logs:
config:
clients:
- url: http://loki-gateway.monitoring.svc.cluster.local/loki/api/v1/push
snippets:
extraScrapeConfigs: |
- job_name: kubernetes-pods
kubernetes_sd_configs:
- role: pod
pipeline_stages:
- cri: {}
- json:
expressions:
level: level
msg: message
- labels:
level:
- drop:
expression: ".*kube-probe.*"
drop_counter_reason: healthcheck
relabel_configs:
- source_labels: [__meta_kubernetes_pod_label_app]
target_label: app
- source_labels: [__meta_kubernetes_namespace]
target_label: namespace
- source_labels: [__meta_kubernetes_pod_name]
target_label: pod
- source_labels: [__meta_kubernetes_pod_container_name]
target_label: container
- source_labels: [__meta_kubernetes_pod_node_name]
target_label: node
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
- key: node-role.kubernetes.io/control-plane
operator: Exists
effect: NoSchedule
Apply the custom values:
helm upgrade promtail grafana/promtail \
--namespace monitoring \
-f values-promtail.yaml
With this configuration, every pod in the cluster is automatically scraped. Labels app, namespace, pod, container, and node are available as Loki stream labels, letting you query {namespace="production", app="api-server"} |= "panic" without any manual configuration per application.
Structured Logging: Parsing JSON Logs with Pipeline Stages
Modern applications typically emit JSON-structured logs. Loki's pipeline stages let you parse that structure and extract fields for use as labels, filters, and metrics.
Consider an application emitting logs like this:
{"timestamp":"2026-05-14T10:23:41Z","level":"error","service":"payment","traceId":"4bf92f3577b34da6","message":"charge failed","user_id":42891,"amount":99.99,"error":"card_declined"}
A Promtail pipeline to handle this log format:
pipeline_stages:
# Parse the entire line as JSON
- json:
expressions:
level: level
service: service
trace_id: traceId
message: message
error_type: error
# Promote safe low-cardinality fields to labels
- labels:
level:
service:
# Drop debug logs in production to reduce volume
- drop:
source: level
expression: "debug"
drop_counter_reason: debug_dropped
# Rewrite the log line to a cleaner format
- output:
source: message
# Extract timestamp from the log line itself
- timestamp:
source: timestamp
format: RFC3339
After this pipeline runs, each log line has level and service as indexed labels. The message field becomes the displayed log line. The trace_id is available as an extracted field for derived field linking but is not promoted to a label (avoiding high cardinality).
You can then query:
{service="payment", level="error"} | json | error_type="card_declined"
The json parser in the query re-parses the stored log line, so you can filter on any field even if it was not promoted to a label at ingest time. The label filter service="payment" narrows the chunks to fetch; the json | error_type="card_declined" filter runs against the decompressed lines.
Multi-Tenancy with Loki
Loki supports multiple tenants through HTTP header-based isolation. When auth_enabled: true is set in the Loki configuration, every API request must include an X-Scope-OrgID header. Requests with different org IDs are stored and queried in completely separate namespaces — one tenant cannot see another's data.
Enable multi-tenancy in loki-config.yaml:
auth_enabled: true
Configure Promtail to send a tenant ID:
clients:
- url: http://loki:3100/loki/api/v1/push
tenant_id: team-alpha
For Kubernetes deployments where different namespaces belong to different teams, you can use Promtail's relabeling to derive the tenant ID from the Kubernetes namespace label:
clients:
- url: http://loki:3100/loki/api/v1/push
tenant_id: fake # default, overridden by relabeling
scrape_configs:
- job_name: kubernetes-pods
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_namespace]
target_label: __tenant_id__
When querying in Grafana, set the X-Scope-OrgID header in the Loki data source configuration under HTTP Headers. You can provision separate Grafana data sources pointing at the same Loki instance with different tenant IDs, then use Grafana's team/folder permissions to control which teams can see which data source.
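As a sketch, provisioning two data sources against the same Loki instance with different tenant headers could look like this (the data source names and tenant IDs are illustrative; `httpHeaderName1`/`httpHeaderValue1` is Grafana's standard custom-header provisioning mechanism):

```yaml
apiVersion: 1
datasources:
  - name: Loki (team-alpha)
    type: loki
    access: proxy
    url: http://loki:3100
    jsonData:
      httpHeaderName1: X-Scope-OrgID
    secureJsonData:
      httpHeaderValue1: team-alpha
  - name: Loki (team-beta)
    type: loki
    access: proxy
    url: http://loki:3100
    jsonData:
      httpHeaderName1: X-Scope-OrgID
    secureJsonData:
      httpHeaderValue1: team-beta
```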
Alerting: Grafana Alert Rules on Log Patterns
There are two ways to alert on Loki data: Grafana alert rules (evaluated by the Grafana server) and Loki ruler rules (evaluated by Loki itself, similar to Prometheus recording and alerting rules). Both work; Grafana alert rules are easier to configure through the UI.
Grafana alert rule for high error rate:
- Navigate to Alerting > Alert Rules > New alert rule.
- Set the data source to Loki.
- Enter a metric query:
sum(rate({app="api-server", env="production"} |= "error" [5m])) > 0.1
- Set the condition: IS ABOVE threshold `0.1` (errors per second).
- Set evaluation interval to `1m`, pending period to `5m` (fires only after the condition holds for 5 minutes to reduce noise).
- Add labels: `severity=critical`, `team=backend`.
- Add a contact point (Slack, PagerDuty, email) in Alerting > Contact Points.
Loki ruler alerting rules (for alerts evaluated independently of Grafana):
# config/loki-rules.yaml
groups:
- name: log-alerts
interval: 1m
rules:
- alert: HighErrorRate
expr: |
sum(rate({app="api-server", env="production"} |= "error" [5m])) > 0.1
for: 5m
labels:
severity: critical
annotations:
summary: "High error rate in api-server"
description: "Error rate is {{ $value | printf \"%.2f\" }} errors/sec"
- alert: PanicDetected
expr: |
count_over_time({app=~".+", env="production"} |= "panic" [1m]) > 0
for: 0m
labels:
severity: page
annotations:
summary: "Panic detected in {{ $labels.app }}"
Place this file in the Loki rules directory (configured as /loki/rules in the example above) under a tenant subdirectory: /loki/rules/fake/loki-rules.yaml. Configure the ruler in loki-config.yaml:
ruler:
storage:
type: local
local:
directory: /loki/rules
rule_path: /tmp/loki-rules
alertmanager_url: http://alertmanager:9093
ring:
kvstore:
store: inmemory
enable_api: true
Loki vs Elasticsearch: When to Choose Each
Choose Loki when:
- Your team already runs Prometheus and Grafana. The label-based mental model is identical, and you avoid adding a second observability paradigm.
- Your log volume is growing fast and storage cost is a concern. Loki's object storage backend with compression typically achieves 10x–20x better storage efficiency than Elasticsearch.
- You run Kubernetes and want zero-configuration per-pod log collection with automatic label discovery.
- Your log queries are primarily filtered by application, environment, and severity — the kind of queries labels handle efficiently.
- You want a simple operational footprint. A single Loki binary can replace an entire Elasticsearch cluster for many workloads.
Choose Elasticsearch when:
- You need full-text search across log content without any label pre-filtering. Searching for an arbitrary string across all logs is fast in Elasticsearch because every token is indexed; in Loki it requires a full chunk scan.
- You need complex aggregations at query time (cardinality, percentiles on arbitrary fields) without schema pre-planning.
- You are ingesting logs from sources where you cannot add structured labels at collection time (legacy syslog, third-party appliances).
- Your compliance requirements mandate specific data retention, audit, and chain-of-custody features that commercial Elasticsearch distributions provide.
- Your team has existing Kibana dashboards and operational expertise that would be expensive to rebuild.
A common pattern in 2026 is running both: Loki for application logs (high-volume, label-friendly, cost-sensitive) and Elasticsearch for security event logs and audit trails (lower volume, full-text search required).
Retention and Storage: Object Storage Configuration
For production deployments, local filesystem storage is not appropriate — it cannot be shared across multiple Loki instances and is not durable without additional backup infrastructure. Switch to S3 or GCS.
S3 configuration in loki-config.yaml:
common:
storage:
s3:
endpoint: s3.amazonaws.com
region: us-east-1
bucketnames: my-loki-chunks
access_key_id: ${AWS_ACCESS_KEY_ID}
secret_access_key: ${AWS_SECRET_ACCESS_KEY}
s3forcepathstyle: false
schema_config:
configs:
- from: 2024-01-01
store: tsdb
object_store: s3
schema: v13
index:
prefix: loki_index_
period: 24h
storage_config:
tsdb_shipper:
active_index_directory: /loki/index
cache_location: /loki/index_cache
aws:
s3: s3://my-loki-chunks
region: us-east-1
compactor:
working_directory: /loki/compactor
compaction_interval: 10m
retention_enabled: true
retention_delete_delay: 2h
retention_delete_worker_count: 150
limits_config:
retention_period: 744h # 31 days
For per-tenant retention (available when auth_enabled: true):
limits_config:
retention_period: 744h # default for all tenants
# Override per tenant in the Loki runtime config
per_tenant_override_config: /etc/loki/runtime-config.yaml
runtime-config.yaml (hot-reloaded without restart):
overrides:
team-alpha:
retention_period: 2160h # 90 days for compliance team
team-beta:
retention_period: 168h # 7 days for development team
ingestion_rate_mb: 8
GCS configuration:
common:
storage:
gcs:
bucket_name: my-loki-bucket
storage_config:
gcs:
bucket_name: my-loki-bucket
For GCS, authenticate via Workload Identity on GKE or set the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing to a service account key file.
MinIO (self-hosted S3-compatible) is a good option for on-premises deployments or local testing with an S3-compatible API:
docker run -d --name minio \
-p 9000:9000 -p 9001:9001 \
-e MINIO_ROOT_USER=minioadmin \
-e MINIO_ROOT_PASSWORD=minioadmin \
minio/minio server /data --console-address ":9001"
Then set endpoint: minio:9000 and s3forcepathstyle: true in the Loki S3 config.
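Putting those two settings into the common storage block, a minimal sketch (the credentials match the docker run above; the bucket name is an assumption, and the bucket must be created in MinIO first):

```yaml
common:
  storage:
    s3:
      endpoint: minio:9000
      bucketnames: loki-chunks
      access_key_id: minioadmin
      secret_access_key: minioadmin
      insecure: true          # MinIO here is plain HTTP, no TLS
      s3forcepathstyle: true  # MinIO uses path-style URLs, not virtual-host style
```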
FAQ
Q: Can I migrate existing Elasticsearch or Splunk data to Loki? A: Not directly — the index formats are entirely different. For historical data, the common approach is to set both systems to run in parallel during a transition period (typically 30–90 days matching your retention window), then decommission the old system once the Loki retention covers the required history. There is no official migration tool.
Q: How does Loki handle log lines out of order? A: Since Loki 2.4, out-of-order writes within a stream are accepted as long as they fall inside the ingester's acceptance window (half of max_chunk_age, which defaults to 2 hours). Beyond that, samples older than reject_old_samples_max_age are rejected when reject_old_samples: true is set. Raise reject_old_samples_max_age if your applications have long delays between log generation and shipping.
Q: What is the maximum log line size? A: The default maximum line size is 256 KB, configurable via max_line_size in limits_config. Lines exceeding this limit are rejected at the distributor (or truncated instead, if max_line_size_truncate is enabled).
Q: Can I use Loki without Promtail? A: Yes. Loki exposes a standard HTTP push API (/loki/api/v1/push) and is compatible with Fluentd (via the fluent-plugin-grafana-loki plugin), Fluent Bit (native output plugin), the OpenTelemetry Collector (otlp_http or otlp_grpc exporters), Vector, Logstash (via logstash-output-loki), and the Docker logging driver (loki log driver). Many teams use Fluent Bit on Kubernetes for lower resource usage than Promtail.
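As an illustration of the push API's wire format — each stream is a label set plus a list of [nanosecond-timestamp, line] pairs — the sketch below builds a payload in plain Python. The labels are illustrative, and the actual POST (commented out, assuming the Compose stack from earlier) goes to /loki/api/v1/push with Content-Type: application/json:

```python
import json
import time

def build_push_payload(labels, lines):
    """Build a Loki push API body: one stream with the given label set."""
    now_ns = time.time_ns()
    return {
        "streams": [
            {
                "stream": labels,  # the indexed label set for this stream
                "values": [
                    # Timestamps are strings of Unix nanoseconds, in order.
                    [str(now_ns + i), line] for i, line in enumerate(lines)
                ],
            }
        ]
    }

payload = build_push_payload(
    {"app": "adhoc", "env": "dev"},
    ["starting up", "listening on :8080"],
)
body = json.dumps(payload)

# To actually send (assumes the Loki container from the Compose stack):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:3100/loki/api/v1/push",
#     data=body.encode(),
#     headers={"Content-Type": "application/json"},
# )
# urllib.request.urlopen(req)
```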
Q: How do I debug missing logs in Loki? A: Check the Promtail /metrics endpoint (port 9080 by default) for promtail_sent_entries_total and promtail_dropped_entries_total. Check /targets to verify that Promtail discovered your log files. Check Loki's /metrics for loki_distributor_bytes_received_total. Check Grafana's Explore view with a very broad selector ({job=~".+"}) to confirm data is arriving. The Promtail positions.yaml file tracks read offsets — deleting it will re-read all log files from the beginning.
Q: Is Loki production-ready in 2026? A: Yes. Grafana Labs runs Loki at petabyte scale internally, and large companies including Adobe, Deutsche Telekom, and Snyk run Loki in production. The TSDB index format (stable since Loki 2.8) and the microservices deployment mode are both mature. The main operational risk is cardinality explosion from poorly chosen labels — address this by monitoring the number of active streams regularly, for example with count(count_over_time({job=~".+"}[1h])) in Explore or via the /loki/api/v1/series endpoint.
Sources
- Grafana Loki documentation: https://grafana.com/docs/loki/latest/
- Loki architecture overview: https://grafana.com/docs/loki/latest/get-started/architecture/
- LogQL documentation: https://grafana.com/docs/loki/latest/query/
- Promtail pipeline stages: https://grafana.com/docs/loki/latest/send-data/promtail/stages/
- Grafana Loki Helm chart: https://grafana.github.io/helm-charts
- Loki storage configuration: https://grafana.com/docs/loki/latest/configure/storage/
- Loki multi-tenancy: https://grafana.com/docs/loki/latest/operations/multi-tenancy/
- Grafana alerting with Loki: https://grafana.com/docs/grafana/latest/alerting/
- Loki release notes v3.x: https://github.com/grafana/loki/releases
- Comparison: Loki vs Elasticsearch for log aggregation — Grafana blog, 2025: https://grafana.com/blog/