
How to Reduce Docker Image Size: Multi-Stage Builds and Best Practices (2026)

Last updated: March 2026

Large Docker images slow CI/CD pipelines, consume registry storage, and increase deploy times. The most impactful technique is the multi-stage build, which discards build tools from the final image. Combined with Alpine base images, .dockerignore, and RUN command chaining, it is routine to shrink a 1+ GB image down to under 100 MB. This guide shows every technique with real before/after comparisons.


Why Image Size Matters

  • Pull time: A 1.2 GB image takes ~60 seconds to pull on a 150 Mbps connection. A 120 MB image takes ~6 seconds.
  • Storage costs: Registry storage and egress fees add up at scale.
  • Security surface: Fewer packages mean fewer CVEs to patch.
  • CI/CD speed: Smaller images build and push faster, shortening feedback loops.
  • Kubernetes scheduling: Large images slow pod startup during autoscaling events.
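The pull-time figures above are simple bandwidth arithmetic; a quick shell sketch using the numbers from the first bullet (this ignores decompression and registry latency, which add overhead in practice):

```shell
# time ≈ size in megabits / link speed in Mbps (1 byte = 8 bits)
echo $(( 1200 * 8 / 150 ))  # 1.2 GB over 150 Mbps: 64 seconds
echo $(( 120 * 8 / 150 ))   # 120 MB over 150 Mbps: ~6 seconds
```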

Before: A Bloated Node.js Dockerfile

This is what many projects start with:

FROM node:20
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
EXPOSE 3000
CMD ["node", "dist/server.js"]
docker build -t myapp:bloated .
docker images myapp:bloated
# myapp   bloated   sha256:...   1.24GB

The image is 1.24 GB because:

  • The node:20 base image includes a full Debian distribution (~1 GB)
  • All of node_modules (including dev dependencies) is included
  • Build tools are bundled into the final image
  • Source TypeScript files and test files are copied in


Technique 1: Multi-Stage Builds (The Builder Pattern)

Multi-stage builds use multiple FROM statements in one Dockerfile. Each stage can copy files from previous stages, but only the final stage becomes the image. Build tools and intermediate artifacts are discarded.

# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies (dev included) needed for the build
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: Production image
FROM node:20-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
EXPOSE 3000
USER node
CMD ["node", "dist/server.js"]
docker build -t myapp:multistage .
docker images myapp:multistage
# myapp   multistage   sha256:...   182MB

The final image dropped from 1.24 GB to 182 MB — an 85% reduction. The builder stage with TypeScript compiler and dev dependencies never ships to production.

Multi-stage build for Go (produces a single static binary):

# Stage 1: Build
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o server .

# Stage 2: Minimal runtime
FROM scratch
COPY --from=builder /app/server /server
EXPOSE 8080
CMD ["/server"]
# Go app from scratch — no OS at all
docker images mygoapp
# mygoapp   latest   sha256:...   12.3MB

The scratch base image is completely empty — just the static binary. This is only possible for languages that compile to self-contained binaries.
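One caveat: because scratch contains nothing at all, a binary that makes outbound TLS calls will fail certificate verification. A common workaround, sketched here as a variation on the builder stage above, is to copy the CA bundle in from the builder:

```dockerfile
# Stage 1: Build (with CA certificates installed)
FROM golang:1.22-alpine AS builder
RUN apk add --no-cache ca-certificates
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o server .

# Stage 2: scratch plus the CA bundle for TLS verification
FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/server /server
EXPOSE 8080
CMD ["/server"]
```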

Multi-stage build for Python:

# Stage 1: Build dependencies
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 2: Runtime
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
EXPOSE 8000
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:application"]
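The --user install works, but everything then lives under /root and the container runs as root. A common alternative (a sketch, not the only way) is a virtualenv copied between stages, which relocates cleanly and pairs well with a non-root user:

```dockerfile
# Stage 1: Install dependencies into a virtualenv
FROM python:3.12-slim AS builder
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Copy the whole virtualenv into the runtime image
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY . .
EXPOSE 8000
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:application"]
```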

Technique 2: Choose the Right Base Image

Base image choice is the single biggest driver of image size. Compare common options:

Base Image               Size     Use Case
ubuntu:24.04             78 MB    General purpose, familiar tools
debian:bookworm-slim     74 MB    Debian without extras
python:3.12              1.02 GB  Python (Debian-based, includes build tools)
python:3.12-slim         131 MB   Python without build extras
python:3.12-alpine       57 MB    Python on Alpine Linux
node:20                  1.1 GB   Node.js (full Debian)
node:20-slim             240 MB   Node.js slim
node:20-alpine           135 MB   Node.js on Alpine
golang:1.22-alpine       236 MB   Go build stage
scratch                  0 B      Empty — for static binaries only
gcr.io/distroless/base   20 MB    Google distroless — no shell, no package manager

Alpine Linux is based on musl libc and busybox. It is dramatically smaller than Debian/Ubuntu based images but occasionally has compatibility issues with packages that assume glibc. Test Alpine thoroughly before adopting it in production.

Distroless images (from Google) contain only the runtime and its dependencies — no shell, no package manager, no utilities. They are among the most secure options, but interactive debugging is harder: there is no shell to docker exec into. The :debug image variants add a busybox shell for troubleshooting.
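For interpreted runtimes, Google also publishes language-specific distroless images. A hedged sketch of the earlier Node.js build on distroless — the image name follows the distroless naming convention, so verify the current tag before adopting it:

```dockerfile
# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: Distroless Node runtime. node is the entrypoint,
# so CMD is just the script path.
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
EXPOSE 3000
CMD ["dist/server.js"]
```

Note that this sketch copies node_modules as built; in practice you would prune dev dependencies first, as in the two-install pattern shown earlier.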


Technique 3: Create a .dockerignore File

Without .dockerignore, the entire project directory is sent to the Docker daemon as the build context, and COPY . . copies all of it into the image — including files the application does not need, which slow the build and inflate the image.

Create .dockerignore in the same directory as your Dockerfile:

# Node.js
node_modules
npm-debug.log
.npm

# Python
__pycache__
*.pyc
*.pyo
.venv
venv

# Version control
.git
.gitignore

# Documentation
*.md
docs/

# Tests
test/
tests/
*.test.ts
*.spec.ts
coverage/

# CI/CD
.github/
.gitlab-ci.yml
Jenkinsfile

# OS files
.DS_Store
Thumbs.db

# Editor files
.vscode/
.idea/
*.swp
*.swo

# Build output (if building outside Docker)
dist/
build/

Check what is being sent to the build context:

# The first lines of BuildKit output report the context transfer size
docker build --no-cache --progress=plain . 2>&1 | head -20
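The build context is essentially a tar stream sent to the daemon, so you can also estimate its size without Docker at all. A rough sketch that mimics two .dockerignore entries with tar's --exclude:

```shell
# Approximate context size in bytes: everything vs. with exclusions
tar -cf - . | wc -c
tar -cf - --exclude='./node_modules' --exclude='./.git' . | wc -c
```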

Technique 4: Chain RUN Commands to Reduce Layers

Each RUN, COPY, and ADD instruction creates a new layer. Layers accumulate even if you delete files in a later RUN — the deleted data still exists in the earlier layer.

Bad — three layers, deleted files still stored:

RUN apt-get update
RUN apt-get install -y curl wget build-essential
RUN apt-get remove -y build-essential && apt-get autoremove -y

Good — one layer, cleanup happens in the same layer:

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        curl \
        wget \
        build-essential \
    && apt-get remove -y build-essential \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/*

The --no-install-recommends flag prevents apt from installing optional packages. The rm -rf /var/lib/apt/lists/* removes the package index cache (which can be 50-100 MB).
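On Alpine-based images the same idea is simpler, since apk has a flag that skips writing the index cache entirely (a short sketch; the package names are illustrative):

```dockerfile
# --no-cache fetches the index in memory and writes nothing to /var/cache/apk
RUN apk add --no-cache curl wget
```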


Technique 5: Use BuildKit and Layer Caching

BuildKit has been the default builder since Docker Engine 23.0. On older versions, enable it explicitly for faster builds with better cache handling:

export DOCKER_BUILDKIT=1
docker build -t myapp .

Or set it permanently in daemon.json:

{
  "features": {"buildkit": true}
}

Structure your Dockerfile to maximize cache hits. Dependencies change less often than source code, so copy and install dependencies before copying source:

# Good: dependencies cached separately from source code
FROM node:20-alpine
WORKDIR /app
# Only invalidated when the package files change
COPY package*.json ./
# Cached unless the package files changed
RUN npm ci --omit=dev
# Invalidated when any source file changes
COPY . .
CMD ["node", "server.js"]
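BuildKit also supports cache mounts, which persist a package manager's download cache across builds without it ever entering an image layer. A sketch for npm (requires the dockerfile:1 syntax directive on the first line):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# /root/.npm persists across builds but is excluded from the image
RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]
```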

Technique 6: Analyze Images with dive

dive is a terminal UI for exploring Docker image layers and identifying what is consuming space.

Install:

# Ubuntu/Debian
wget https://github.com/wagoodman/dive/releases/latest/download/dive_linux_amd64.deb
sudo dpkg -i dive_linux_amd64.deb

# macOS
brew install dive

Analyze an image:

dive myapp:latest

dive shows a layer-by-layer breakdown with file changes, sizes, and an efficiency score. It highlights files that could be removed and layers that cancel out each other's changes.

Use it in CI to fail builds that exceed a size threshold:

CI=true dive myapp:latest --highestUserWastedPercent 0.10
# Fails if more than 10% of image size is wasted
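If you prefer a plain size gate without dive, the raw byte count is available from docker image inspect --format '{{.Size}}'. A minimal sketch of the comparison logic — the 200 MB threshold is an arbitrary example:

```shell
# Fail if an image size (bytes) exceeds a limit (MB)
check_size() {
  local bytes=$1 limit_mb=$2
  [ "$bytes" -le $(( limit_mb * 1024 * 1024 )) ]
}

# In CI: check_size "$(docker image inspect myapp:latest --format '{{.Size}}')" 200
check_size 190000000 200 && echo "within limit"
```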

Before/After Size Comparison

Scenario            Before    After    Reduction
Node.js web app     1.24 GB   182 MB   85%
Python Flask API    1.1 GB    118 MB   89%
Go microservice     1.05 GB   12 MB    99%
Java Spring Boot    680 MB    200 MB   71%

The Go result (using scratch) is extreme but realistic for statically linked Go binaries. Java images shrink further with jlink (a trimmed custom JRE) or GraalVM native-image compilation.


Quick Wins Summary

# 1. Check current image size
docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}"

# 2. Build with a specific target stage (useful for debugging)
docker build --target builder -t myapp:debug .

# 3. Inspect layers
docker history myapp:latest

# 4. Inspect with dive
dive myapp:latest

# 5. Remove dangling images (untagged build intermediaries)
docker image prune -f

# 6. Remove all unused images
docker image prune -a


FAQ

Q: Should I always use Alpine base images? Alpine is a good default for new projects, but test it thoroughly. Some Python packages with C extensions compile differently against musl libc vs glibc, causing runtime errors that are not caught at build time. The slim variants of official images (e.g., python:3.12-slim, node:20-slim) are a safer middle ground — they are Debian-based so compatibility is not an issue, but they remove build tools and unnecessary packages.

Q: Multi-stage builds make my Dockerfile longer. Is it worth it? Yes, almost always. An 85% reduction in image size means faster CI pipelines, lower registry costs, reduced pull times, and a smaller security attack surface. The Dockerfile is longer but each stage is easy to understand and test independently using --target.

Q: How do I share files between multiple build stages efficiently? Use COPY --from=<stage> to copy specific files or directories. You can also name an intermediate stage and reference it from multiple downstream stages. For build artifacts that are reused across many projects, consider creating a separate base image with your common dependencies pre-installed and push it to your private registry.
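The named-stage pattern described in this answer can be sketched as a Dockerfile where a shared deps stage feeds test and build stages, each selectable with --target (stage names here are illustrative):

```dockerfile
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

# Both stages below start from deps, so dependencies install only once
FROM deps AS test
COPY . .
RUN npm test

FROM deps AS build
COPY . .
RUN npm run build

FROM node:20-alpine AS production
WORKDIR /app
COPY --from=build /app/dist ./dist
CMD ["node", "dist/server.js"]
```

Running docker build --target test . exercises the test stage in CI, while the default build produces only the production stage.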