
Root Cause Analysis: Why Docker Fails with 'No Space Left on Device'

Quick Fix Summary

TL;DR

Run `docker system prune -a --volumes` to remove stopped containers, unused networks, all unused images, build cache, and unused anonymous volumes.

Docker's 'No Space Left on Device' error occurs when the underlying storage driver (typically overlay2) exhausts allocated disk space or inodes. This is often caused by accumulated container layers, dangling volumes, or improper storage pool configuration.
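
Before digging in, it helps to confirm which storage driver and data root the daemon is actually using; `docker info` exposes both (a quick check, assuming a default installation):

bash
docker info --format 'driver={{.Driver}}  data-root={{.DockerRootDir}}'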

Diagnosis & Causes

  • Thin pool exhaustion with the devicemapper storage driver, or a full backing filesystem with overlay2.
  • Inode exhaustion from excessive container layer creation.
  • Uncleaned dangling volumes and orphaned container layers.
  • Log file accumulation within containers filling host storage (see the check after this list).
  • Default Docker storage location on small root partition.
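
Container log growth in particular is easy to confirm, because the default json-file log driver writes one log per container under the data root. A quick check, assuming the default log driver and data root:

bash
# Largest per-container log files written by the json-file log driver
sudo sh -c 'du -h /var/lib/docker/containers/*/*-json.log | sort -hr | head -10'
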
Recovery Steps


    Step 1: Immediate Storage Diagnostics

    First, identify what's consuming space: Docker objects vs. host filesystem vs. inodes.

    bash
    docker system df          # Docker's own accounting: images, containers, local volumes, build cache
    df -h /var/lib/docker     # block-space usage on the filesystem backing the data root
    df -i /var/lib/docker     # inode usage, which df -h does not report
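
    If the totals alone don't point at a culprit, the verbose flag breaks usage down per image, container, and volume:

    bash
    docker system df -v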

    Step 2: Aggressive Docker System Cleanup

    Remove all unused data: stopped containers, unused networks, build cache, and images. The `-a` flag removes every unused image rather than only dangling ones, and `--volumes` adds unused anonymous volumes.

    bash
    docker system prune -a --volumes --force
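
    If wiping everything unused is too aggressive for a shared host, the image and build caches accept an age filter, which makes for a gentler first pass (the 72h cutoff here is illustrative):

    bash
    # Only prune images and build cache older than three days
    docker image prune -a --filter "until=72h" --force
    docker builder prune --filter "until=72h" --force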

    Step 3: Target Specific High-Consumption Objects

    If prune isn't enough, manually remove large containers, images, or volumes identified in Step 1.

    bash
    # Largest images first (drop "table" so the header row does not end up in the sorted output)
    docker image ls --format "{{.Size}}\t{{.Repository}}:{{.Tag}}" | sort -hr | head -20
    # Remove dangling (unreferenced) volumes
    docker volume rm $(docker volume ls -qf dangling=true)
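
    Container writable layers are another common culprit; `docker ps --size` reports the per-container writable-layer size separately from the shared image size:

    bash
    docker ps -a --size --format 'table {{.Names}}\t{{.Size}}'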

    Step 4: Configure Docker Daemon for Production

    Prevent recurrence by configuring storage driver options and setting log rotation in `/etc/docker/daemon.json`.

    json
    {"storage-driver": "overlay2",
    "storage-opts": ["overlay2.override_kernel_check=true"],
    "log-driver": "json-file",
    "log-opts": {"max-size": "10m", "max-file": "3"}}

    Step 5: Implement Proactive Monitoring & Housekeeping

    Schedule regular cleanup and monitor Docker disk usage. Integrate this into your CI/CD or orchestration platform.

    bash
    # Cron job for weekly cleanup (Sundays at 02:00), keeping anything newer than 7 days
    0 2 * * 0 docker system prune -a --volumes --filter "until=168h" --force
    # Monitor script snippet: prune when the filesystem backing the data root passes the threshold
    THRESHOLD=90
    USAGE=$(df /var/lib/docker | awk 'NR==2 {print $5}' | sed 's/%//')
    if [ "$USAGE" -gt "$THRESHOLD" ]; then docker system prune -f; fi
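
    The same threshold pattern extends to inodes, which the block-usage check above misses (a sketch reusing the `THRESHOLD` variable from the snippet above):

    bash
    IUSAGE=$(df -i /var/lib/docker | awk 'NR==2 {print $5}' | sed 's/%//')
    if [ "$IUSAGE" -gt "$THRESHOLD" ]; then docker system prune -f; fi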

    Step 6: Relocate Docker Data Root (If Root Partition is Full)

    If the default `/var/lib/docker` is on a small root partition, move it to a dedicated, larger filesystem.

    bash
    systemctl stop docker
    mkdir -p /mnt/data/docker
    rsync -aqxP /var/lib/docker/ /mnt/data/docker/
    # If daemon.json already exists (e.g. from Step 4), add "data-root" to it instead of overwriting:
    echo '{"data-root": "/mnt/data/docker"}' > /etc/docker/daemon.json
    systemctl start docker
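
    Before deleting the old `/var/lib/docker`, confirm the daemon is really using the new location:

    bash
    docker info --format '{{.DockerRootDir}}'   # should print /mnt/data/docker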

    Architect's Pro Tip

    "The error often points to inode exhaustion, not block storage. Check `df -i`. High churn of small files in containers (like log spam or temp files) can fill inodes while `df -h` shows free space."

    Frequently Asked Questions

    I ran `docker system prune` but space wasn't freed. Why?

    Prune only removes *unused* objects. Images referenced by any container, running or stopped, are kept; without `-a` only dangling images are deleted; and on recent Docker releases `--volumes` prunes only anonymous volumes, never named ones. Use `docker ps -a`, `docker image ls`, and `docker volume ls` to identify and manually remove specific, non-critical items.
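
    A quick way to see what is still counted as in use (illustrative one-liners, adjust the columns to taste):

    bash
    docker ps -a --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}'   # containers pinning images
    docker volume ls --format 'table {{.Name}}\t{{.Driver}}'            # volumes, including named ones prune keeps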

    What's the difference between 'disk full' and 'no space left on device' for Docker?

    They usually describe the same condition, but 'no space left on device' can also be raised by the Docker storage driver itself: a devicemapper thin pool (especially in loop-lvm mode) or the overlay2 backing filesystem can hit its limit, including inode exhaustion, before the general block device is full.
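
    `docker info` helps tell the two apart: with devicemapper it reports thin-pool data space, and a 'Data loop file' line indicates the loop-lvm setup. Output fields vary by version and driver, so treat this as a rough check:

    bash
    docker info 2>/dev/null | grep -Ei 'storage driver|data space|loop file'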

    How do I prevent this in Kubernetes (Docker as runtime)?

    Configure kubelet garbage collection: set `--image-gc-high-threshold`, `--image-gc-low-threshold`, and `--eviction-hard` flags (e.g., `nodefs.available<10%`). Also, use `emptyDir` volume size limits and consider a CSI driver with storage quotas.
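
    As a sketch, those settings map onto the kubelet invocation roughly as below; the exact wiring depends on how your distribution launches the kubelet, and current releases prefer the equivalent fields in a KubeletConfiguration file over command-line flags.

    bash
    # Illustrative thresholds; tune for your nodes
    kubelet \
      --image-gc-high-threshold=80 \
      --image-gc-low-threshold=60 \
      --eviction-hard='nodefs.available<10%,imagefs.available<15%'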
