Shrinking Your AI Deployments: Optimizing PyTorch/CUDA Docker Images with uv
Deploying AI models in production often means packaging them into Docker containers. While convenient, these images can quickly grow to colossal sizes, especially when dealing with deep learning frameworks like PyTorch and their CUDA dependencies. Large images lead to slower deployments, increased storage costs, and longer CI/CD cycles.
The Problem with Large Images
Many developers build their Docker images from generic base images (like
ubuntu or python) and then manually install PyTorch, CUDA, and other
libraries. This often results in:
- Duplicate Dependencies: Installing CUDA drivers and PyTorch from scratch often pulls in many system libraries that might already be optimized or present in specialized base images.
- Version Mismatches: Managing CUDA, cuDNN, PyTorch, and Python versions manually can be a nightmare, leading to runtime errors.
- Bloated Layers: Each installation step adds a new layer, increasing the final image size unnecessarily.
- Huge Docker Images: The resulting image can easily reach around 10 GB after installing torch and CUDA (see the sketch below).
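For context, the bloat typically comes from a pattern like the following (an illustrative sketch, not a recommendation):

FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3 python3-pip
# This single line pulls the CUDA-enabled torch wheel plus a stack of
# nvidia-* library wheels, adding several gigabytes to the image
RUN pip3 install torch torchvision
COPY . /app
CMD ["python3", "/app/main.py"]

Every RUN line above becomes its own layer, and none of it benefits from a pre-optimized base.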
The Solution: Leverage pytorch/pytorch and uv
To combat this, we propose a two-pronged strategy:
- Start with an optimized base image: pytorch/pytorch images are pre-configured with PyTorch, CUDA, cuDNN, and often MKL/OpenBLAS, ensuring an optimized and compatible environment.
- Efficient dependency management with uv: Use uv (the fast Python package installer and resolver) to manage your project’s specific Python dependencies, critically excluding torch to avoid redundant installations.
Step-by-Step Optimization
Let’s walk through how to build a lean, optimized PyTorch/CUDA Docker image.
1. Choose Your pytorch/pytorch Base Image
The pytorch/pytorch Docker Hub repository offers a variety of images tailored
for different CUDA versions and Python environments. Select one that matches
your requirements. For example, for CUDA 11.8 and Python 3.10:
FROM pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime
Why pytorch/pytorch? These images are maintained by the PyTorch team,
providing:
- Pre-installed PyTorch, CUDA toolkit, and cuDNN.
- Optimized configurations for performance.
- Guaranteed compatibility between PyTorch and its underlying CUDA libraries.
- Reduced build time for these core components.
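To double-check what a given tag ships before committing to it, you can query the image directly. This works even without a GPU, since torch.version.cuda reports the compile-time CUDA version:

docker run --rm pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime \
    python -c "import torch; print(torch.__version__, torch.version.cuda)"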
2. Introduce uv for Fast and Lean Dependency Management
uv is a modern, extremely fast Python package installer and resolver. It’s an
excellent replacement for pip and can significantly speed up your Docker
builds while keeping image sizes small.
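For the examples below, suppose your project declares its dependencies in a pyproject.toml along these lines (the project name and the exact version pins are illustrative):

[project]
name = "my-ml-app"            # hypothetical project name
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "torch==2.6.0",           # provided by the base image; pruned at export time
    "torchvision==0.21.0",    # likewise pruned below
    "numpy==1.26.2",
    "pandas==2.1.3",
    "scikit-learn==1.3.2",
    "transformers==4.35.2",
]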
First, install uv in your Docker image. Then, use it to export your
project’s dependencies to requirements.txt. The crucial step here is to
tell uv not to include torch again, as it’s already provided by the base
image.
RUN pip install --no-cache-dir uv
COPY pyproject.toml uv.lock ./
RUN uv export --format requirements-txt \
    --no-hashes \
    --no-dev \
    --prune torch \
    --prune torchvision \
    -o requirements.txt
A note on requirements.txt and torch:
If your pytorch/pytorch base image already provides the torch build you need, make sure torch never ends up in the requirements.txt used for this image; the --prune flags above handle this at export time. This keeps the uv step clean and simple, avoiding potential conflicts and redundant downloads.
Example requirements.txt (without torch):
numpy==1.26.2
pandas==2.1.3
scikit-learn==1.3.2
transformers==4.35.2
# etc.
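As a cheap guard against regressions, you can assert in your build script that no torch pin slipped through the export (a sanity check, assuming a POSIX shell):

! grep -E '^torch(vision)?==' requirements.txt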
Full Dockerfile Example
FROM python:3.12-slim-trixie AS builder
WORKDIR /build
RUN pip install --no-cache-dir uv
COPY pyproject.toml uv.lock ./
# NOTE: exclude torch and torchvision below, since the runtime
# image already provides them
RUN uv export --format requirements-txt \
    --no-hashes \
    --no-dev \
    --prune torch \
    --prune torchvision \
    -o requirements.txt
# Use PyTorch official image with CUDA support (much smaller than building from scratch)
FROM pytorch/pytorch:2.6.0-cuda12.4-cudnn9-runtime
# Set environment variables
ENV PATH=/usr/local/cuda/bin:$PATH
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
WORKDIR /app
COPY --from=builder /build/requirements.txt requirements.txt
RUN pip install -r requirements.txt --no-cache-dir
COPY src .
CMD ["python3", "main.py"]
Benefits of this Approach
- Significantly Smaller Image Sizes: By avoiding duplicate installations of PyTorch and CUDA dependencies, your final image will be much leaner.
- Faster Builds: With the pre-built base image, the heavy Torch and CUDA layers are already cached, drastically improving build times. Additionally, uv’s speed, and the fact that only the lighter, project-specific dependencies need to be reinstalled when they change, accelerates builds further.
- Reduced Conflicts: Relying on the pytorch/pytorch image for the core ML stack minimizes the risk of version conflicts between CUDA, cuDNN, and PyTorch.
- Easier Maintenance: The PyTorch team handles the heavy lifting of optimizing the base environment; you just manage your application-specific libraries.
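To verify the savings yourself, inspect where the bytes actually go (again using the placeholder tag from above):

docker history --format "table {{.Size}}\t{{.CreatedBy}}" my-ml-app

The PyTorch/CUDA layers should appear only once, inherited from the base image, with your application dependencies as a single, comparatively small layer on top.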
Conclusion
Optimizing Docker images for deep learning is a critical step towards efficient
MLOps. By combining the power of the pre-optimized pytorch/pytorch base
images with the blazing-fast and dependency-aware uv installer, you can
drastically reduce your image sizes, accelerate your CI/CD pipelines, and
streamline your AI model deployments. Give it a try and experience the
difference!