Deploy a Container

Run a workload on a GPU shard from the panel or the command line.

You can deploy a sharded container in two ways: through the panel, or directly with docker run.

From the panel

1. Open http://localhost:3000.
2. Select a GPU instance and drag the memory allocation to the shard size you want.
3. Enter the container image and, optionally, a port to expose and registry credentials.
4. Click Deploy Workload.

The panel generates the docker run command for you, including the libvgpu preload and memory limit, and streams the container status back to the UI.

From the command line

The equivalent command for a 4 GB shard running PyTorch:

docker run --rm --gpus all \
  -e LD_PRELOAD=/libvgpu/build/libvgpu.so \
  -e CUDA_DEVICE_MEMORY_LIMIT=4096m \
  -p 8080:8080 \
  pytorch/pytorch:latest \
  python your_script.py

Key flags:

--gpus all — exposes the physical GPU to the container.
LD_PRELOAD — loads the interposer that enforces the cap.
CUDA_DEVICE_MEMORY_LIMIT — the shard size (see Memory Limits & Shards).
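
To reuse the same flags across shard sizes, the command can be wrapped in a small shell function. This is a minimal sketch, not the panel's actual generator: the function name shard_run_cmd and its arguments are hypothetical, the limit is assumed to be given in megabytes as in the example above, and the function only prints the command so you can inspect it before running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the product): prints a docker run command
# for a shard of the given size.
# Usage: shard_run_cmd <limit_mb> <image> [command...]
shard_run_cmd() {
  if [ "$#" -lt 2 ]; then
    echo "usage: shard_run_cmd <limit_mb> <image> [command...]" >&2
    return 1
  fi
  local limit_mb="$1" image="$2"
  shift 2
  # Reject anything that is not a whole number of megabytes.
  case "$limit_mb" in
    ''|*[!0-9]*) echo "limit must be a whole number of MB" >&2; return 1 ;;
  esac
  # Same flags as the example above, with the shard size filled in.
  printf 'docker run --rm --gpus all -e LD_PRELOAD=/libvgpu/build/libvgpu.so -e CUDA_DEVICE_MEMORY_LIMIT=%sm %s' \
    "$limit_mb" "$image"
  # Append the optional container command, one word at a time.
  if [ "$#" -gt 0 ]; then
    printf ' %s' "$@"
  fi
  printf '\n'
}
```

For example, shard_run_cmd 4096 pytorch/pytorch:latest python your_script.py prints the command shown earlier without the -p port mapping. Arguments are printed unquoted, so the output is meant for review rather than for piping straight into a shell.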

Using a private image

If your image lives in a private registry, log in on the host first:

docker login registry.example.com

Or provide the username and token in the panel's Private Repository fields before deploying.
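
For scripted deployments, the host-side login can also be done without an interactive prompt. The following is a sketch under stated assumptions: the function name registry_login and the environment variables REGISTRY_USER and REGISTRY_TOKEN are hypothetical names chosen for illustration. It relies on the --password-stdin option of docker login, which reads the token from standard input so it does not appear in the process list or shell history.

```shell
#!/usr/bin/env bash
# Hypothetical helper: logs in to a private registry without prompting.
# Expects REGISTRY_USER and REGISTRY_TOKEN in the environment (assumed names).
# Usage: registry_login <registry-host>
registry_login() {
  local registry="$1"
  if [ -z "$registry" ]; then
    echo "usage: registry_login <registry-host>" >&2
    return 1
  fi
  if [ -z "${REGISTRY_USER:-}" ] || [ -z "${REGISTRY_TOKEN:-}" ]; then
    echo "REGISTRY_USER and REGISTRY_TOKEN must be set" >&2
    return 1
  fi
  # Pipe the token in on stdin instead of passing it as an argument.
  printf '%s' "$REGISTRY_TOKEN" |
    docker login --username "$REGISTRY_USER" --password-stdin "$registry"
}
```

With both variables exported, registry_login registry.example.com performs the same login as the command above.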

The container image does not need anything special, because the cap is applied from outside the image by the preloaded library.
