Deploy a Container
Run a workload on a GPU shard from the panel or the command line.
You can deploy a sharded container in two ways: through the panel, or directly with
docker run.
From the panel
- Open http://localhost:3000.
- Select a GPU instance and drag the memory allocation to the shard size you want.
- Enter the container image and (optionally) a port to expose and registry credentials.
- Click Deploy Workload.
The panel generates the docker run command for you, including the libvgpu preload
and memory limit, and streams the container status back to the UI.
From the command line
The equivalent of a 4 GB shard running PyTorch:
docker run --rm --gpus all \
-e LD_PRELOAD=/libvgpu/build/libvgpu.so \
-e CUDA_DEVICE_MEMORY_LIMIT=4096m \
-p 8080:8080 \
pytorch/pytorch:latest \
python your_script.py
Key flags:
- --gpus all: exposes the physical GPU to the container.
- LD_PRELOAD: loads the interposer that enforces the cap.
- CUDA_DEVICE_MEMORY_LIMIT: the shard size (see Memory Limits & Shards).
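To deploy shards of different sizes from a script, the command above can be parameterized. The following is a minimal sketch, not part of the panel or of libvgpu: shard_cmd is a hypothetical helper name, it assumes the limit is given in whole megabytes, and it assumes the library is reachable at /libvgpu/build/libvgpu.so inside the container, as in the example above. It only prints the command so you can review it before running it, and it leaves out port publishing (-p).

```shell
#!/usr/bin/env bash
# Sketch: print the docker run command for a shard of a given size.
# shard_cmd is a hypothetical helper, not part of the panel or libvgpu.
shard_cmd() {
  local limit_mb="$1" image="$2"
  shift 2
  # Reject anything that is not a positive whole number of megabytes.
  case "$limit_mb" in
    ''|*[!0-9]*) echo "limit must be a positive integer (MB)" >&2; return 1 ;;
  esac
  [ "$limit_mb" -gt 0 ] || { echo "limit must be greater than 0" >&2; return 1; }
  # Same flags as the command above; only the limit, image, and command vary.
  echo "docker run --rm --gpus all" \
    "-e LD_PRELOAD=/libvgpu/build/libvgpu.so" \
    "-e CUDA_DEVICE_MEMORY_LIMIT=${limit_mb}m" \
    "$image" "$@"
}
```

For example, shard_cmd 2048 pytorch/pytorch:latest python your_script.py prints the command for a 2 GB shard. Printing instead of executing keeps the helper safe to try on a machine without a GPU.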
Using a private image
If your image lives in a private registry, log in on the host first:
docker login registry.example.com
Or provide the username and token in the panel's Private Repository fields before deploying.
The container image does not need anything special — the cap is applied from the outside via the preloaded library.
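On hosts where deployments are scripted, the host login for a private registry can be done without a prompt. This is a sketch under stated assumptions: registry_login, REGISTRY_USER, and REGISTRY_TOKEN are hypothetical names chosen for illustration, and registry.example.com is the placeholder used above. It relies on the standard --username and --password-stdin options of docker login.

```shell
#!/usr/bin/env bash
# Sketch: non-interactive registry login on the host.
# registry_login, REGISTRY_USER and REGISTRY_TOKEN are hypothetical names.
registry_login() {
  local registry="$1"
  # Abort with a message if either credential is unset or empty.
  : "${REGISTRY_USER:?set REGISTRY_USER}" "${REGISTRY_TOKEN:?set REGISTRY_TOKEN}"
  # Passing the token on stdin keeps it out of the docker command line.
  printf '%s' "$REGISTRY_TOKEN" |
    docker login --username "$REGISTRY_USER" --password-stdin "$registry"
}
```

Call it as registry_login registry.example.com with both variables exported, then run the deploy command as usual.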