Carve one NVIDIA GPU into memory-isolated slices for multiple containers.

© 2026 GPU Shards. All rights reserved.

Troubleshooting

Common issues when installing and running GPU Shards.

nvidia-smi works on the host but not in containers

The NVIDIA Container Toolkit is likely not configured as the Docker runtime. Re-run:

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Then test:

docker run --rm --gpus all nvidia/cuda:12.2.2-base-ubuntu22.04 nvidia-smi
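If the test still fails, it may help to confirm that the runtime was actually registered. The sketch below assumes the Docker daemon config lives at /etc/docker/daemon.json, the default file that nvidia-ctk edits. has_nvidia_runtime is a hypothetical helper name, and the check is a plain string match, not a JSON parse.

```shell
# has_nvidia_runtime [CONFIG_FILE]
# Returns 0 if the Docker daemon config mentions an "nvidia" runtime entry.
# Assumption: the config path defaults to /etc/docker/daemon.json.
has_nvidia_runtime() {
    config="${1:-/etc/docker/daemon.json}"
    [ -r "$config" ] && grep -q '"nvidia"' "$config"
}

# Usage:
#   has_nvidia_runtime && echo "nvidia runtime registered" \
#                      || echo "nvidia runtime missing"
```

A "missing" result suggests the configure step did not take effect, or that Docker reads its config from a different path on this host.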

A container ignores its memory limit

Confirm both pieces are present:

  • LD_PRELOAD=/libvgpu/build/libvgpu.so is set.
  • CUDA_DEVICE_MEMORY_LIMIT is set to a value with a unit (e.g. 4096m).

If LD_PRELOAD points at a path that does not exist inside the image, the dynamic loader ignores the library (it prints only an ld.so warning to stderr), the program starts normally, and the cap is not applied. Use the provided hami-core-demo:latest image, which has the library at that path.
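Both conditions can be checked from a shell inside the running container. The function below is a minimal sketch: check_shard_env is a hypothetical helper, and the accepted unit suffixes (k, m, g in either case) are an assumption based on the 4096m example above, so check the HAMi-core documentation for the exact set.

```shell
# check_shard_env PRELOAD_PATH LIMIT
# Prints one line per problem found; returns non-zero if any check fails.
check_shard_env() {
    preload="$1"
    limit="$2"
    status=0

    if [ -z "$preload" ]; then
        echo "LD_PRELOAD is not set"
        status=1
    elif [ ! -f "$preload" ]; then
        echo "LD_PRELOAD target does not exist: $preload"
        status=1
    fi

    number="${limit%?}"          # everything except the last character
    unit="${limit#"$number"}"    # the last character
    case "$number" in
        ""|*[!0-9]*) number_ok=no ;;
        *)           number_ok=yes ;;
    esac
    case "$unit" in
        [kKmMgG]) unit_ok=yes ;;   # assumed suffix set
        *)        unit_ok=no ;;
    esac

    if [ -z "$limit" ]; then
        echo "CUDA_DEVICE_MEMORY_LIMIT is not set"
        status=1
    elif [ "$number_ok" = no ] || [ "$unit_ok" = no ]; then
        echo "CUDA_DEVICE_MEMORY_LIMIT has no recognized unit: $limit"
        status=1
    fi
    return "$status"
}

# Usage, inside the container:
#   check_shard_env "${LD_PRELOAD:-}" "${CUDA_DEVICE_MEMORY_LIMIT:-}"
```

The values are passed as arguments rather than read from the environment so the function can also be tried on the host without exporting LD_PRELOAD there.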

"permission denied" talking to the Docker daemon

Your user is not in the docker group yet, or you have not started a new session since being added:

sudo usermod -aG docker "$USER"
# then log out and back in
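To tell the two cases apart, compare the groups recorded for your user with the groups of the current session. This is a small sketch with hypothetical helper names. It relies on `id -nG "$USER"` reading the group database while a bare `id -nG` reports the groups of the running shell.

```shell
# Return 0 if "docker" appears in a space-separated list of group names.
has_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *)            return 1 ;;
    esac
}

# docker_group_state RECORDED_GROUPS SESSION_GROUPS
# Prints: not-added, relogin-needed, or ok.
docker_group_state() {
    if ! has_docker_group "$1"; then
        echo "not-added"        # run the usermod command above
    elif ! has_docker_group "$2"; then
        echo "relogin-needed"   # added, but this session predates the change
    else
        echo "ok"
    fi
}

# Usage:
#   docker_group_state "$(id -nG "$USER")" "$(id -nG)"
```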

The panel can't reach the backend

Make sure both services are running: the frontend on :3000 and the backend on :8000. If you started them with run.sh, check its output for errors. Then check which processes are bound to those ports, in case something else has taken them:

sudo lsof -i :3000 -i :8000

CUDA out of memory immediately on start

The shard is too small for the model plus the CUDA context. Increase CUDA_DEVICE_MEMORY_LIMIT or pick a larger shard in the panel. See Memory Limits & Shards for sizing guidance.
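The sizing check is simple arithmetic: model memory plus CUDA context overhead must not exceed the limit. The sketch below uses hypothetical helpers and assumes binary units (1g = 1024 MiB). The model and context figures are inputs you measure yourself, since context overhead varies by GPU, driver, and framework.

```shell
# Convert a limit such as 4096m or 4g to whole MiB (assumes binary units).
limit_to_mib() {
    value="${1%?}"
    case "$1" in
        *[gG]) echo $(( $value * 1024 )) ;;
        *[mM]) echo "$value" ;;
        *[kK]) echo $(( $value / 1024 )) ;;
        *)     echo "unsupported limit: $1" >&2; return 1 ;;
    esac
}

# fits_in_shard LIMIT MODEL_MIB CONTEXT_MIB
# Returns 0 if model + context fit, 1 if not, 2 if the limit is unparseable.
fits_in_shard() {
    limit_mib=$(limit_to_mib "$1") || return 2
    needed=$(( $2 + $3 ))
    if [ "$needed" -le "$limit_mib" ]; then
        echo "fits: needs ${needed} MiB of ${limit_mib} MiB"
    else
        echo "too small: needs ${needed} MiB, shard is ${limit_mib} MiB"
        return 1
    fi
}

# Usage (illustrative numbers only):
#   fits_in_shard 4096m 3500 400
```

This covers only the startup footprint. Activations, caches, and batch size add to it at run time, so leave headroom above the computed minimum.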

Still stuck?

Check the HAMi-core documentation for the underlying library, or revisit the manual install guide.
