Platform
What runs on our hardware, how it's deployed, and the software stack behind it. Customers pick the deployment model that matches their workload — we don't force one shape on everyone.
Deployment models
Four ways to run workloads.
The choice depends on isolation requirements, performance needs, and how the customer prefers to operate.
Bare-metal
Dedicated GPU servers or clusters assigned to a single customer. No hypervisor in the compute path. Suited for training runs and workloads where every microsecond matters.
VM with passthrough
KVM-based VMs with GPUs assigned through PCIe passthrough. OS-level isolation with near-native GPU performance.
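In libvirt terms, passthrough means handing the GPU's PCI function directly to the guest. A minimal sketch of the relevant domain-XML fragment, assuming a placeholder PCI address (use the address reported by lspci on the actual host):

```xml
<!-- Illustrative libvirt <hostdev> fragment for GPU passthrough.
     The PCI address 0000:3b:00.0 is a placeholder. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x3b' slot='0x00' function='0x0'/>
  </source>
</hostdev>
```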
Container orchestration
Kubernetes with the NVIDIA GPU Operator and NVIDIA Container Toolkit. For customers who already run AI pipelines in containers.
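With the GPU Operator installed, containers request GPUs through the standard `nvidia.com/gpu` extended resource. An illustrative pod spec (the pod name and image tag are placeholders):

```yaml
# Hypothetical smoke-test pod; the scheduler places it on a node
# with a free GPU advertised by the NVIDIA device plugin.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
```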
Multi-node clusters
Distributed training across multiple GPU nodes connected through NVIDIA fabric. For larger models and HPC workloads.
Where partial GPU allocation makes sense, we use NVIDIA Multi-Instance GPU (MIG) for hardware-level isolation between partitions. We don't share GPUs across unrelated customers without MIG, VM, or physical separation.
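On MIG-capable GPUs, partitioning is done with nvidia-smi. A command sketch (profile IDs vary by GPU model, so list them before creating instances):

```shell
nvidia-smi -i 0 -mig 1    # enable MIG mode on GPU 0 (may require a GPU reset)
nvidia-smi mig -lgip      # list the GPU instance profiles this GPU supports
nvidia-smi mig -cgi 9,9 -C  # e.g. create two GPU instances, each with a compute instance
```

Each resulting instance has its own memory and compute slice, which is what makes per-partition isolation a hardware property rather than a scheduler policy.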
Software stack
Linux-native, NVIDIA-aligned.
Standard tools, no proprietary lock-in.
Host and hypervisor
Ubuntu Server 24.04 LTS. KVM/QEMU with libvirt. VFIO and PCIe passthrough.
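Making a GPU available for passthrough means binding it to vfio-pci instead of the host driver at boot. A sketch of the host-side config, assuming IOMMU is enabled on the kernel command line (e.g. intel_iommu=on iommu=pt) and using a placeholder vendor:device ID (read yours from lspci -nn):

```shell
# /etc/modprobe.d/vfio.conf -- claim the GPU for vfio-pci before
# the nvidia driver can bind it. 10de:20b0 is a placeholder ID.
options vfio-pci ids=10de:20b0
softdep nvidia pre: vfio-pci
```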
Containers and orchestration
Docker and containerd. Upstream Kubernetes. NVIDIA Container Toolkit and NVIDIA GPU Operator.
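Outside Kubernetes, the Container Toolkit exposes GPUs to plain Docker runs via the `--gpus` flag. A one-line sketch (the image tag is illustrative):

```shell
# Run a CUDA base container with access to all host GPUs.
docker run --rm --gpus all nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```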
GPU partitioning
NVIDIA Multi-Instance GPU (MIG) for hardware-isolated partitions. Full-GPU passthrough where the workload demands it.
Networking and fabric
NVIDIA ConnectX-8 SuperNICs, BlueField-3 DPUs, and Unified Fabric Manager (UFM) for cluster networking.
Ecosystem
AI and HPC, out of the box.
The standard NVIDIA AI ecosystem runs out of the box, plus the open-source frameworks customers already use.
NVIDIA components
CUDA Toolkit, cuDNN, NCCL, NVIDIA HPC SDK, TensorRT, TensorRT-LLM, Triton Inference Server, NeMo, NIM microservices, and the NGC container catalog.
AI / ML frameworks
PyTorch, TensorFlow, JAX, Hugging Face Transformers, DeepSpeed, Megatron-LM, vLLM, Text Generation Inference.
HPC and scientific
RAPIDS, NVIDIA Modulus, GROMACS, LAMMPS, and the broader GPU-accelerated scientific computing ecosystem.
NVIDIA AI Enterprise
Available as an optional subscription for customers who want NVIDIA-supported AI software components and certified containers.
Monitoring
Telemetry without lock-in.
NVIDIA DCGM for GPU-level telemetry. Prometheus and Grafana for everything else. Customers can plug into their own observability stacks where needed.
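DCGM metrics typically reach Prometheus through NVIDIA's dcgm-exporter, which serves metrics over HTTP (port 9400 by default). An illustrative scrape job (the job name and target host are placeholders):

```yaml
# Hypothetical Prometheus scrape config for dcgm-exporter.
scrape_configs:
  - job_name: dcgm
    static_configs:
      - targets: ["gpu-node-01:9400"]
```

Because this is plain Prometheus exposition format, customers pointing their own observability stack at the same endpoint need no adapter.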
What we don't do
- We don't run uncontrolled time-sliced GPU sharing across unrelated tenants.
- We don't use VMware vSphere, Microsoft Hyper-V, or other proprietary virtualization stacks.
- We don't resell GPU access onward through other operators or aggregators.