Platform
What runs on our hardware, how it's deployed, and the software stack behind it. Customers pick the deployment model that matches their workload — we don't force one shape on everyone.
Deployment models
Four ways to run workloads.
The choice depends on isolation requirements, performance needs, and how the customer prefers to operate.
Bare-metal
Dedicated GPU servers or clusters assigned to a single customer. No hypervisor in the compute path. Suited for training runs and workloads where every microsecond matters.
VM with passthrough
KVM-based VMs with GPUs assigned through PCIe passthrough. OS-level isolation with near-native GPU performance.
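In libvirt terms, passthrough means handing the GPU's PCI function directly to the guest. A minimal sketch of the relevant domain-XML fragment, assuming a placeholder PCI address (use the address reported by lspci on the actual host):

```xml
<!-- Illustrative libvirt <hostdev> fragment for GPU passthrough.
     The PCI address 0000:3b:00.0 is a placeholder. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x3b' slot='0x00' function='0x0'/>
  </source>
</hostdev>
```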
Container orchestration
Kubernetes with the NVIDIA GPU Operator and NVIDIA Container Toolkit. For customers who already run AI pipelines in containers.
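With the GPU Operator installed, containers request GPUs through the standard `nvidia.com/gpu` extended resource. An illustrative pod spec (the pod name and image tag are placeholders):

```yaml
# Hypothetical smoke-test pod; the scheduler places it on a node
# with a free GPU advertised by the NVIDIA device plugin.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
```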
Multi-node clusters
Distributed training across multiple GPU nodes connected through NVIDIA fabric. For larger models and HPC workloads.
Where partial GPU allocation makes sense, we use NVIDIA Multi-Instance GPU (MIG) for hardware-level isolation between partitions. We don't share GPUs across unrelated customers without MIG, VM, or physical separation.
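On MIG-capable GPUs, partitioning is done with nvidia-smi. A command sketch (profile IDs vary by GPU model, so list them before creating instances):

```shell
nvidia-smi -i 0 -mig 1    # enable MIG mode on GPU 0 (may require a GPU reset)
nvidia-smi mig -lgip      # list the GPU instance profiles this GPU supports
nvidia-smi mig -cgi 9,9 -C  # e.g. create two GPU instances, each with a compute instance
```

Each resulting instance has its own memory and compute slice, which is what makes per-partition isolation a hardware property rather than a scheduler policy.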
Software stack
Linux-native, NVIDIA-aligned.
Standard tools, no proprietary lock-in.
Host and hypervisor
Ubuntu Server 24.04 LTS. KVM/QEMU with libvirt. VFIO and PCIe passthrough.
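Making a GPU available for passthrough means binding it to vfio-pci instead of the host driver at boot. A sketch of the host-side config, assuming IOMMU is enabled on the kernel command line (e.g. intel_iommu=on iommu=pt) and using a placeholder vendor:device ID (read yours from lspci -nn):

```shell
# /etc/modprobe.d/vfio.conf -- claim the GPU for vfio-pci before
# the nvidia driver can bind it. 10de:20b0 is a placeholder ID.
options vfio-pci ids=10de:20b0
softdep nvidia pre: vfio-pci
```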
Containers and orchestration
Docker and containerd. Upstream Kubernetes. NVIDIA Container Toolkit and NVIDIA GPU Operator.
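Outside Kubernetes, the Container Toolkit exposes GPUs to plain Docker runs via the `--gpus` flag. A one-line sketch (the image tag is illustrative):

```shell
# Run a CUDA base container with access to all host GPUs.
docker run --rm --gpus all nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```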
GPU partitioning
NVIDIA Multi-Instance GPU (MIG) for hardware-isolated partitions. Full-GPU passthrough where the workload demands it.
Networking and fabric
NVIDIA ConnectX-8 SuperNICs, BlueField-3 DPUs, and Unified Fabric Manager (UFM) for cluster networking.
Ecosystem
AI and HPC, out of the box.
The standard NVIDIA AI ecosystem runs out of the box, plus the open-source frameworks customers already use.
NVIDIA components
CUDA Toolkit, cuDNN, NCCL, NVIDIA HPC SDK, TensorRT, TensorRT-LLM, Triton Inference Server, NeMo, NIM microservices, and the NGC container catalog.
AI / ML frameworks
PyTorch, TensorFlow, JAX, Hugging Face Transformers, DeepSpeed, Megatron-LM, vLLM, Text Generation Inference.
HPC and scientific
RAPIDS, NVIDIA Modulus, GROMACS, LAMMPS, and the broader GPU-accelerated scientific computing ecosystem.
NVIDIA AI Enterprise
Available as an optional subscription for customers who want NVIDIA-supported AI software components and certified containers.
Monitoring
Telemetry without lock-in.
NVIDIA DCGM for GPU-level telemetry. Prometheus and Grafana for everything else. Customers can plug into their own observability stacks where needed.
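DCGM metrics typically reach Prometheus through NVIDIA's dcgm-exporter, which serves metrics over HTTP (port 9400 by default). An illustrative scrape job (the job name and target host are placeholders):

```yaml
# Hypothetical Prometheus scrape config for dcgm-exporter.
scrape_configs:
  - job_name: dcgm
    static_configs:
      - targets: ["gpu-node-01:9400"]
```

Because this is plain Prometheus exposition format, customers pointing their own observability stack at the same endpoint need no adapter.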
What we don't do
- We don't run uncontrolled time-sliced GPU sharing across unrelated tenants.
- We don't use VMware vSphere, Microsoft Hyper-V, or other proprietary virtualization stacks.
- We don't resell GPU access onward through other operators or aggregators.