Automatically Match Deep Learning Jobs to the Optimal Amount of GPU Compute
Run:AI is a proud member of the NVIDIA DGX-Ready Software program, certified to seamlessly run our GPU Compute Management Platform on NVIDIA DGX™ systems. The platform is a Kubernetes-based software solution for high-performance orchestration of containerized AI workloads on GPUs.
How Much Compute is Available?
No Need to Guess
The Run:AI platform enables GPU clusters to be utilized for different Deep Learning workloads dynamically – from build, to train, to inference. Clusters can easily be used for build and train only, for inference only, or for mixed workloads combining build, train, and inference simultaneously.
GPUs play an important role in each of the stages of Deep Learning:
- Build – interactive sessions for development and debugging. Requires on-demand, always-available GPU access, but less GPU power.
- Train – jobs that run to completion and consume massive computing power. Requires multi-GPU and multi-node distributed training; performance is important.
- Inference – model serving in real time or offline. Requires minimal GPU power, but needs the capability to auto-scale efficiently.
Dynamic, Granular Scheduling
With Run:AI, jobs at any stage automatically get access to the compute power they need. The Kubernetes-based scheduler queues jobs and executes them according to priority. Important jobs can preempt others based on fairness policies, and jobs can exceed their predefined quota when idle resources are available. Spin model inference services up and down according to demand, not guesswork.
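To make the queueing idea concrete, here is a minimal sketch of priority-ordered admission with per-project quotas and over-quota use of idle GPUs. It is an illustration only, not Run:AI's scheduler: the function name `schedule`, the job tuple layout, and the project names are all hypothetical, and preemption is reduced to flagging over-quota jobs as the ones that would be reclaimed first.

```python
import heapq


def schedule(jobs, total_gpus, quotas):
    """Toy admission pass (hypothetical; not the Run:AI implementation).

    jobs   : list of (name, project, gpus, priority); higher priority runs first
    quotas : dict mapping project -> guaranteed number of GPUs
    Returns (in_quota, over_quota): lists of admitted job names.
    """
    # Negate priority so the min-heap pops the highest priority first;
    # the submission index breaks ties in arrival order.
    queue = [(-prio, i, name, project, gpus)
             for i, (name, project, gpus, prio) in enumerate(jobs)]
    heapq.heapify(queue)

    free = total_gpus
    used = {project: 0 for project in quotas}
    in_quota, waiting = [], []

    # Pass 1: admit jobs that fit within their project's quota.
    while queue:
        _, _, name, project, gpus = heapq.heappop(queue)
        if gpus <= free and used[project] + gpus <= quotas[project]:
            used[project] += gpus
            free -= gpus
            in_quota.append(name)
        else:
            waiting.append((name, project, gpus))

    # Pass 2: let waiting jobs borrow GPUs that would otherwise sit idle.
    # In a real system these over-quota jobs would be preempted first.
    over_quota = []
    for name, project, gpus in waiting:
        if gpus <= free:
            used[project] += gpus
            free -= gpus
            over_quota.append(name)

    return in_quota, over_quota


if __name__ == "__main__":
    jobs = [
        ("train-a", "team-a", 4, 2),
        ("build-b", "team-b", 1, 3),
        ("train-a2", "team-a", 2, 1),
        ("infer-b", "team-b", 1, 2),
    ]
    print(schedule(jobs, total_gpus=8, quotas={"team-a": 4, "team-b": 4}))
```

In this example, team-a's second training job exceeds the team's quota of 4 GPUs, but it is still admitted because 2 GPUs are idle after the in-quota jobs are placed.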