Virtualization Software for AI Infrastructure
What Powerful AI Could You Create With Unlimited Compute?
Run:AI's orchestration and virtualization platform optimizes compute for AI infrastructure and speeds up data science initiatives.
Get More Out of Existing GPU Infrastructure
From Fractional GPU to Multi-Node and Multi-Cluster Scale
Run:AI’s Deep Learning (DL) acceleration platform helps organizations manage Graphics Processing Unit (GPU) resource allocation and increase cluster utilization. Run:AI pools compute resources and then applies advanced scheduling to dynamically set policies and orchestrate jobs. IT gains full control over GPU utilization across nodes, clusters, and sites, while data scientists gain easy access to compute when and how they need it.
The Run:AI Platform
The diagram below shows where Run:AI software integrates with your AI software and hardware stack. Integration at each level is seamless and supports the build, train and inference stages of data science workflows.
Virtualization and orchestration layer for containerized AI workloads
Run:AI abstracts data science workflows from the underlying AI infrastructure, pooling all GPU resources, both on premises and in the cloud. IT teams retain control and gain real-time visibility, while data science teams get automatic access to as many resources as they need.
Guaranteed compute power when you need it
The platform is based on a virtualization layer built for containers: multiple containers can share a single GPU through memory isolation and shared processing cores. Containers can also consume more GPU capacity than they were allocated, as long as idle GPUs are available.
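To make the fractional-allocation idea concrete, here is a minimal sketch of a GPU pool that grants guaranteed fractions and exposes idle capacity for opportunistic bursting. All names and logic here are illustrative assumptions, not Run:AI's actual API or implementation.

```python
class GpuPool:
    """Toy model of a pool of GPUs shared in fractions between containers."""

    def __init__(self, total_gpus: float):
        self.total = total_gpus   # whole GPUs available in the pool
        self.allocated = 0.0      # sum of guaranteed fractions handed out

    def allocate(self, fraction: float) -> bool:
        """Grant a guaranteed GPU fraction if capacity remains."""
        if self.allocated + fraction <= self.total:
            self.allocated += fraction
            return True
        return False

    def burst_capacity(self) -> float:
        """Idle capacity a container may borrow beyond its allocation."""
        return self.total - self.allocated


pool = GpuPool(total_gpus=2.0)
pool.allocate(0.5)    # container A: half a GPU, guaranteed
pool.allocate(0.25)   # container B: a quarter of a GPU, guaranteed
print(pool.burst_capacity())  # 1.25 GPUs currently idle and borrowable
```

A real system must also enforce memory isolation on the device itself; this sketch only tracks the bookkeeping side of the policy.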
Automated policy management with the Kubernetes-based Run:AI Scheduler
The scheduler manages requests for compute using smart queuing based on preset priorities. Using gang scheduling, batch scheduling, and topology awareness, the platform dynamically allocates resources while maximizing GPU utilization.
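The gang-scheduling condition above can be sketched in a few lines: a job starts only when its entire GPU request fits at once, and lower-priority jobs may backfill what remains. This is a toy illustration under assumed semantics, not Run:AI's scheduler code.

```python
import heapq

def schedule(jobs, free_gpus):
    """jobs: list of (priority, name, gpus_needed); lower number = higher priority.
    Returns the names of jobs started, in scheduling order."""
    heap = list(jobs)
    heapq.heapify(heap)
    started = []
    while heap:
        priority, name, need = heapq.heappop(heap)
        if need <= free_gpus:      # gang condition: whole allocation or nothing
            free_gpus -= need
            started.append(name)
        # else: skipped this round (a real scheduler would re-queue the job)
    return started

# 8 free GPUs: "train-large" (6 GPUs) runs first by priority; "train-med"
# (4 GPUs) cannot fully fit, so "infer-small" (1 GPU) backfills the gap.
print(schedule([(0, "train-large", 6), (1, "train-med", 4), (2, "infer-small", 1)], 8))
# → ['train-large', 'infer-small']
```

Topology awareness would add a placement step (which nodes, which NVLink/PCIe domains), which is omitted here for brevity.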
Build, train and inference jobs get the resources they need
Run:AI pools GPUs into logical environments matched to data science workflows, so GPU utilization is maximized for the specific needs of each workload. Data scientists can provision resources automatically without depending on IT admins.
Simple management of complex workloads
Easily onboard new users, add and maintain hardware in the infrastructure pool, and gain visibility, including a holistic view of GPU usage and utilization.
Accelerate Time-to-Market for AI
Kubernetes-based Batch System Dynamically Allocates Resources
By moving from static to dynamic allocation of resources, researchers can use as much compute as their models need and see two to three times faster results from DL modeling. Run:AI’s unique approach adds GPU virtualization and advanced scheduling capabilities to Kubernetes, letting researchers work in a familiar environment and launch jobs easily without waiting for IT to allocate resources.
Fractional GPU is a GPU-sharing system for containers. Especially suited to AI workloads such as inference and lightweight training, it transparently gives data science and AI engineering teams the ability to run multiple workloads simultaneously on a single GPU.