NVIDIA Base Command Platform provides centralized, cloud-hosted management of the entire artificial intelligence (AI) development lifecycle. It gives data scientists and IT teams ready-to-use tools for AI training, including workflow and resource management. It also lets multiple teams share AI infrastructure without interfering with one another.
NVIDIA Base Command Platform offers a web user interface (UI) and a command-line interface (CLI) that enable the execution of AI workloads on right-sized resources, such as a single GPU or a multi-node cluster. It also provides dataset management, which speeds the delivery of production-ready AI models and applications.
The platform includes a set of built-in telemetry features to help evaluate deep learning (DL) techniques, resource allocations, and workload settings. Its reporting and visibility capabilities give organizations the insight to measure the progress of AI projects against business goals. Team managers can use these capabilities to define project priorities and plans by forecasting computing capacity needs.
NVIDIA Base Command Platform is an AI training service that helps businesses and data scientists accelerate the AI development process. It centralizes end-to-end AI training processes, including job scheduling, resource sharing, and dataset management, through an intuitive user interface (UI), command-line interface, reporting dashboard, and integrated monitoring.
Base Command provides Kubernetes, Slurm, and Jupyter Notebook environments for NVIDIA DGX systems (NVIDIA’s line of multi-GPU AI servers and workstations), offering an easy-to-use scheduling and orchestration solution that meets the requirements of large enterprises. It gives teams access to the tools they already use, with unified management and NVIDIA support.
Data science and deep learning practitioners require optimized, ready-to-run AI software and end-to-end management of AI experiments and workflow.
NVIDIA Base Command can configure and manage AI workloads and provides unified dataset management. It ensures AI workloads run on resources of the right size, from single GPUs to large multi-node clusters. Cloud-hosted management features enable a common user experience and control over NVIDIA DGX SuperPOD (a cluster of DGX systems).
Base Command accommodates AI tools and work methods, with consistent functionality across the web UI, API, and command line. A large selection of optimized, pre-built containers with deep learning frameworks, data science tools, and trained models is available through the NVIDIA NGC Catalog, allowing data scientists to build a production-ready model faster.
An AI project is multifaceted and highly iterative in nature, requiring constant fine-tuning. NVIDIA Base Command enables IT teams and AI professionals to analyze and optimize the use of AI resources using built-in telemetry. Management can access reports and dashboards that help them track the progress of AI initiatives and improve AI infrastructure.
Here are some common concepts of the NVIDIA Base Command Platform:
You need a Base Command Platform account to use the platform. The organization’s admin creates accounts for employees, and each account can be activated by mapping the employee’s email address to the organization’s single sign-on (SSO).
Datasets contain read-only data for repeatable, well-documented, and scalable workloads. They may be accessible enterprise-wide or only to a specific team. Under the Base Command menu, you can select Datasets to view the datasets accessible to you, your organization, or your team.
Datasets are critical for deep learning jobs, providing shareable data for training and production workloads. You can mount multiple datasets to one job, while multiple users and jobs can access the same dataset simultaneously.
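The many-to-many relationship between jobs and datasets can be sketched in a few lines of Python. This is only an illustrative model, not the platform's API: the `DatasetMount` and `Job` classes, the dataset IDs, and the mount paths are all hypothetical. The `--datasetid <id>:<mount point>` form rendered by `cli_args` is an assumption about the NGC CLI's `ngc batch run` command and should be checked against `ngc batch run --help` before use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DatasetMount:
    """One read-only dataset attached to a job (hypothetical model)."""
    dataset_id: str   # hypothetical dataset ID
    mount_point: str  # path where the dataset appears inside the job


@dataclass
class Job:
    """A job that can mount any number of datasets (hypothetical model)."""
    name: str
    mounts: List[DatasetMount] = field(default_factory=list)

    def mount(self, dataset_id: str, mount_point: str) -> None:
        # Two datasets cannot share one mount point within the same job.
        if any(m.mount_point == mount_point for m in self.mounts):
            raise ValueError(f"mount point already in use: {mount_point}")
        self.mounts.append(DatasetMount(dataset_id, mount_point))

    def cli_args(self) -> List[str]:
        # Assumed flag format: --datasetid <id>:<mount point>, once per dataset.
        args: List[str] = []
        for m in self.mounts:
            args += ["--datasetid", f"{m.dataset_id}:{m.mount_point}"]
        return args


# One job mounts two datasets; a second job reuses one of them.
train_job = Job("train")
train_job.mount("1001", "/data/images")
train_job.mount("1002", "/data/labels")

eval_job = Job("eval")
eval_job.mount("1001", "/data/images")
```

Because datasets are read-only, both jobs can read dataset `1001` at the same time without coordinating with each other.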
To mount a dataset:
A workspace is a persistent, shareable storage unit that you can mount to a job to enable concurrent access. Each workspace has an ID and, optionally, a name. Workspaces count toward your storage quota.
The main purpose of a workspace is to enable data sharing between jobs, such as for re-training and checkpoints. It also facilitates collaboration between multiple users, providing a convenient place to store and sync code. Multiple jobs can write to the same workspace. Workspaces can serve as network home directories and shared storage spaces for specific teams.
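As a rough illustration of the checkpoint use case, the sketch below shows how one job might save training state to a mounted workspace and a later job might resume from it. It assumes the workspace is mounted read-write at some directory inside the job; the function names, the `checkpoints` subdirectory, and the JSON file layout are hypothetical choices for this example, not part of the platform.

```python
import json
import os
from pathlib import Path
from typing import Optional


def save_checkpoint(workspace: Path, step: int, state: dict) -> Path:
    """Write a checkpoint into the workspace (hypothetical file layout)."""
    ckpt_dir = workspace / "checkpoints"
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    final = ckpt_dir / f"step_{step:08d}.json"
    tmp = final.with_suffix(".tmp")
    tmp.write_text(json.dumps({"step": step, "state": state}))
    # Rename into place so a concurrent reader never sees a partial file.
    os.replace(tmp, final)
    return final


def load_latest_checkpoint(workspace: Path) -> Optional[dict]:
    """Return the highest-step checkpoint, or None if none exists yet."""
    ckpt_dir = workspace / "checkpoints"
    if not ckpt_dir.is_dir():
        return None
    # Zero-padded step numbers make lexicographic order match numeric order.
    candidates = sorted(ckpt_dir.glob("step_*.json"))
    if not candidates:
        return None
    return json.loads(candidates[-1].read_text())
```

A first job would call `save_checkpoint(Path("/mnt/workspace"), step, state)` periodically, and a re-training job mounting the same workspace would call `load_latest_checkpoint(Path("/mnt/workspace"))` at startup. The path `/mnt/workspace` is a placeholder for whatever mount point the job specifies.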
To create an NGC workspace:
Under Create a Workspace, enter the workspace’s name and choose the ACE you want to attach to the workspace.
NVIDIA Deep Learning GPU: Choosing the Right GPU for Your Project
NVIDIA Deep Learning GPUs provide high processing power for training deep learning models. This article provides a review of three top NVIDIA GPUs—NVIDIA Tesla V100, GeForce RTX 2080 Ti, and NVIDIA Titan RTX.
Learn what the NVIDIA deep learning SDK is, which NVIDIA GPUs are best suited to deep learning, and which best practices to adopt when using NVIDIA GPUs.
NVIDIA DGX: Under the Hood of DGX-1, DGX-2 and A100
DGX is a line of servers and workstations built by NVIDIA, which can run large, demanding machine learning and deep learning workloads on GPUs. DGX provides a massive amount of computing power, between 1 and 5 petaFLOPS in one DGX system. It also provides advanced technology for interlinking GPUs and enabling massive parallelization across thousands of GPU cores.
Get an in-depth look into three generations of the NVIDIA DGX series, including hardware architecture, software architecture, networking and scalability features.
NVIDIA NGC: Features, Popular Containers, and a Quick Tutorial
NVIDIA NGC is a repository of containerized applications you can use in deep learning, machine learning, and high performance computing (HPC) projects. These applications are optimized for running on NVIDIA GPU hardware, and the repository also contains pre-trained models and Helm charts that let you deploy applications in Kubernetes clusters.
Learn about NVIDIA NGC, a repository of containerized applications for machine learning. See examples of containers offered on NGC and learn how to get started.