NVIDIA NGC

Features, Popular Containers & Quick Tutorial

What Is NVIDIA NGC?

NVIDIA NGC is a repository of containerized applications you can use in deep learning, machine learning, and high performance computing (HPC) projects. These applications are optimized to run on NVIDIA GPU hardware and come with pre-trained models and Helm charts that let you deploy applications seamlessly in Kubernetes clusters.

The NGC catalog includes over 100 containers, as well as Helm charts, models, software development kits (SDKs), and other resources. Containers and supporting resources are organized into collections representing specific use cases. You can browse the full catalog on the NGC website.

This is part of our series of articles about NVIDIA A100.


Why Use NGC?

NVIDIA NGC is a catalog of software optimized for GPUs, which you can deploy anywhere—in an on-premises data center, in a public cloud, or on edge devices. NGC containers allow you to run data science projects “out of the box” without installing, configuring, or integrating the infrastructure.

NGC also empowers system administrators and IT teams. Using these containers, IT staff can provide the resources researchers need without having to install, update, and maintain complex infrastructure. NGC containers are pre-tested and tuned to run on GPU hardware, and are constantly updated to remain compatible with the latest hardware and compiler versions.

Related content: Read our guide to NVIDIA deep learning GPUs

NVIDIA NGC Features

NGC provides the following key features:

  • NGC Catalog—curated repository of GPU-optimized, containerized applications. Content is provided both from NVIDIA and third party software vendors.
  • Containers—self-contained units that let you deploy complex environments via one command.
  • Models—pre-trained models for a range of artificial intelligence (AI) tasks, such as natural language processing (NLP) and computer vision (CV). Models can be used immediately for inference, or you can apply transfer learning to fine-tune them on your own data.
  • Resources—reference neural architectures, documentation, and code samples that can help you get started.
  • Helm charts—Helm is a popular package manager for Kubernetes. If you are using Kubernetes, you can leverage NGC Helm charts to deploy data science environments onto Kubernetes clusters, including both GPU applications and SDKs (see the deployment sketch after this list).
  • SDKs—tooling that makes it easy for developers to work with AI applications, providing capabilities like annotation and data labeling, customizing models, transfer learning, and deployment physically near to the client application or user for low-latency inference.
  • NGC-Ready Program—a series of tests that benchmarks the performance of servers on a variety of AI/ML workloads, using NVIDIA GPUs. It is commonly used by hardware manufacturers and public cloud providers.
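
As a rough sketch of the Helm workflow mentioned above, the commands below add the public NGC Helm repository and install a chart into a Kubernetes cluster. The repository URL, release name, and chart name are assumptions for illustration; check the NGC catalog for the exact chart and version you need.

# Add the public NGC Helm repository (URL is an assumption; verify in the NGC catalog)
helm repo add ngc https://helm.ngc.nvidia.com/nvidia
helm repo update

# Install a chart from the repository (release and chart names are placeholders)
helm install my-release ngc/<chart name>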

NGC Catalog: Examples of Popular Containers

HPC SDK

NVIDIA HPC SDK is a suite of tools, libraries, and compilers for developing GPU-accelerated HPC applications.

HPC SDK includes compilers for Fortran, C, and C++ that support standard Fortran and C++ parallelism, CUDA, and OpenACC directives to accelerate HPC simulation and modeling applications on GPUs. GPU-accelerated libraries help you maximize the performance of common HPC algorithms, and optimized communications libraries support standards-based, scalable multi-GPU programming.

HPC SDK also offers tools to profile performance and debug HPC applications, helping to simplify their optimization and porting. Additional tools include containerization tools to help you deploy your HPC applications easily in the cloud or on-premises.
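
As a minimal sketch of how this looks in practice, the commands below pull the HPC SDK container from NGC and compile an OpenACC-annotated source file with the bundled compilers. The image path and tag are assumptions; check the NGC catalog for the current repository and version.

# Start an HPC SDK container with GPU access (image path and tag are assumptions)
docker run --gpus all -it --rm nvcr.io/nvidia/nvhpc:<tag>

# Inside the container, compile an OpenACC-annotated C++ source file with the
# bundled NVIDIA compilers (nvc, nvc++, and nvfortran are included)
nvc++ -acc -o saxpy saxpy.cpp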

Clara Discovery

NVIDIA Clara Discovery is a feature of the Clara healthcare platform offering a range of applications, frameworks, and AI models for drug discovery. It supports GPU-accelerated computation to assist drug research. Clara Discovery combines GPU acceleration and machine learning for medical use cases such as microscopy, genomics, clinical imaging, proteomics, computational chemistry, and more.

Automatic Speech Recognition (ASR)

Automatic Speech Recognition systems support a range of use cases, including providing voice commands to virtual assistants, converting the audio in a video to captions, and transcribing phone conversations into text (for example, for archiving purposes). The NGC catalog offers sophisticated speech-to-text deep learning models that can recognize and transcribe audio into text in real time. A successful model can tolerate a range of accents, perform well in noisy environments, and maintain a low word error rate (WER).

DeepStream SDK

NVIDIA DeepStream SDK is a comprehensive toolkit for streaming analytics. It provides AI-based image and video analysis and multi-sensor processing, allowing you to analyze large volumes of streaming sensor data from applications. DeepStream is part of NVIDIA’s Metropolis, a platform for creating end-to-end solutions and services to transform sensor data and pixels into actionable insights.

DeepStream SDK offers hardware-accelerated plugins, or building blocks, that incorporate complex transformation and pre-processing tasks into the stream processing pipeline. It lets you focus on building a core deep neural network (DNN) or a high-value IP without designing an end-to-end solution from scratch.
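
A minimal sketch of running the DeepStream container from NGC is shown below; the image tag and configuration file path are placeholders, and the exact reference application invocation may differ between releases.

# Start the DeepStream container with GPU access (tag is a placeholder)
docker run --gpus all -it --rm nvcr.io/nvidia/deepstream:<tag>

# Inside the container, run the reference application against a pipeline
# configuration file (path shown here is illustrative)
deepstream-app -c <path to pipeline config>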

How to Run an NGC Deep Learning Framework Container

Which Command Do You Use to Run the Container?

Before you can run an NGC DL container, you need to enable access to NVIDIA GPUs from your Docker environment. There are three options you can use, explained below.

Once you enable GPU support in Docker, container images run with access to all GPUs by default. You can limit a container to specific GPUs using the --gpus flag (Docker 19.03 and later) or the NV_GPU environment variable (earlier NVIDIA Docker versions).

  1. Native GPU support

Use this method if you are running Docker version 19.03 or later.

To enable GPU support in Docker, run the following command:

sudo apt-get install -y docker nvidia-container-toolkit

You can then run deep learning containers using the regular docker run command.
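
For example, the following commands are a quick sketch of verifying GPU access and then launching an NGC container; the image names and tags are placeholders.

# Confirm the container can see the GPUs (image tag is a placeholder)
docker run --gpus all --rm nvcr.io/nvidia/cuda:<tag> nvidia-smi

# Run an NGC deep learning container, limited to the first GPU
docker run --gpus "device=0" -it --rm nvcr.io/nvidia/<repository>:<tag>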

  2. NVIDIA Container Runtime for Docker

Use this method if you installed the nvidia-docker2 package (version 2.0 or later of NVIDIA Docker).

To enable GPU support, run your containers as follows:

docker run --runtime=nvidia <additional options> <image name>

  3. Docker Engine Utility for NVIDIA GPUs

Use this method if you installed the nvidia-docker package (version 1.0 of NVIDIA Docker).

To enable GPU support, run your containers as follows:

nvidia-docker run <options> <image name>
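
With this version, you can restrict the container to specific GPUs through the NV_GPU environment variable, as in the sketch below (the image name and tag are placeholders).

# Limit the container to GPUs 0 and 1 using the NV_GPU environment variable
NV_GPU=0,1 nvidia-docker run -it --rm nvcr.io/nvidia/<repository>:<tag>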

Important Options When Running NGC Containers

Consider using some or all of the following Docker flags when running an NGC deep learning container:

  • -u—avoids running as root, which creates security issues. This flag lets you run the container as a specific user defined in the host operating system.
  • --rm—removes the container from the system after it exits.
  • --mount—mounts a data volume so you can use the deep learning container with a data set. The volume can be a Docker volume, a network volume, or a directory on the host operating system.
  • --shm-size—increases the shared memory limit. PyTorch, the NVIDIA Optimized Deep Learning Framework, and other deep learning applications require shared memory. By default, Docker containers run with 64MB of shared memory, which may not be enough when running multiple GPUs on a DGX system.
  • --gpus—sets which GPUs the container can access. This flag is only available in Docker 19.03 and later; with earlier versions, you can set this option using the NV_GPU environment variable.

The following code example illustrates a few of the above options. This command gives the container access to GPUs 0 and 1, sets the --rm flag, and runs the container as the current user.

docker run --gpus "device=0,1" -ti --rm -u $(id -u):$(id -g) nvcr.io/nvidia/<repository>:<container version>
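
The example above does not show --mount or --shm-size. The sketch below adds both, assuming a hypothetical dataset directory on the host; adjust the source path and shared memory size for your system.

# /path/to/dataset is a hypothetical host directory; --shm-size raises the
# default 64MB shared memory limit
docker run --gpus all -it --rm \
  -u $(id -u):$(id -g) \
  --shm-size=1g \
  --mount type=bind,source=/path/to/dataset,target=/workspace/data \
  nvcr.io/nvidia/<repository>:<container version>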

GPU Virtualization with Run:AI

Run:AI automates resource management and workload orchestration for machine learning infrastructure. With Run:AI, you can automatically run as many compute-intensive experiments as needed on NVIDIA infrastructure.

Here are some of the capabilities you gain when using Run:AI:

  • Advanced visibility—create an efficient pipeline of resource sharing by pooling GPU compute resources.
  • No more bottlenecks—you can set up guaranteed quotas of GPU resources, to avoid bottlenecks and optimize billing.
  • A higher level of control—Run:AI enables you to dynamically change resource allocation, ensuring each job gets the resources it needs at any given time.

Run:AI simplifies machine learning infrastructure pipelines, helping data scientists accelerate their productivity and improve the quality of their models.

Learn more about the Run.ai GPU virtualization platform.