NVIDIA NGC is a repository of containerized applications you can use in deep learning, machine learning, and high performance computing (HPC) projects. The applications are optimized to run on NVIDIA GPU hardware, and the catalog also includes pre-trained models and Helm charts that let you deploy applications seamlessly in Kubernetes clusters.
The NGC catalog includes over 100 containers, as well as Helm charts, models, software development kits (SDKs), and other resources. Containers and supporting resources are organized into collections representing specific use cases. You can browse the full catalog on the NGC website.
This is part of our series of articles about NVIDIA A100.
NVIDIA NGC is a catalog of software optimized for GPUs, which you can deploy anywhere—in an on-premises data center, in a public cloud, or on edge devices. NGC containers allow you to run data science projects “out of the box” without installing, configuring, or integrating the infrastructure.
NGC also empowers system administrators and IT teams. Using these containers, IT staff can provide the resources researchers need without having to install, update, and maintain complex infrastructure. NGC containers are pre-tested and tuned to run on GPU hardware, and are regularly updated to remain compatible with the latest hardware and compiler versions.
Related content: Read our guide to NVIDIA deep learning GPUs
NGC provides the following key features:
NVIDIA HPC SDK is a suite of tools, libraries, and compilers for developing GPU-accelerated HPC applications.
HPC SDK includes compilers for Fortran, C, and C++ that support standard Fortran and C++ parallelism, CUDA, and OpenACC directives to accelerate HPC simulation and modeling applications on GPUs. GPU-accelerated libraries help you maximize the performance of common HPC algorithms, and optimized communications libraries implement the standards used for scalable multi-GPU and multi-node programming.
HPC SDK also offers tools to profile performance and debug HPC applications, helping to simplify their optimization and porting. Additional tools include containerization tools to help you deploy your HPC applications easily in the cloud or on-premises.
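To make this concrete, here is a hedged sketch of typical HPC SDK compiler invocations. nvc++ and nvfortran are the SDK's C++ and Fortran compilers, -acc enables OpenACC offload, and -stdpar=gpu enables GPU offload of standard C++ parallelism; the source file names are placeholders, and the commands are only printed here since the SDK may not be installed on the machine running this sketch:

```shell
# Hedged sketch: typical NVIDIA HPC SDK compile lines. nvc++/nvfortran,
# -acc (OpenACC), and -stdpar=gpu (ISO C++ standard parallelism) are real
# SDK options; saxpy.cpp, jacobi.f90, and solver.cpp are placeholder names.
# The commands are printed rather than executed.
for cmd in \
    "nvc++ -acc -O2 saxpy.cpp -o saxpy" \
    "nvfortran -acc -O2 jacobi.f90 -o jacobi" \
    "nvc++ -stdpar=gpu -O2 solver.cpp -o solver"
do
    echo "$cmd"
done
```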
NVIDIA Clara Discovery is a feature of the Clara healthcare platform offering a range of applications, frameworks, and AI models for drug discovery. It supports GPU-accelerated computation to assist drug research. Clara Discovery combines GPU acceleration and machine learning for medical use cases such as microscopy, genomics, clinical imaging, proteomics, and computational chemistry.
Automatic Speech Recognition (ASR) systems support a range of use cases, including voice commands for virtual assistants, converting the audio in a video into captions, and transcribing phone conversations into text (for example, for archiving purposes). ASR relies on sophisticated speech-to-text deep learning models that can recognize and transcribe audio into text in real time. A successful model tolerates a range of accents and performs well in noisy environments, with a low word error rate (WER).
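For reference, WER is computed by aligning the model's transcript against a reference transcript and counting the word-level substitutions S, deletions D, and insertions I relative to the N words of the reference:

```latex
\mathrm{WER} = \frac{S + D + I}{N}
```

A lower WER indicates a more accurate transcription; values can exceed 1.0 when the model inserts many spurious words.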
NVIDIA DeepStream SDK is a comprehensive toolkit for streaming analytics. It provides AI-based image and video analysis and multi-sensor processing, allowing you to analyze large volumes of streaming sensor data from applications. DeepStream is part of NVIDIA’s Metropolis, a platform for creating end-to-end solutions and services to transform sensor data and pixels into actionable insights.
DeepStream SDK offers hardware-accelerated plugins, or building blocks, that incorporate complex transformation and pre-processing tasks into the stream processing pipeline. It lets you focus on building a core deep neural network (DNN) or a high-value IP without designing an end-to-end solution from scratch.
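As a rough illustration of how those plugins compose into a stream processing pipeline, here is a hedged sketch of a DeepStream-style GStreamer pipeline. nvstreammux (stream batching), nvinfer (TensorRT-based inference), and nvdsosd (on-screen display) are DeepStream plugin names; the file and config paths are placeholders, and a real pipeline needs additional element properties. The pipeline is only printed here, since DeepStream may not be installed:

```shell
# Hedged sketch of a DeepStream-style GStreamer pipeline. nvstreammux,
# nvinfer, and nvdsosd are DeepStream plugins; <video file> and
# <model config> are placeholders. Printed rather than executed, since
# DeepStream may not be installed on this machine.
pipeline="gst-launch-1.0 filesrc location=<video file> ! decodebin ! \
nvstreammux name=mux ! nvinfer config-file-path=<model config> ! \
nvdsosd ! fakesink"
echo "$pipeline"
```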
Before you can run an NGC DL container, you need to enable access to NVIDIA GPUs from your Docker environment. There are three options you can use, explained below.
Once you enable GPU support in Docker, container images run with access to all GPUs by default. You can limit a container to specific GPUs using the NV_GPU environment variable (with the nvidia-docker wrapper) or the --gpus flag (with native Docker 19.03 and later).
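For example, with the nvidia-docker wrapper you would prefix the run command with NV_GPU. The image name below is a placeholder, and the command is only printed rather than executed, since it requires a GPU-equipped Docker host:

```shell
# Hedged sketch: restrict a container to GPUs 0 and 1. NV_GPU is read by
# the nvidia-docker wrapper (not by plain docker); <image name> is a
# placeholder. Printed rather than run, since no GPU host is assumed.
NV_GPU="0,1"
echo "NV_GPU=${NV_GPU} nvidia-docker run --rm <image name>"
```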
Use this method if you are running Docker version 19.03 or later.
To enable GPU support in Docker, run the following command:
sudo apt-get install -y docker nvidia-container-toolkit
You can then run deep learning containers using the regular docker run command.
Use this method if you installed the nvidia-docker2 package (version 2.0 or later of NVIDIA Docker).
To enable GPU support, run your containers as follows:
docker run --runtime=nvidia <additional options> <image name>
Use this method if you installed the nvidia-docker package (version 1.0 of NVIDIA Docker).
To enable GPU support, run your containers as follows:
nvidia-docker run <options> <image name>
Consider using some or all of the following Docker flags when running an NGC deep learning container: --gpus to select which GPUs the container can access, -it (or -ti) to run interactively with a terminal attached, --rm to remove the container when it exits, and -u to set the UID:GID the container runs as.
The following code example illustrates a few of these options. This command gives the container access to GPUs 0 and 1, runs it interactively with a terminal attached, removes the container when it exits, and runs it as the current user rather than root.
docker run --gpus "device=0,1" -ti --rm -u $(id -u):$(id -g) nvcr.io/nvidia/<repository>:<container version>
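The -u $(id -u):$(id -g) portion of the command above is worth unpacking: id -u and id -g are standard POSIX utilities that print the invoking user's numeric UID and GID, so files the container writes to mounted volumes end up owned by you rather than by root. The snippet below shows just that substitution:

```shell
# id -u and id -g print the current user's numeric UID and GID; docker's
# -u flag accepts a UID:GID pair, so the container process runs as the
# invoking host user instead of root.
uid="$(id -u)"
gid="$(id -g)"
echo "container will run as ${uid}:${gid}"
```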
Run:AI automates resource management and workload orchestration for machine learning infrastructure. With Run:AI, you can automatically run as many compute-intensive experiments as needed on NVIDIA infrastructure.
Here are some of the capabilities you gain when using Run:AI:
Run:AI simplifies machine learning infrastructure pipelines, helping data scientists accelerate their productivity and the quality of their models.
Learn more about the Run:AI GPU virtualization platform.