Container Orchestration

A Guide

What is Container Orchestration?

Containers have become widely adopted over the past decade. A lightweight alternative to virtual machines, they make it possible to package software and its dependencies in an isolated unit, which can be easily deployed in any environment. Containers are one of the foundational technologies used to build cloud native applications.

Companies that need to deploy and manage hundreds of Linux containers and hosts can benefit from container orchestration. Container orchestration can automatically deploy, manage, scale, and set up networking for large numbers of containers. Popular container orchestrators include Kubernetes, Docker Swarm, and OpenShift.

Container orchestration makes it possible to deploy applications across multiple environments without having to redesign or refactor them. Orchestrators can also be used to deploy applications in a microservices architecture, in which software is broken up into small, self-sufficient services, each developed and delivered through efficient CI/CD pipelines.

In this article, you will learn:

  • What problems does container orchestration solve
  • AWS container orchestration
  • Azure container orchestration
  • OpenShift container orchestration
  • Container orchestration on NVIDIA GPUs
  • Automating Kubernetes for machine learning with Run:AI

What Problems Does Container Orchestration Solve?

Scaling containers across an organization, while ensuring efficient utilization of computing resources, can be very challenging without automation.

Capabilities of container engines

Container engines like Docker provide CLI commands for operations like pulling a container image from a repository, creating a container, and starting or stopping one or more containers. These commands are effective for managing containers on a few hosts, but they do not address the full lifecycle of containerized applications.

Additional requirements at large scale

Here are some of the challenges that need to be addressed in larger-scale containerized applications:

  • Automatically deploying specific quantities of containers to a set of host machines
  • Identifying which hosts are underutilized and can accept more containers, and which are overutilized, meaning existing containers don't have sufficient resources
  • Updating and rolling back applications running in multiple containers across different physical locations
  • Load balancing application traffic between multiple containers or groups of containers
  • Providing a central user interface for managing container workloads
  • Defining networking for containers
  • Ensuring security best practices across large numbers of containers

How container orchestrators help

Container orchestrators automate all of the above activities, using a declarative approach. You define a “desired state” of your containerized application, typically using a configuration file, and the orchestrator constantly works to achieve that desired state, given the available resources.
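
For example, in Kubernetes the desired state is usually written as a YAML manifest. The following is a minimal, hypothetical Deployment (the application name, image, and replica count are placeholders) that asks the orchestrator to keep three copies of a web service running and to roll out updates gradually:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web-app                        # hypothetical application name
    spec:
      replicas: 3                          # desired state: keep 3 containers running
      selector:
        matchLabels:
          app: web-app
      strategy:
        type: RollingUpdate                # replace containers gradually during updates
      template:
        metadata:
          labels:
            app: web-app
        spec:
          containers:
            - name: web-app
              image: example/web-app:1.0   # placeholder image
              resources:
                requests:
                  cpu: 250m                # helps the scheduler pick a host with spare capacity
                  memory: 256Mi

If a container crashes or a host fails, the orchestrator detects the gap between the actual and desired state and starts replacement containers on the remaining hosts.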

Orchestrators can do the following automatically:

  • Manage the full container lifecycle
  • Scale containers and the underlying infrastructure
  • Manage service discovery and container networking
  • Implement security controls in a consistent way
  • Monitor container health and handle fault tolerance
  • Load balance traffic between containers
  • Manage optimal resource utilization of container hosts

AWS Container Orchestration

AWS built its own container orchestration platform, known as Amazon Elastic Container Service (ECS). ECS integrates containers seamlessly with other AWS services and is compatible with Docker. It allows you to run container-based applications on EC2 instances. ECS is fully managed and easy to use, and you don't need any additional orchestration software if you are already using AWS.

Another option for container orchestration on AWS is the Elastic Kubernetes Service (EKS), which lets you run Kubernetes workloads on managed clusters. AWS fully manages the Kubernetes control plane, and assists with tasks like autoscaling, updating, networking and security.
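
As a rough illustration, EKS clusters are often defined declaratively with a tool such as eksctl. The sketch below is a hypothetical cluster configuration; the cluster name, region, instance type, and node counts are placeholder assumptions:

    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: demo-cluster          # hypothetical cluster name
      region: us-east-1           # placeholder AWS region
    managedNodeGroups:
      - name: workers
        instanceType: m5.large    # placeholder EC2 instance type
        desiredCapacity: 3        # EKS provisions and manages these worker nodes
        minSize: 2
        maxSize: 6                # lets the node group scale with demand

Assuming eksctl is installed, running eksctl create cluster -f cluster.yaml with a file like this would create the cluster and its managed node group.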

Azure Container Orchestration

Azure Kubernetes Service (AKS) is a container orchestration solution available on Microsoft Azure. It is a managed service based on Kubernetes, which you can use to deploy, manage and scale Docker containers and containerized applications across a cluster of hosts on the Azure public cloud.

Microsoft manages the Kubernetes control plane for you, so you don't have to handle version upgrades yourself. You can choose when to upgrade Kubernetes in your AKS cluster to minimize disruption to your workloads.

AKS can automatically add nodes to or remove nodes from a cluster in response to fluctuations in demand. You can also leverage node pools, including nodes with graphics processing units (GPUs) or other specialized hardware, to boost your processing power. This is important for workloads that require extensive computing resources.
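
For example, once a GPU-enabled node pool has been added to an AKS cluster, a workload can be steered onto it with a standard Kubernetes node selector. The sketch below assumes a node pool named gpupool (AKS labels each node with the name of its pool under the agentpool label) and uses a placeholder image:

    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-workload                  # hypothetical workload name
    spec:
      nodeSelector:
        agentpool: gpupool                # assumed name of a GPU-enabled AKS node pool
      containers:
        - name: trainer
          image: example/trainer:latest   # placeholder image

The GPU itself would also be requested through the device plugin resource, which is covered in the NVIDIA section below.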

OpenShift Container Orchestration

OpenShift, created by Red Hat, is a container orchestration platform that can run containers in on-premises or hybrid cloud environments. Internally, OpenShift is based on Kubernetes and shares many of the same components.

However, there are many differences between the two platforms. OpenShift adds many components and capabilities not included in plain Kubernetes, including an Istio-based service mesh, Prometheus-based monitoring, and the Red Hat Quay container registry.

OpenShift uses the concept of build artifacts, and enables these artifacts to run as first-class resources in Kubernetes. It is tightly integrated with Red Hat Enterprise Linux (RHEL), an operating system distribution used by many large enterprise deployments.
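
As a rough sketch of the build-as-a-resource idea, OpenShift lets you define a BuildConfig that turns source code into a container image inside the cluster. The repository URL, builder image, and names below are hypothetical placeholders:

    apiVersion: build.openshift.io/v1
    kind: BuildConfig
    metadata:
      name: my-app                                     # hypothetical build name
    spec:
      source:
        git:
          uri: https://github.com/example/my-app.git   # placeholder source repository
      strategy:
        sourceStrategy:                                # source-to-image (S2I) build
          from:
            kind: ImageStreamTag
            name: python:3.9                           # assumed builder image stream
      output:
        to:
          kind: ImageStreamTag
          name: my-app:latest                          # resulting image artifact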

Container Orchestration on NVIDIA GPUs

Kubernetes can run on NVIDIA GPUs, allowing the container orchestration platform to leverage GPU acceleration. The NVIDIA device plugin enables GPU support in Kubernetes, so developers can schedule GPU resources to build and deploy applications on multi-cloud clusters.
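
Once the device plugin is installed, GPUs show up in Kubernetes as a schedulable resource named nvidia.com/gpu. The sketch below is a minimal, hypothetical pod that requests one GPU and runs nvidia-smi to verify access (the pod name and CUDA image tag are placeholders):

    apiVersion: v1
    kind: Pod
    metadata:
      name: cuda-smoke-test                            # hypothetical pod name
    spec:
      restartPolicy: OnFailure
      containers:
        - name: cuda
          image: nvidia/cuda:12.2.0-base-ubuntu22.04   # placeholder CUDA base image
          command: ["nvidia-smi"]                      # lists the GPUs visible to the container
          resources:
            limits:
              nvidia.com/gpu: 1                        # the scheduler places this pod on a GPU node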

Kubernetes has become increasingly important for developing and scaling machine learning and deep learning algorithms. Even if you are not a trained data scientist, containers can help simplify the management and deployment of models. Containers allow you to package a model, making it easy to transfer between environments, so you don't have to rebuild a model from scratch every time, which can be complex and time-consuming.

GPUs are more difficult to virtualize than CPUs, but they allow developers to process large data sets simultaneously across heterogeneous environments, including cloud deployments and distributed networks. They can accelerate the development of data-heavy systems such as conversational AI.

NVIDIA DGX systems support container orchestration with multiple open-source container runtimes, such as containerd, CRI-O, and Docker. GPU metrics can be monitored via a monitoring stack that integrates NVIDIA DCGM with Prometheus and Grafana. When scheduling workloads, you can specify attributes such as GPU memory requirements and GPU type.
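
For instance, if the NVIDIA GPU Feature Discovery component is deployed alongside the device plugin, nodes are labeled with properties of their GPUs, and those labels can be used to pin a workload to a particular GPU type. The label value and image below are assumptions for illustration:

    apiVersion: v1
    kind: Pod
    metadata:
      name: training-job                                # hypothetical pod name
    spec:
      nodeSelector:
        nvidia.com/gpu.product: NVIDIA-A100-SXM4-40GB   # assumed label set by GPU Feature Discovery
      containers:
        - name: trainer
          image: example/trainer:latest                 # placeholder image
          resources:
            limits:
              nvidia.com/gpu: 2                         # request two GPUs on the selected node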

Container orchestration on NVIDIA GPUs is supported by a number of toolkits, which are continuously being developed. With the NVIDIA Container Toolkit, you can:

  • Build, deploy, orchestrate and monitor GPU-accelerated Docker containers
  • Automate container configuration using the container runtime library
  • Utilize Jetson edge devices running the same CUDA-X stack — for example, images pulled from NVIDIA GPU Cloud can be optimized for JetPack.

NVIDIA also offers a transfer learning toolkit that distributes pre-trained models for AI operations such as conversational AI and computer vision using Docker containers. Transfer learning allows you to transfer an existing neural network capability to a new model. Developers can use the NVIDIA GPU Cloud registry to access existing models packaged in containers.

A key element of managing machine learning workloads on orchestrators is scheduling. Read our guide to Kubernetes scheduling.

Automating Kubernetes for Machine Learning with Run:AI

Run:AI’s Scheduler is a simple plug-in for Kubernetes clusters that adds optimized, high-performance orchestration to your containerized AI workloads. The Run:AI platform includes:

  • High performance for scale-up infrastructures—pool resources and enable large workloads that require considerable resources to coexist efficiently with small workloads requiring fewer resources.
  • Batch scheduling—workloads can start, pause, restart, end, and then shut down, all without any manual intervention. Plus, when the container terminates, the resources are released and can be allocated to other workloads for greater system efficiency.
  • Topology awareness—inter-resource and inter-node communication enable consistent high performance of containerized workloads.
  • Gang scheduling—containers can be launched together, start together, and end together for distributed workloads that need considerable resources.
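
As a rough, hypothetical sketch of how a workload opts into a custom scheduler: Kubernetes lets each pod name the scheduler that should place it through the schedulerName field. Assuming the Run:AI scheduler is installed in the cluster and registered under the name runai-scheduler (an assumption for illustration), a GPU training pod might look like this:

    apiVersion: v1
    kind: Pod
    metadata:
      name: distributed-training          # hypothetical workload name
    spec:
      schedulerName: runai-scheduler      # assumed scheduler name; hands placement to Run:AI instead of the default scheduler
      containers:
        - name: trainer
          image: example/trainer:latest   # placeholder training image
          resources:
            limits:
              nvidia.com/gpu: 4           # GPUs requested for this workload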

Run:AI simplifies Kubernetes scheduling for AI and HPC workloads, helping researchers accelerate their productivity and the quality of their work.

Learn more about the Run:AI Kubernetes Scheduler