Kubeflow Pipelines

The Basics and a Quick Tutorial

What are Kubeflow Pipelines?

Kubeflow Pipelines is a platform designed to help you build and deploy container-based machine learning (ML) workflows that are portable and scalable. Each pipeline represents an ML workflow, and includes the specifications of all inputs needed to run the pipeline, as well as the outputs of all components.

This is part of our series of articles about Kubernetes architecture.


Common Kubeflow Use Cases

Here are three common use cases for implementing Kubeflow Pipelines.

Deploying Models to Production

Trained models are usually compiled into a single file that sits on a server host or laptop. Next, you copy the file to a machine hosting the application, and load the model into a server process that accepts network requests for model inference.

This process becomes complex when there are multiple applications requiring model inference output from a single model, especially when you need to deploy updates and initiate rollbacks.

Kubeflow lets you run updates and rollbacks across multiple applications or servers. You can update your model in one place, and ensure all client applications quickly get the updates, once the update transaction is complete.

Shared Multi-Tenant ML Environment

Machine learning environments and resources often need to be shared. To enable simple and effective sharing, you need a multi-tenant machine learning environment. You can create one with Kubeflow Pipelines.

You should aim to provide each collaborator with an isolated environment. Kubernetes, which schedules and manages containers, can help you isolate workflows and keep track of pending and running jobs for each collaborator.

Running Jupyter Notebooks on GPUs

ML algorithms need a lot of compute power to run through linear algebra operations quickly. Graphics processing units (GPUs) can meet this demand, but are not usually found in regular laptops and desktops.

To gain access to GPUs, data scientists often use Jupyter Notebooks together with Python code, managing dependencies through container platforms like Docker. However, this process often creates security issues, because data ends up distributed across unauthorized platforms and services.

Kubeflow Pipelines, on the other hand, enables data scientists to build their workflows into containers and execute them in an environment authorized by the security team.
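For example, with the Kubeflow Pipelines Python SDK (described in the architecture section below), a containerized pipeline step can request a GPU directly from the Kubernetes scheduler. The following is a minimal sketch using the v1-style kfp SDK; the image, command, and GPU count are illustrative placeholders, not part of any official sample:

    # Minimal sketch (kfp v1-style SDK): a containerized training step that
    # requests a GPU. The image and command are hypothetical placeholders.
    import kfp
    from kfp import dsl

    @dsl.pipeline(
        name="gpu-training-step",
        description="Runs a containerized training script on a GPU node.",
    )
    def gpu_pipeline():
        train = dsl.ContainerOp(
            name="train",
            image="gcr.io/my-project/train:latest",  # hypothetical image
            command=["python", "train.py"],
        )
        # Ask the Kubernetes scheduler for one NVIDIA GPU for this step.
        train.set_gpu_limit(1)

    if __name__ == "__main__":
        # Compile to a static configuration that can be uploaded to Kubeflow.
        kfp.compiler.Compiler().compile(gpu_pipeline, "gpu_pipeline.yaml")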

Kubeflow Pipelines Architecture

The following diagram illustrates how the Kubeflow Pipelines platform is structured.


Image Source: Kubeflow

The Kubeflow architecture is composed of the following main components and elements:

  • Python SDK—lets you use the Kubeflow Pipelines domain-specific language (DSL) to build a component or specify a pipeline (see the sketch after this list).
  • DSL compiler—converts the pipeline’s Python code into a static configuration in a YAML file.
  • Pipeline Service—creates a pipeline run from the static configuration.
  • Kubernetes resources—the Pipeline Service calls the Kubernetes API server to create the custom resource definitions (CRDs) required to run the pipeline.
  • Artifact storage—pods store metadata, such as pipeline runs, single scalar metrics, jobs, and experiments. They also hold artifacts, which may include large-scale (time series) metrics, views, and pipeline packages. Kubeflow stores metadata in a MySQL database and artifacts in an artifact store. You can use the metadata for sorting and filtering runs, and the large-scale metrics for investigating the performance of a specific run or debugging a pipeline run.
  • Orchestration controllers—controllers, such as the Argo Workflow controller for task-driven workflows, launch the containers required to complete the pipeline. These containers run in Kubernetes pods on virtual machines.
  • Persistence Agent and ML metadata—the Persistence Agent watches the Kubernetes resources created by the Pipeline Service and persists their state in the ML Metadata Service. It also records the sets of containers that executed, along with their inputs and outputs. I/O can be either data artifact URIs or container parameters.
  • Pipeline web server—after gathering data from multiple services, the pipeline’s web server can display live information about pipelines, such as running pipelines, pipeline run execution status, pipeline execution history, an artifacts list, and pipeline run debugging data.
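To make the Python SDK and DSL compiler concrete, here is a minimal sketch of defining a small pipeline and compiling it into the static YAML configuration that the Pipeline Service consumes. It assumes the v1-style kfp SDK; the pipeline name, images, and output file are illustrative placeholders:

    # Minimal sketch (kfp v1-style SDK): define a two-step pipeline in the DSL
    # and compile it into a static YAML configuration.
    import kfp
    from kfp import dsl

    @dsl.pipeline(
        name="echo-parallel",
        description="Two independent echo steps that can run in parallel.",
    )
    def echo_pipeline(message: str = "hello"):
        # Each step runs as a container in its own Kubernetes pod.
        dsl.ContainerOp(
            name="echo-a",
            image="alpine:3.18",
            command=["echo"],
            arguments=[message],
        )
        dsl.ContainerOp(
            name="echo-b",
            image="alpine:3.18",
            command=["echo"],
            arguments=[message],
        )

    if __name__ == "__main__":
        # The DSL compiler produces the static configuration described above.
        kfp.compiler.Compiler().compile(echo_pipeline, "echo_pipeline.yaml")

Running this script produces echo_pipeline.yaml, which can be uploaded through the pipelines web UI or submitted with the SDK client shown later in the tutorial.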

Tutorial: Getting started with Kubeflow Pipelines

This quick walkthrough can help you learn how to get started with Kubeflow Pipelines. This process uses a sample that comes with Kubeflow Pipelines.

Step 1:

Deploy Kubeflow on GCP.

Step 2:

Once Kubeflow is running, you need to access the Kubeflow UI.

To access the UI, use the following URL, replacing <KF_NAME> with the name of your Kubeflow deployment and <project-id> with your GCP project ID:

https://<KF_NAME>.endpoints.<project-id>.cloud.goog/

Once you access the UI, you should see this dashboard:

Image Source: Kubeflow

Step 3:

Choose Pipelines.

Image Source: Kubeflow

Run a Basic Pipeline

In the pipelines UI, you can find several samples to use as a baseline for quickly launching pipelines. The walkthrough below explains how to run a basic sample that includes Python operations but no ML workload.

Step 1:

In the pipeline UI, locate a sample and choose its name. For example, [Sample] Basic—Parallel Execution.

Step 2:

Choose the Create experiment option.

Step 3:

You will be shown a series of prompts. Follow the instructions to create an experiment. When you complete the process, you can create a run.

Note that each sample provides default values for all required parameters. The following screenshots assume you have already created an experiment and named it “My experiment”. From this step forward, you are creating a run named “My first run”.

Image Source: Kubeflow

Step 4:

To create your run, choose the Start option.

Step 5:

Choose the name of the run under Experiments.

Step 6:

You can now view information about the run and drill down into elements of the compute graph.

Image Source: Kubeflow
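If you prefer to work from code instead of the UI, the experiment and run from the steps above can also be created with the SDK client. This is a hedged sketch that assumes a compiled pipeline package such as the echo_pipeline.yaml produced earlier; the host URL is the endpoint placeholder from Step 2 of the previous section, and authentication details (which depend on your deployment) are omitted:

    # Sketch: create an experiment and launch a run with the kfp SDK client.
    # The host, names, and package path are placeholders; authentication
    # arguments for an IAP-protected GCP deployment are omitted here.
    import kfp

    client = kfp.Client(host="https://<KF_NAME>.endpoints.<project-id>.cloud.goog/pipeline")
    experiment = client.create_experiment(name="My experiment")
    run = client.run_pipeline(
        experiment_id=experiment.id,
        job_name="My first run",
        pipeline_package_path="echo_pipeline.yaml",
        params={"message": "hello"},
    )
    print("Started run:", run.id)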

Run an ML Pipeline

The walkthrough below explains how to run the XGBoost sample; its source code is in the Kubeflow Pipelines repo.

Prerequisites:

Create GCP services for the sample.

Step 1:

Enable the standard GCP APIs for Kubeflow, as well as the APIs for Cloud Storage and Dataproc.

Step 2:

To store pipeline results, create a bucket in Google Cloud Storage.
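If you would rather create the bucket from Python than from the Cloud Console or gsutil, a minimal sketch using the google-cloud-storage client library might look like this; the project ID, bucket name, and location are placeholders:

    # Sketch: create a Cloud Storage bucket to hold pipeline results using the
    # google-cloud-storage client library. All values below are placeholders.
    from google.cloud import storage

    client = storage.Client(project="my-gcp-project")
    bucket = client.create_bucket("my-kfp-results-bucket", location="us-central1")
    print("Created bucket:", bucket.name)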

Step 3:

In the pipeline UI, choose the name of the sample: [Sample] ML - XGBoost - Training with Confusion Matrix.

Image Source: Kubeflow

Step 4:

Choose the Create experiment option.

Step 5:

You will be shown a series of prompts. Follow the instructions to create an experiment, and include the following run parameters:

  • Output—here you need to specify the Cloud Storage bucket you have previously created.
  • Project—here you need to specify the ID of your GCP project.

Step 6:

Choose the Start option to create your run.
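Steps 3 through 6 can also be performed from the SDK client. Below is a hedged sketch that looks up the preinstalled sample by name and starts a run with the two parameters described above; the bucket, project ID, and experiment/run names are placeholders, and the exact parameter names should be verified against the sample’s run form in the UI:

    # Sketch: launch the preinstalled XGBoost sample from the kfp SDK client.
    # Bucket, project, and names are placeholders; check the parameter names
    # against the sample's run form before submitting.
    import kfp

    client = kfp.Client(host="https://<KF_NAME>.endpoints.<project-id>.cloud.goog/pipeline")
    pipeline_id = client.get_pipeline_id(
        "[Sample] ML - XGBoost - Training with Confusion Matrix"
    )
    experiment = client.create_experiment(name="xgboost-sample")
    run = client.run_pipeline(
        experiment_id=experiment.id,
        job_name="xgboost-sample-run",
        pipeline_id=pipeline_id,
        params={
            "output": "gs://my-kfp-results-bucket",  # bucket created in Step 2
            "project": "my-gcp-project",             # your GCP project ID
        },
    )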

Step 7:

In the experiments dashboard, choose the name of your run.

Step 8:

You can now explore the graph and other aspects of the run by clicking on different components of the graph and UI. Here is how your pipeline should look once the run is complete:

Image Source: Kubeflow

Automate Kubernetes Job Scheduling with Run:AI

Run:AI’s Scheduler is a simple plug-in for Kubernetes clusters that enables optimized, high-performance orchestration of containerized AI workloads. The Run:AI platform includes:

  • High-performance for scale-up infrastructures—pool resources and enable large workloads that require considerable resources to coexist efficiently with small workloads requiring fewer resources.
  • Batch scheduling—workloads can start, pause, restart, end, and then shut down, all without any manual intervention. Plus, when the container terminates, the resources are released and can be allocated to other workloads for greater system efficiency.
  • Topology awareness—inter-resource and inter-node communication enable consistent high performance of containerized workloads.
  • Gang scheduling—containers can be launched together, start together, and end together for distributed workloads that need considerable resources.

Run:AI simplifies Kubernetes scheduling for AI and HPC workloads, helping researchers accelerate their productivity and the quality of their work.

Learn more about the Run:AI Kubernetes Scheduler, or explore Kubernetes vs Slurm schedulers.