Run.ai

JupyterHub: A Practical Guide

Jupyter Notebook is an open source application, used by data scientists and machine learning professionals to author and present code, explanatory text, and visualizations. JupyterHub is an open source tool that lets you host a distributed Jupyter Notebook environment. 

With JupyterHub, users can log in to the server, and write Python code in a web browser, without having to install software on their local machine. The Jupyter Notebook and JupyterLab interface provided by JupyterHub is the same as the Jupyter interface running locally. JupyterHub supports web browsers, tablets and smartphones. 

JupyterHub should not be confused with cloud-based services for running Jupyter Notebooks—such as Google Colab, Microsoft Azure Notebooks, and Binder. Users or organizations looking for a managed, hosted Jupyter Notebook solution can leverage one of these services. JupyterHub is a do-it-yourself solution that lets you install and manage your own Jupyter Notebook server.

This is part of our series of articles about machine learning engineering.

Do You Need JupyterHub?

Jupyter Notebook provides a document specification and a graphical user interface for editing those documents. Here are several aspects to know about Jupyter Notebooks:

  • A Jupyter Notebook is a document file in the .ipynb format—composed of narrative text, code cells, and outputs. 
  • A Jupyter Notebook comes with a graphical user interface—which enables you to edit .ipynb documents.

Document editing is not exclusive to the Jupyter Notebook interface. You can also use alternatives like JupyterLab, Google Colab, nteract, and Kaggle.

JupyterLab provides a user interface designed for interactive computing. Here are several aspects to know about JupyterLab:

  • JupyterLab is a user interface—designed to provide extensible and flexible interactive computing. 
  • JupyterLab provides extensions—some of which are designed for Jupyter Notebooks. There are also extensions designed for specific parts of the data science pipeline.

JupyterHub provides an application designed for the management of Jupyter Notebooks. Here are several aspects to know about JupyterHub:

  • JupyterHub is an application—designed to help you manage multi-user sessions of interactive computing. 
  • JupyterHub provides connectivity—that enables you to connect users with the infrastructure required for their sessions.
  • JupyterHub enables remote access—to JupyterLab as well as Jupyter Notebooks. You can use this option to let multiple users gain remote access to Jupyter resources.

JupyterHub enables collaboration by providing a shared platform for data scientists and relevant stakeholders. You can use JupyterHub to create a data science workflow and deploy it on your infrastructure. This level of flexibility enables you to use the tools of your choice, including Jupyter Notebooks and a Python stack, and control access to resources and the environment. 

There is a wide range of applications for JupyterHub. It is used by large data centers providing computing resources to data scientists, major research labs, large universities serving data science students and researchers, companies with extensive data science operations, and online communities that promote collaborative data science and machine learning.

JupyterHub is usually used to enable collaboration between small and large teams:

  • Small teams—use JupyterHub to enable sharing interactive computing resources and analytics. Small teams include research labs, data science teams, or any collaborative project.
  • Large teams—use JupyterHub to provide multiple users with access to corporate resources like data, hardware, and analytics programs. Large teams include any large group of remote users, such as departments and large classes.

JupyterHub provides the following key capabilities:

  • Sets up a Jupyter Notebook or JupyterLab environment for up to tens of thousands of users—supports Kubernetes for large-scale deployments.
  • Supports many different languages, environments, and user interfaces, with a variety of Jupyter kernels developed by the community (see the list of available kernels). You can deliver one or more existing kernels to JupyterHub users, or develop your own.
  • Provides pluggable authentication, enabling flexible authentication for some or all users, using several authentication protocols including OAuth and GitHub.
  • Scales up by sharing the same server with multiple users, or running multiple isolated containers.
  • Can be deployed on any infrastructure, including public cloud providers, virtual machines, or locally on an on-premises laptop or server.

Related content: read our guide to machine learning infrastructure

Let’s dive a bit deeper and see how JupyterHub works behind the scenes.

The JupyterHub architecture is designed to give each user in a group their own Jupyter Notebook server. To achieve this, the architecture uses the following three main subsystems:

  • Hub—designed to manage user accounts and authentication. The hub uses a Spawner when coordinating single-user notebook servers.
  • Proxy—serves as the public-facing component. This proxy dynamically routes HTTP requests to single-user notebook servers and the Hub. 
  • Single-user notebook server—a Spawner starts a single-user notebook server when its user logs in.
(Architecture diagram source: JupyterHub documentation)

JupyterHub Tutorial 1: Installing JupyterHub on Local Server

To install JupyterHub, you need a system that meets the following requirements:

  • Linux/Unix with Python 3.5 or higher, plus Node.js and the npm package manager.
  • A Pluggable Authentication Module (PAM) to serve as the default authenticator, if one is not included in your OS distribution.
  • Jupyter Notebook 4 or higher.
  • A domain name.
  • An SSL/TLS certificate, to enable secure communication over HTTPS.

To access JupyterHub from a web browser, users can use either the server's domain name or its IP address.

The Hub and the proxy

The Hub is responsible for handling logins and spawning single-user notebook servers. When JupyterHub starts, the Hub launches the proxy according to the JupyterHub configuration. The proxy then forwards all requests to the Hub by default. Only the proxy is allowed to listen on a public interface. 

Types of authenticators

There are several authenticators available for controlling access to JupyterHub. PAM is the default authenticator. It uses the user accounts located on the same server that runs JupyterHub, so it requires an operating system account for each user. Other authenticators let users log in with single sign-on.
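For example, switching from PAM to GitHub single sign-on is done in jupyterhub_config.py. This is a sketch that assumes the community oauthenticator package is installed and a GitHub OAuth app has been registered; the callback URL, client ID, and client secret below are placeholders.

```python
# jupyterhub_config.py — replace the default PAM authenticator with GitHub OAuth
from oauthenticator.github import GitHubOAuthenticator

c.JupyterHub.authenticator_class = GitHubOAuthenticator

# Values from your registered GitHub OAuth application (placeholders)
c.GitHubOAuthenticator.oauth_callback_url = "https://example.com/hub/oauth_callback"
c.GitHubOAuthenticator.client_id = "your-client-id"
c.GitHubOAuthenticator.client_secret = "your-client-secret"
```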

Spawners

A spawner creates a notebook server for each user, and defines how that server is configured. By default, the spawner starts a server on the local machine, running as the logged-in user's system username. Alternatively, you can start each notebook server in a separate container, using a container engine like Docker or an orchestrator like Kubernetes.
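As a sketch of the container option, assuming the dockerspawner package is installed and a Docker daemon is available, jupyterhub_config.py can point the Hub at DockerSpawner:

```python
# jupyterhub_config.py — start each user's server in its own Docker container
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"

# Image used for every single-user server (a stock Jupyter image; adjust as needed)
c.DockerSpawner.image = "jupyter/base-notebook:latest"

# Containers must reach the Hub over the Docker network rather than localhost,
# so the Hub should listen on all interfaces
c.JupyterHub.hub_ip = "0.0.0.0"
```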

Install JupyterHub and test your installation.
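The exact commands depend on your packaging setup; the following sketch uses pip and npm, matching the official quickstart:

```shell
# Install JupyterHub and the configurable-http-proxy it uses to route traffic
python3 -m pip install jupyterhub
npm install -g configurable-http-proxy

# Install the notebook interfaces served to users
python3 -m pip install jupyterlab notebook

# Verify the installation by printing the version and help text
jupyterhub --version
jupyterhub -h
```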

Run the following command to start the Hub Server: jupyterhub

Visit http://localhost:8000 in your local browser and log in with your UNIX credentials.

Note that if you want to allow multiple users to log in to the Hub Server, you need to start JupyterHub with root privileges, as follows: sudo jupyterhub

JupyterHub Tutorial 2: Deploying JupyterHub on Kubernetes

For larger deployments, you can deploy JupyterHub via Kubernetes, the popular container orchestrator. The instructions and code below are abbreviated from the official Zero to JupyterHub with Kubernetes tutorial.

Related content: read our guide to Kubernetes architecture for machine learning

Start by preparing a configuration file called config.yaml. This includes several values used to configure the JupyterHub Helm chart. You can use this chart to deploy a working version of JupyterHub to Kubernetes.

You can keep the default Helm values, but one value is mandatory to set—secretToken, which is used as your security token. Generate a random 32-byte hex string and set it in the configuration as follows:

proxy:
  secretToken: "<your-random-hex-token>"
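One common way to generate such a token, assuming openssl is installed, is:

```shell
# Generate 32 random bytes, hex-encoded (prints 64 hex characters)
openssl rand -hex 32
```

Paste the resulting string into the secretToken value in config.yaml.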

Add the JupyterHub Helm chart repository to Helm, so you can install the chart without typing long URLs.
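The repository commands look like the following; the repository URL is the one published by the Zero to JupyterHub project:

```shell
# Register the JupyterHub chart repository under the name "jupyterhub"
helm repo add jupyterhub https://hub.jupyter.org/helm-chart/

# Refresh the local chart index
helm repo update
```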

Install the Helm charts specified in config.yaml. Run the helm upgrade command from the directory containing the configuration file. 

Within the command, specify a RELEASE—a Helm release name, used to distinguish between chart installations and a NAMESPACE—this is the Kubernetes namespace, used to group Kubernetes resources associated with JupyterHub. The official tutorial recommends using the value jhub for both.
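With both RELEASE and NAMESPACE set to jhub, the install command looks roughly like this (in practice, pin a specific chart version with --version):

```shell
# Install (or upgrade) the JupyterHub chart using the values in config.yaml
helm upgrade --cleanup-on-fail \
  --install jhub jupyterhub/jupyterhub \
  --namespace jhub \
  --create-namespace \
  --values config.yaml
```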

Wait for the hub and proxy pods to reach the Running state. You can check their state by running the command kubectl get pod --namespace jhub.

Once they are running, run the command kubectl get service --namespace jhub and note the external IP assigned to the proxy-public load balancer.

To use JupyterHub, enter the external IP into your browser. JupyterHub initially runs a dummy authenticator, so you can use any username and password to access it.

That’s it! You just ran JupyterHub on Kubernetes using the Zero to JupyterHub Helm chart.

JupyterHub is used to provision a Jupyter Notebook environment to multiple users. In most cases, you will want to customize the Jupyter Notebook user experience. Here are a few ways to achieve this.

In many cases, users will need to use additional packages together with Jupyter Notebook. To make these packages available, you'll typically install them system-wide or in a shared environment. 

Make sure that the installation location of any additional packages is the same as the location of jupyterhub-singleuser. This location should be readable and executable by the users. If you want to enable users to install their own packages, make the location writable as well.
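In practice, that means installing with the same Python environment that provides jupyterhub-singleuser. A sketch (the /opt/jupyterhub prefix is illustrative—use whatever prefix the first command reports):

```shell
# Find the environment that provides jupyterhub-singleuser
which jupyterhub-singleuser

# Install shared packages into that same environment's Python,
# so every user's notebook server can import them
sudo /opt/jupyterhub/bin/python3 -m pip install numpy pandas
```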

JupyterHub admins generally need to install and configure the same environment for all JupyterHub users. 

Both Jupyter and IPython support “system wide” configuration, which lets you define configuration in one place for all users. It is a best practice to only use system-wide configuration, and avoid placing configuration files in user home directories.

In most cases, the system-wide configuration is located in the /etc/{jupyter|ipython} folder. Environment-wide configuration is located in {sys.prefix}/etc/{jupyter|ipython}.

For example, here is how to enable a specific Jupyter Notebook configuration setting for all users, by setting it in the system-wide /etc/jupyter/jupyter_notebook_config.py file:
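A minimal sketch, using one real Notebook option as the "specific setting"—hiding the Quit button so users stop servers through JupyterHub instead of from the notebook UI:

```python
# /etc/jupyter/jupyter_notebook_config.py
# Applies to every user's notebook server on this host.
# The `c` config object is provided by Jupyter's configuration loader.
c = get_config()

# Hide the "Quit" button in the notebook interface for all users
c.NotebookApp.quit_button = False
```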

Note that system-wide configuration can be slightly different depending on how you deploy user environments—a shared system with multi-user hosts, or a container-based system with an isolated environment for each user. 

By default, JupyterHub runs one server per user. However, if necessary, you can enable multiple servers per user. This is useful in deployments where users are allowed to start servers by requesting resources in a cloud or HPC environment. 

You can let users run multiple named Jupyter servers simultaneously by setting the following option in jupyterhub_config.py: 

c.JupyterHub.allow_named_servers = True

When running JupyterHub to serve a large group of data science users, you also need to maintain a machine learning infrastructure, enabling them to run experiments in an efficient and timely manner. 

Run:AI automates resource management and orchestration for machine learning infrastructure. With Run:AI, you can automatically run as many compute intensive experiments as needed. 

Here are some of the capabilities you gain when using Run:AI: 

  • Advanced visibility—create an efficient pipeline of resource sharing by pooling GPU compute resources.
  • No more bottlenecks—you can set up guaranteed quotas of GPU resources, to avoid bottlenecks and optimize billing.
  • A higher level of control—Run:AI enables you to dynamically change resource allocation, ensuring each job gets the resources it needs at any given time.

Run:AI simplifies machine learning infrastructure pipelines, helping data scientists accelerate their productivity and the quality of their models. 

Learn more about the Run.ai GPU virtualization platform.
