Jupyter Notebook GPU: Running on GPU Fractions with JupyterHub

Jupyter notebooks are heavily used in the data science community, especially for developing and debugging machine learning and deep learning workloads on GPUs. On many occasions, data scientists and researchers use notebooks to experiment with data and to prototype and build their models before optimizing them in long training sessions. These interactive sessions often need only a small amount of GPU power rather than a full GPU. In such cases, sharing GPUs between multiple notebooks can be highly desirable in terms of infrastructure and cost efficiency. Run:AI Fractional GPU technology allows researchers to share GPUs and run Jupyter notebooks on fractions of GPUs.

JupyterHub brings the power of notebooks to groups of users. It gives users access to computational environments and resources without burdening them with installation and maintenance tasks. Users (including data scientists, researchers, and students) can get their work done in their own workspaces on shared resources.

In what follows, I show how JupyterHub can be integrated into the Run:AI platform to provide researchers with a way to run their Jupyter notebooks on fractions of GPU and provide cluster administrators with a simple control plane to manage and monitor access to compute resources. Before this tutorial, you might want to check out our full guide to JupyterHub and the initial setup tutorials there.

Setting up the environment

First, I created a Run:AI cluster in GCP. I spun up an instance with 2 Tesla K80 GPUs, and installed Kubernetes and then the Run:AI software. A detailed guide for Kubernetes and Run:AI installation can be found here: https://docs.run.ai/Administrator/Cluster-Setup/cluster-install/.

Let’s install JupyterHub on our Kubernetes cluster

1. First, we need to create a namespace for the JupyterHub installation:

kubectl create namespace jhub

2. Then, we need to provide access roles:

kubectl apply -f

3. Now we need to create storage. JupyterHub requires storage in the form of a PersistentVolume; here we use an example local PV.

Then run:

kubectl apply -f pv-example.yaml

The JupyterHub installation will create a PersistentVolumeClaim named hub-db-dir, which any PV you create should reference.
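The example PV file itself is not reproduced in this post. As a rough sketch, and not the exact file from the Run:AI docs, a pv-example.yaml along the following lines could be written before running the apply command above. It uses a hostPath volume as a simple form of node-local storage. The PV name hub-db-pv, the directory /srv/jupyterhub, and the 10Gi size are assumptions for illustration only; the claimRef section is what points the volume at the hub-db-dir claim in the jhub namespace.

```shell
# Write a hypothetical pv-example.yaml. Name, path, and size are
# illustrative assumptions, not values taken from the Run:AI docs.
cat > pv-example.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hub-db-pv            # assumed name
spec:
  capacity:
    storage: 10Gi            # assumed size; must cover the claim's request
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  claimRef:                  # reserve this PV for the JupyterHub claim
    namespace: jhub
    name: hub-db-dir
  hostPath:
    path: /srv/jupyterhub    # assumed directory on the node
EOF
```

If the claim stays in Pending after installation, it is worth checking that the storage class of the PV and of the claim agree, since the two have to match for the claim to bind.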

4. Next, we need to create a configuration file for JupyterHub.

An example configuration file for Run:AI can be found in https://raw.githubusercontent.com/run-ai/docs/master/install/jupyterhub/config.yaml.

It contains 3 sample Run:AI profiles.
Each profile can point to a specific Docker image and tag, and can specify the number of GPUs that will be allocated to the Jupyter notebook.

As we have Run:AI installed on this cluster, we can also specify fractions of a GPU by adding the gpu-fraction option under extra_annotations, as in the first profile below:

- slug: "RA 0.5"
  description: "Run:AI Profile"
  display_name: "Run:AI with 0.5 GPU"
  kubespawner_override:
    image: jupyter/tensorflow-notebook
    tag: 3395de4db93a
    extra_annotations:
      gpu-fraction: "0.5"
- default: true
  slug: "RA 1"
  description: "Run:AI Profile"
  display_name: "Run:AI with 1 GPU"
  kubespawner_override:
    image: jupyter/datascience-notebook
    tag: 177037d09156
    extra_resource_limits:
      nvidia.com/gpu: "1"
- slug: "RA 2"
  description: "Run:AI Profile"
  display_name: "Run:AI with 2 GPU"
  kubespawner_override:
    image: jupyter/tensorflow-notebook
    tag: 3395de4db93a
    extra_resource_limits:
      nvidia.com/gpu: "2"

5. Download the config file from step 4 and replace <SECRET-TOKEN> with a random hex string generated by running openssl rand -hex 32.
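The substitution in step 5 can be scripted. The following is a minimal sketch, assuming the downloaded file is saved as config.yaml and contains the literal placeholder <SECRET-TOKEN>; the helper name fill_secret_token is hypothetical and not part of any Run:AI or JupyterHub tooling.

```shell
# fill_secret_token FILE: replace the <SECRET-TOKEN> placeholder in FILE
# with 32 random bytes encoded as 64 hex characters.
# (Hypothetical helper, written for this walkthrough.)
fill_secret_token() {
  config_file="$1"
  if command -v openssl >/dev/null 2>&1; then
    token="$(openssl rand -hex 32)"
  else
    # Fallback when openssl is missing: read 32 bytes from /dev/urandom.
    token="$(od -An -tx1 -N32 /dev/urandom | tr -d ' \n')"
  fi
  # The token contains only [0-9a-f], so it is safe inside the sed replacement.
  # -i.bak edits in place and keeps the original as FILE.bak.
  sed -i.bak "s/<SECRET-TOKEN>/${token}/" "$config_file"
}
```

Calling fill_secret_token config.yaml then rewrites the file in place and leaves the untouched original next to it as config.yaml.bak.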

6. We can now install JupyterHub on our Kubernetes cluster by running the following commands:

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
helm install jhub jupyterhub/jupyterhub -n jhub --values config.yaml

7. Let’s now verify that all pods are running:

kubectl get pods -n jhub
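With several pods in the listing, it can help to filter for the ones that are not up yet. The sketch below assumes the default column layout of kubectl get pods (NAME, READY, STATUS, RESTARTS, AGE); the function name pods_not_running is hypothetical.

```shell
# pods_not_running: read `kubectl get pods` output on stdin and print the
# name and status of every pod whose STATUS column is not "Running".
# The first line is skipped because it is the column header.
pods_not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}
```

It would be used as kubectl get pods -n jhub | pods_not_running, where empty output means every listed pod reports Running.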

8. We should now be able to access the JupyterHub UI.
To that end, run:

kubectl get service -n jhub proxy-public

Use the external IP of the service to open the JupyterHub UI, and log in with your Run:AI Project name as the user name. (Run:AI projects are described in the Run:AI documentation.)
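To pull just the address out of the service description, a jsonpath query can be used. This is a sketch that assumes proxy-public is a LoadBalancer service that has been given an IP address, as on the GCP cluster used here; on providers that hand out a hostname instead, the field would be hostname rather than ip. The function name jhub_external_ip is hypothetical.

```shell
# jhub_external_ip: print the external IP of the proxy-public service.
# Prints nothing while the load balancer is still being provisioned.
jhub_external_ip() {
  kubectl get service proxy-public -n jhub \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

For example, echo "http://$(jhub_external_ip)" prints an address that can be pasted into the browser.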

9. You should now see the 3 profiles. Choose “Run:AI with 0.5 GPU”.

A new Jupyter notebook server will now start. This process starts a new container on your Kubernetes cluster and allocates 0.5 GPU to it, using Run:AI Fractional GPU technology.

10. That’s it! You should now see your new Jupyter notebook and can start working on your model.

11. To confirm that only 0.5 GPU is allocated to our Jupyter notebook, we can start a new terminal window in Jupyter and run the nvidia-smi command.

You will now see that the container has been allocated a single GPU with only 0.5 of the GPU memory.
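The same check can be expressed as a number. The sketch below, meant to be run from that terminal, uses the nvidia-smi query flags to read the total memory visible inside the container and divides it by the memory of the full card. The full-card size is passed in by hand and is an assumption you would take from nvidia-smi on the host; the function name visible_gpu_fraction is hypothetical.

```shell
# visible_gpu_fraction FULL_MIB: compare the memory that nvidia-smi reports
# for the first visible GPU against FULL_MIB, the full card's memory in MiB.
# Prints the ratio rounded to two decimals.
visible_gpu_fraction() {
  full_mib="$1"
  nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits |
    head -n 1 |
    awk -v full="$full_mib" '{ printf "%.2f\n", $1 / full }'
}
```

With the 0.5 GPU profile, the printed ratio would be expected to sit close to 0.50.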

You can change your config file and add profiles for different GPU fractions (e.g., 0.1, 0.2, 0.3).
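Writing those extra profiles by hand gets repetitive, so here is a small sketch that prints one profile entry per fraction. It assumes the profile layout of the sample config, with the image settings nested under kubespawner_override and the gpu-fraction annotation under extra_annotations; the image and tag are copied from the 0.5 GPU sample profile, and the function name make_profile is hypothetical.

```shell
# make_profile FRACTION: print one profile list entry for the given
# GPU fraction, modeled on the "RA 0.5" sample profile.
make_profile() {
  fraction="$1"
  cat <<EOF
- slug: "RA ${fraction}"
  description: "Run:AI Profile"
  display_name: "Run:AI with ${fraction} GPU"
  kubespawner_override:
    image: jupyter/tensorflow-notebook
    tag: 3395de4db93a
    extra_annotations:
      gpu-fraction: "${fraction}"
EOF
}

# Print entries for three example fractions.
for f in 0.1 0.2 0.3; do
  make_profile "$f"
done
```

The printed entries would then be pasted into the profile list in config.yaml, indented to match the existing entries, and rolled out with helm upgrade jhub jupyterhub/jupyterhub -n jhub --values config.yaml.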

With this capability, you can have multiple researchers launching their Jupyter notebooks on the same GPU, freeing up other GPUs for other workloads!

Learn more about fractional GPUs here.

Thanks for reading!

~Guy Salton, Solutions Engineering Lead at Run:AI
