Question 1

What Is Job Scheduling and How Important Is It for HPC?

Accepted Answer

Job scheduling is the process of determining what tasks your system is running and on which resources. In an HPC system, thousands of jobs and nodes may be operating at a single time. Without job scheduling, the tasks that users are trying to perform cannot be properly matched to the available resources. HPC systems use job schedulers to manage job operations. Schedulers are programs that accept, schedule, and monitor jobs.These utilities enable operations teams to initiate and manage jobs manually or automatically through job control language statements. Manual management and monitoring are performed through a graphical user interface (GUI) or a command line interface (CLI). The purpose of job schedulers is to: Minimize the length of time jobs wait in a queue Maximize job throughput to ensure as many jobs as possible are running simultaneously Optimize resource utilization to maximize ROI When schedulers function effectively, these tools help ensure that the technical debt created by HPC systems is steadily decreased. This is essential considering the cost required to implement large scale HPC systems. Schedulers also help ensure that workloads are completed as quickly as possible, reducing bottlenecks for operations pending job results. In the case of machine learning, this means faster model training and faster time to market. What is Slurm and is it Still Relevant for Modern Workloads?

Question 2

What Is Slurm?

Accepted Answer

Slurm is an open source job scheduling tool that you can use with Linux-based clusters. It is designed to be highly-scalable, fault-tolerant, and self-contained. Slurm does not require any kernel modifications for use. When implemented, Slurm performs the following tasks: Assigns users to compute nodes. This access can be non-exclusive, with shared resources, or exclusive, with resources limited to a single user. Provides a framework for initiating, performing, and monitoring work on the assigned nodes. Work is typically managed as parallel jobs run on multiple nodes. Manages the queue of pending work and determines which job will be assigned to the node next. Slurm also includes the option for extension through plugins. You can build custom plugins through the API or use a variety that are already prepared, including for: Authentication Job completion logging Multi-category security Power management Topology based scheduling Architecture The Slurm scheduler architecture is based on a modular approach that enables you to customize your deployment to suit your infrastructure. The main component is a centralized manager (slurmctld), which monitors work and resources. This manager is backed up by a failover copy to ensure continued operations. On each compute node in your system, there is a daemon (slurmd) that is controlled by the manager. This daemon functions like a remote shell and provides hierarchical, fault-tolerant communications to other nodes and the manager. If you are using a database, there is a daemon (slurmdbd) that is used to record information across your clusters. There is also a daemon (slurmrestd) that enables you to interact with the Slurm REST API. Within Slurm, there are various user commands that you can use to manage your various components. These include: scontrol—enables you to monitor and modify cluster state and configuration information. sinfo—provides a report of system status. squeue—provides a report of job status. sacct—provides a report of running and completed jobs and job steps. srun—enables you to start jobs. sview—provides a graphic report of job status, system status, and network topology. sacctmgr—enables you to manage your database, including validating users and accounts, and identifying clusters. scancel—enables you to stop running or queued jobs. Below is a diagram of how these various components and tools interact in a Slurm deployment. Slurm v LSF v Kubernetes For more information, check out the Slurm project page: https://slurm.schedmd.com/documentation.html. Learn more about Slurm and Deep Learning.

Question 3

What Is LSF Session Scheduler?

Accepted Answer

IBM Platform Load Sharing Facility (LSF) is a platform designed for workload management in distributed, high performance computing (HPC) deployments. The LSF Session Scheduler is a scheduler for this platform that enables you to run batches of jobs on one set of resources. It offers low-latency execution of jobs based on a hierarchical scheduling model. Session Scheduler is designed specifically for managing short-duration jobs, such as job arrays with parametric execution or list processes. For mixed length or longer jobs, you are better using traditional job creation, scheduling, and initiation methods, like job chunking. The benefit of Session Scheduler is that it enables you to submit multiple tasks as a single LSF job. This ability improves the performance and throughput of the standard scheduler by reducing the number of job scheduling decisions that need to be made. Additionally, implementing Session Scheduler grants you the following benefits: Minimizes latency for short jobs Optimizes system performance and cluster utilization Assigns resources based on established LSF policies Maintains existing job starters, resource limits, and pre and post-execution programs Capable of managing more than 50k jobs per user and thousands of users When you use Session Scheduler, the process (ssched) is run similar to a parallel job and is dynamically scheduled. Each process that is created is responsible for one workload and is limited to the assigned resources. During operation, the scheduler dispatches jobs as task arrays or task definition files to the assigned execution agents until the batch is complete. Below is a diagram of how the Session Scheduler accepts jobs from the master host and dispatches tasks. IBM LSF v Kubernetes For more information, check out the documentation: https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_welcome/lsf_kc_ss.html. Slurm and Deep Learning

Question 4

What Is the Kubernetes Scheduler?

Accepted Answer

Kubernetes is a popular open source orchestration solution for container-based workloads. With Kubernetes, these workloads can be effectively managed in ways similar to traditional HPC clustering methods, though Kuberenetes alone does not offer all of the scheduling capabilities of Slurm - such as batch scheduling and gang scheduling - which we will discuss later in this post. Kubernetes is based on clusters of nodes (either physical or virtual machines) that are controlled by a master. Each node hosts a group of pods (containers). These pods share resources in the node and exist in a local network. This network enables pods to communicate with each other while still containing isolated workloads or applications. Below is a diagram of how Kubernetes Master relates to nodes and some of the components contained in each. kube-scheduler The default scheduler for Kubernetes deployments is kube-scheduler. This scheduler is run as part of the control plane. When you use Kubernetes, pods are frequently created and destroyed. When you create a new, the scheduler is needed to assign that pod to a node. Based on specified resource requirements and available resources, kube-scheduler locates a suitable node. Locating a suitable node requires filtering nodes according to scheduling requirements. Any nodes that meet the requirements (called feasible nodes) are surfaced and scored to determine the best match. Scoring is based on multiple factors, including hardware, software, and policy restraints, data locality, affinity and anti-affinity definitions, resource requirements, and inter-workload interference. Once a match is found, the pod is scheduled and a notification is sent to the API server. This notification makes the node accessible. If no nodes are available, the pod remains queued with the scheduler until a node is available. For more information, check out the documentation: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-scheduler/. Understanding Slurm GPU Management

Slurm vs LSF vs Kubernetes Scheduler

Which is Right for You?

What Is Job Scheduling and How Important Is It for HPC?

What Is Slurm?

What Is LSF Session Scheduler?

What Is the Kubernetes Scheduler?

Choosing the Right Scheduler for HPC and AI Workloads

HPC Schedulers Compared: Slurm vs LSF vs Kubernetes Scheduler

kube-scheduler vs Slurm

LSF vs Kubernetes

Specific Considerations for AI Workloads

Comparison - The Bottom Line: Three Solutions Head to Head

Slurm

LSF

Kubernetes (kube-scheduler)

Automate Job Scheduling with Run:ai