Naga101: A Guide to Getting Started with (OPIG) Slurm Servers

Over the past months, I’ve been working with a few new members of OPIG, which left me answering (and asking) lots of questions about working with Slurm. In this blog post, I will try to cover key, practical basics to interacting with servers that are set up on Slurm.

Slurm is a workload manager or job scheduler for Linux, meaning that it helps with allocating resources (eg CPUs and GPUs) on a server to users’ jobs.

To note, all of the commands and files shown here are run from a so-called ‘head’ node, from which you access Slurm servers.

1. Entering an interactive session

Unlike many other servers, you cannot access a Slurm server via ‘ssh’. Instead, you can enter an interactive (or ‘debug’) session – which, in OPIG, is limited to 30 minutes – via the srun command. This is incredibly useful for copying files, setting up environments and checking that your code runs.

srun -p servername-debug --pty --nodes=1 --ntasks-per-node=1 -t 00:30:00 --wait=0 /bin/bash
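
If you need a GPU to test code interactively, you can request one in the same way. A minimal sketch, assuming the debug partition has GPUs available:

srun -p servername-debug --gres=gpu:1 --pty --nodes=1 --ntasks-per-node=1 -t 00:30:00 --wait=0 /bin/bash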

2. Submitting jobs

While the srun command is easy and helpful, many of the jobs we want to run on a server will take longer than the debug queue time limit. You can submit a job, which can then run for a longer (although typically still capped) time but is not interactive, via sbatch.

In addition to specifying which cluster (--clusters) and nodelist (-w) you would like to submit your job to, an sbatch script allows you to set a number of parameters, such as the job name (-J), the number of CPUs/GPUs (--cpus/gpus-per-task) and the memory per CPU/GPU (--mem-per-cpu/gpu). You can also set up your sbatch script so that you are emailed at the start and end of a job.

To run your desired job from within the sbatch script, include the command you would enter on the command line in an interactive session, eg python /data/localhost/username/train_model.py. To note, it is important to specify absolute paths to your script and within scripts called from sbatch!

An example sbatch script is shown below. To note, lines beginning with “#SBATCH” are read by Slurm (the ‘#’ does not act as a comment identifier in this case), whereas adding a space, as in “#S BATCH”, deactivates the line so that it is ignored.

Submit your job to the Slurm queue with: sbatch name_of_sbatch_script.sh

#!/bin/bash   
#SBATCH -J jobname # Job name
#S BATCH --time=24:00:00                # Walltime (line disabled by the space in "#S BATCH")
#SBATCH --nodes=1                       # number of nodes                                       
#SBATCH --ntasks=1                      # 1 tasks                                               
#SBATCH --gres=gpu:1                    # number of gpus
#SBATCH --gpus-per-task=1               # number of gpus per task
#SBATCH --cpus-per-gpu=4                # number of cores per gpu
#SBATCH --mem-per-cpu=10000             # memory/cpu (in MB)
#S BATCH --mem-per-gpu=24000             # memory/gpu (in MB)
#SBATCH --mail-user=email@ox.ac.uk  # set email address                           
#S BATCH --mail-type=ALL                 # Spam us with everything, caution
#SBATCH --mail-type=begin               # Instead only email when job begins...
#SBATCH --mail-type=end                 # ... and ends
#SBATCH --partition=standard-priority        # Select a specific partition rather than default
#SBATCH --clusters cluster -w node.cpu.ox.ac.uk  # Provide a specific node/nodelist rather than the standard nodelist associated with the partition (useful if you have a data setup on one specific node)
#SBATCH --output=/data/localhost/user/slurm_%j.out  # Writes standard output to this file. %j is jobnumber                             
#SBATCH --error=/data/localhost/user/slurm_%j.err   # Writes error messages to this file. %j is jobnumber

source ~/.bashrc
conda activate model_env

python3 /data/localhost/user/train_model.py

If you would like to run multiple jobs without having to submit a separate sbatch script for each one, you can set multiple tasks or arrays.

Multiple tasks allow you to run multiple commands simultaneously, split across CPU/GPU resources, from a single sbatch script. To use this, set the --ntasks parameter to the number of tasks you would like to run and list the tasks at the bottom of the file. It is important to separate each task with the “&” character (allowing tasks to run in parallel) and to add “wait” on the last line, to ensure that the sbatch script waits until all tasks are completed before terminating.

This will show up in the Slurm queue as a single job using ntasks * cpu/gpu-per-task resources (in the example shown below, 4 tasks * 1 cpu-per-task, ie 4 CPUs).

…
#SBATCH --ntasks=4                      # 4 tasks                                               
#SBATCH --cpus-per-task=1               # number of cores per task         
…
python task1.py &
python task2.py &
python task3.py &
python task4.py &
wait

You can implement arrays by calling from a separate file which contains all of the files you would like to run (to note, this file is hosted on the Slurm server, not the head node). The sed "${SLURM_ARRAY_TASK_ID}q;d" command extracts the line of this file corresponding to the array task ID, allowing its contents to be called.

Submitting an array sbatch script will result in the number of jobs specified with --array, each with the resources set by the other lines in the sbatch script. For example, in the script shown below, 10 Slurm jobs would be created, each using 1 CPU.

With --array, you can specify a range of lines within the given file (this does not always need to start from 1). Additionally, if you would like to be kind to your coworkers, you can limit the number of jobs run at any one time using %. For example, --array=1-10%2 will run at most 2 jobs at once: as each job completes, the next in the range starts, until all 10 have run.

…
#SBATCH --ntasks=1                      # 1 task                                               
#SBATCH --cpus-per-task=1               # number of cores per task
#SBATCH --array=1-10         
…
job_file=`sed "${SLURM_ARRAY_TASK_ID}q;d" /data/localhost/user/list_of_job_files.txt`
python ${job_file}
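
To limit how many of these array jobs run at once, as described above, the --array line in this script could instead read, for example:

#SBATCH --array=1-10%2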

3. Checking the status of your jobs

You can check the status of your jobs using squeue. This will show whether jobs are running or still queued, how long they have been running for, which node and partition they are running on, as well as the job ID.
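
For example, to list only your own jobs across all clusters (assuming your Slurm username matches $USER on the head node):

squeue -u $USER -M all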

A fancy squeue command useful for OPIG servers is:

squeue -o"%.18i %.9P %.8j %.8u %.2t %.10M %.6D %.4C %.10m %.6b %R" --sort N -M all

4. Cancelling jobs

Jobs can be cancelled simply via

scancel job_id

where the job_id is a number (eg 12345) that can be found using squeue.

If you have multiple clusters, you can specify scancel -M all job_id.
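
scancel also accepts a username, which cancels all of your jobs at once (username here is a placeholder):

scancel -u username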

5. Finding more information on nodes

You can find out information about nodes via two key commands.

sinfo will give you a list of all the available nodes and their status (idle, mixed, allocated), indicating whether the resources are available for use.
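
For example, to list the nodes across all clusters visible from the head node:

sinfo -M all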

scontrol will show further detail about a specific node, including the number of CPUs and GPUs, and their associated memory, as well as whether these are allocated (ie in use and not available).

scontrol --clusters cluster_name show node=node.cpu.ox.ac.uk

6. Copying files

To copy files between the Slurm server and head node, you can use scp or rsync. Rsync is recommended as, if interrupted, it can resume copying files where it left off. These commands must be called from the Slurm server.

Copy files from Slurm server:

rsync /path/to/files/to/copy/on/slurm/server user@headnode:/path/to/destination/on/head/node

Copy files to Slurm server:

rsync user@headnode:/path/to/files/on/head/node/ /path/to/destination/on/slurm/server

This command can be modified using flags, for example -r (recursive, to copy a directory’s contents) and -t (which preserves the files’ original modification times, rather than resetting them to the time at which they were copied).
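
For example, to copy a whole directory from the head node while preserving modification times (the paths are placeholders):

rsync -rt user@headnode:/path/to/directory/on/head/node/ /path/to/destination/on/slurm/server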

7. Generating SSH keys

Generating SSH keys is very useful to avoid needing to enter your password each time you copy files between the server and the head node. It is particularly important when copying files via sbatch (which is useful when copying takes longer than the debug queue allows); a sketch of such a script is shown after the list below.

To generate SSH keys:

• Enter an interactive session via srun
– ssh-keygen -t rsa
– ssh-copy-id user@headnode
– When prompted, enter your password
• Check it worked
– ssh 'user@headnode'
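
Once keys are set up, a longer copy can also be submitted as a Slurm job, as mentioned above. A minimal sketch of such an sbatch script, with placeholder paths and names:

#!/bin/bash
#SBATCH -J copy_results                 # Job name
#SBATCH --ntasks=1                      # 1 task
#SBATCH --cpus-per-task=1               # number of cores
#SBATCH --output=/data/localhost/user/copy_%j.out   # Writes standard output to this file. %j is jobnumber

rsync -rt /data/localhost/user/results/ user@headnode:/path/to/destination/on/head/node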

————————————————————————————-

A few generally useful things, not specific to Slurm servers:

8. Storage space

Storage space is a hot commodity on servers! “df -h” shows how much space is used/available in different partitions, while “du -sh” (followed by a directory or file path) shows how much space is being used in a particular directory/file.
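
For example (the directory path is just an illustration):

df -h
du -sh /data/localhost/user/results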

9. Setting up Miniconda

Download Miniconda from https://docs.conda.io/en/latest/miniconda.html and then copy it to your desired location (see point 6). Install it via “bash Miniconda3-latest-Linux-x86_64.sh”.
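
A minimal sketch of these steps, assuming the standard Linux x86_64 installer (check the Miniconda page above for the current installer URL) and placeholder paths:

# on the head node (or another machine with internet access)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# from an interactive session on the Slurm server, copy the installer over and run it
rsync user@headnode:/path/to/Miniconda3-latest-Linux-x86_64.sh /data/localhost/user/
bash /data/localhost/user/Miniconda3-latest-Linux-x86_64.sh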

10. Setting up GitHub

To link your GitHub account to a Slurm server, you can:

• Generate an SSH key (instructions from https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent)
– ssh-keygen -t ed25519 -C "github_account_email@email.com"
– eval "$(ssh-agent -s)"
– ssh-add ~/.ssh/id_ed25519
• Add the SSH key to your GitHub account
– Log into GitHub in the browser
– Go to Settings > SSH and GPG keys
– Add a “New SSH key” and paste in the full contents of ~/.ssh/id_ed25519.pub
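
You can then check that the key works by testing the connection to GitHub (this is GitHub’s standard test command):

ssh -T git@github.com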
