Submitting Jobs#

The easiest way to submit a job to the cluster is with the sbatch command.

The sbatch command submits a batch script to Slurm.

Follow the sbatch command with the path to your script file.
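For example, if your batch script were saved as my_job.sh (a hypothetical file name), you would submit it with:

sbatch my_job.sh

Slurm then prints a confirmation line of the form "Submitted batch job <jobid>".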

The batch script may contain options preceded with #SBATCH before any of your commands.

sbatch will stop processing further #SBATCH directives once the first non-comment, non-whitespace line in the script has been reached.

sbatch exits as soon as the script has been successfully transferred to the Slurm controller and assigned a job ID.

You will not necessarily be granted resources immediately; the job may sit in a queue of pending jobs until its required resources become available.

We’ll cover the other methods at the end of this page.

Single-Threaded / Serial Job#

The simplest way of running a job via Slurm is to use a batch job.

Here is an example of a simple batch script that requests 1 core and runs a Python script:

#!/bin/bash
#SBATCH --job-name=serial_test        # Job name
#SBATCH --mail-type=END,FAIL          # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@lshtm.ac.uk # Where to send mail
#SBATCH --ntasks=1                    # Run on a single core
#SBATCH --mem=1gb                     # Job memory request
#SBATCH --time=00:05:00               # Time limit hrs:min:sec
#SBATCH --output=serial_test_%j.log   # Standard output and error log
pwd; hostname; date

echo "Running plot script on a single CPU core"

python3 ~/SLURM/plot.py

date

The %j in the --output line tells Slurm to substitute the job ID into the name of the output file.

You can also add a -e or --error directive with an error file name to separate the output and error logs.

By default, Slurm merges the job's error and output streams and saves them to an output file whose name includes the job ID (slurm-<jobid>.out).
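For example, a pair of directives like the following (the file names here are just illustrative) keeps the two streams in separate files:

#SBATCH --output=serial_test_%j.out   # Standard output log
#SBATCH --error=serial_test_%j.err    # Error log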

As noted above, sbatch submits the batch script to Slurm; it will reject the job at submission time if there are requests or constraints that Slurm cannot fulfil.

Parallel Job#

#!/bin/bash
#SBATCH --job-name=parallel_test      # Job name
#SBATCH --mail-type=END,FAIL          # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@lshtm.ac.uk # Where to send mail
#SBATCH --nodes=1                     # Run all processes on a single node
#SBATCH --ntasks=16                   # Run 16 tasks (one core each)
#SBATCH --mem=1gb                     # Total memory limit
#SBATCH --time=00:05:00               # Time limit hrs:min:sec
#SBATCH --output=parallel_%j.log      # Standard output and error log
date;hostname;pwd

module load R

echo "Running R in parallel on 16 cores"

Rscript Rcode.R

date

If you have a list of independent tasks, it can be inefficient to run them serially, one by one.

Running them at the same time, in parallel, can significantly reduce the amount of time your job takes.

For R, you can use the parallel library to split your jobs and send them to multiple cores at the same time.

You can then use the --ntasks directive to request the extra cores.
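As a minimal sketch (assuming your R script reads the number of workers from its first command-line argument, which is not shown in the example above), you can pass the allocation size through from Slurm's environment rather than hard-coding it:

# Pass the number of allocated tasks to the R script (hypothetical argument handling)
Rscript Rcode.R "$SLURM_NTASKS"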

Multi-Threaded SMP Job#

#!/bin/bash
#SBATCH --job-name=smp_test          # Job name
#SBATCH --mail-type=END,FAIL         # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@lshtm.ac.uk # Where to send mail
#SBATCH --nodes=1                    # Run all processes on a single node
#SBATCH --ntasks=1                   # Run a single task
#SBATCH --cpus-per-task=4            # Number of CPU cores per task
#SBATCH --mem=600mb                  # Total memory limit
#SBATCH --time=00:05:00              # Time limit hrs:min:sec
#SBATCH --output=smp_test_%j.log     # Standard output and error log
date;hostname;pwd

echo "Running prime number generator program on $SLURM_CPUS_ON_NODE CPU cores"

/data/training/SLURM/prime/prime

date

By default, each task is allocated one core, so setting the --ntasks option requests that many cores for the allocation. For a multi-threaded (SMP) program like the one above, keep --ntasks=1 and use --cpus-per-task to request the cores that the threads will share.
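If your program is multi-threaded with OpenMP (an assumption; the prime example above may use a different threading mechanism), a common pattern is to set the thread count from the $SLURM_CPUS_PER_TASK environment variable inside the script:

# Match the OpenMP thread count to the per-task CPU allocation (assumes an OpenMP program)
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
/data/training/SLURM/prime/prime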

Queue Defaults#

The maximum time for a job to run is 168 hours (7 days). Your job will be terminated once it hits that limit.

You are limited to running 40 simultaneous jobs on the cluster. Any additional jobs will be set to pending until you are below that limit.
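You can check which of your jobs are running or pending with the squeue command, for example:

squeue -u $USER      # List your own jobs and their current state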

If you don't specify the #SBATCH options in your submission script, Slurm will use its default settings.

The time limit will be set to 1 hour, and you will be allocated 1GB per core requested.