The three multi-user Linux systems -- spitfire, hurricane and typhoon -- should not be used for compute-intensive jobs.

To support such jobs, a small Linux cluster is available to all members of the Department of Aeronautics for running small parallel computational jobs and long-running serial jobs. It may be used for undergraduate and postgraduate individual projects, as well as for group design projects. These systems run the Aeronautics Linux environment. There are currently five dedicated nodes in the cluster:

  • linux[1-4]: 4 x Dual-socket Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz, 16 h/w cores, 64 GB RAM.
  • comet: 1 x Dual-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz, 40 h/w cores, 256 GB RAM.

The queuing system may also run some small jobs on the interactive nodes, which have the following specifications:

  • spitfire: Dual-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz, 40 h/w cores, 256 GB RAM.
  • hurricane: Dual-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz, 40 h/w cores, 256 GB RAM.
  • typhoon: Dual-socket Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz, 48 h/w cores, 384 GB RAM, 2 x NVIDIA A40 48GB GPUs.
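If you want to check the hardware SLURM reports for a particular node before deciding where to run, the standard sinfo and scontrol commands can be used from any of the Department's Linux machines (comet below is one of the node names listed above):

```shell
# Summarise all nodes: one line per node with CPU count, memory and state.
sinfo -N -l

# Show the full record (CPUs, RealMemory, Gres, State, ...) for one node.
scontrol show node comet
```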

SLURM queuing system

Jobs are submitted through a queuing system from any of the Department's Linux machines, via a job-submission script (for example, myjob.slr). A simple example to run a program in serial is given below:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=4000MB
#SBATCH --time=06:00:00
#SBATCH --job-name=myserialjob
#SBATCH --partition=medium
./mycode
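By default SLURM writes the job's standard output and error to a file named slurm-<jobid>.out in the directory the job was submitted from. If you prefer named files, the standard --output and --error options accept filename patterns; a variant of the script above (the %x and %j patterns expand to the job name and job ID):

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=4000MB
#SBATCH --time=06:00:00
#SBATCH --job-name=myserialjob
#SBATCH --partition=medium
# Write stdout and stderr to separate files named after the job,
# e.g. myserialjob-1234.out and myserialjob-1234.err.
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
./mycode
```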

For parallel programs which need to be run using MPI, you might use something like:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --mem=63000MB
#SBATCH --time=06:00:00
#SBATCH --job-name=exp2-run3
#SBATCH --partition=medium
mpiexec ./mycode
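Inside a running job, SLURM sets environment variables describing the allocation, which can be useful for logging which nodes a run used. A small sketch using standard SLURM variables (the resource requests are illustrative):

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --time=00:30:00
#SBATCH --partition=test
# Record where and how the job ran before launching the solver.
echo "Job ${SLURM_JOB_ID} (${SLURM_JOB_NAME}) on nodes: ${SLURM_JOB_NODELIST}"
echo "Running with ${SLURM_NTASKS} MPI tasks"
mpiexec ./mycode
```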

The lines which begin with #SBATCH describe the resources that your job requires: the number of nodes, the total number of processes to start across all nodes (ntasks), the amount of memory required (mem) and the maximum time the job will take to run. If you set the time limit too low, your job will be terminated when the limit is reached. You can give the job a meaningful job-name, which is useful for identifying it in the queue. The partition option selects which queue your job should be placed in. There are four queues (information is available using the sinfo command):

  Name     Max time   Nodes available                Comment
  ------   --------   ----------------------------   ------------------------------------------
  test     30 min     hurricane, spitfire, typhoon   Short test jobs, debugging
  short    6 hours    comet, linux[1-4]              Shorter 6-hour jobs
  medium   1 day      comet, linux[1-4]              Longer jobs
  long     16 days    comet, linux[1-2]              Use if necessary, but consider College HPC
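The queue limits above can be checked at any time with sinfo; for example:

```shell
# One-line summary per partition: availability, time limit and node list.
sinfo -s

# Restrict the listing to a single partition, e.g. medium.
sinfo -p medium
```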

For more demanding computations, please consider using the College HPC service. The Department has a dedicated queue on that service called pqaero.

To submit the job, run

sbatch myjob.slr
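sbatch prints the ID of the newly queued job ("Submitted batch job <number>"). If you need just the number, for instance in a wrapper script, the standard --parsable option prints it alone:

```shell
# Submit and keep the numeric job ID for later monitoring or cancellation.
jobid=$(sbatch --parsable myjob.slr)
echo "Submitted as job ${jobid}"
```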

You can monitor your jobs to see which are queued and which are running using

squeue
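By default squeue lists all users' jobs. A few common variations, all standard SLURM commands (the job ID 12345 below is only an example):

```shell
# Show only your own jobs.
squeue -u "$USER"

# Show the state of one specific job by its ID.
squeue -j 12345

# Cancel a job you no longer need.
scancel 12345
```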