The three multi-user Linux systems -- spitfire, hurricane and typhoon -- should not be used for compute-intensive jobs.

To support such jobs, a small Linux cluster is available to all members of the Department of Aeronautics for running small parallel computational jobs and long-running serial jobs. It may be used for undergraduate and postgraduate individual projects, as well as for group design projects. These systems run the Aeronautics Linux environment. There are currently five dedicated nodes in the cluster:

  • linux[1-4]: 4 x Dual-socket Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz, 16 h/w cores, 64 GB RAM.
  • comet: 1 x Dual-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz, 40 h/w cores, 256 GB RAM.

The queuing system may also run some small jobs on the interactive nodes, which have the following specifications:

  • spitfire: Dual-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz, 40 h/w cores, 256 GB RAM.
  • hurricane: Dual-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz, 40 h/w cores, 256 GB RAM.
  • typhoon: Dual-socket Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz, 48 h/w cores, 384 GB RAM, 2 x NVIDIA A40 48GB GPUs.
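If you want to check the hardware SLURM reports for a particular node before deciding where to run, the standard sinfo and scontrol commands can be used from any of the Department's Linux machines (comet below is one of the node names listed above):

```shell
# Summarise all nodes: one line per node with CPU count, memory and state.
sinfo -N -l

# Show the full record (CPUs, RealMemory, Gres, State, ...) for one node.
scontrol show node comet
```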

SLURM queuing system

Jobs are submitted through a queuing system from any of the Department's Linux machines, via a job-submission script (for example, myjob.slr). A simple example to run a program in serial is given below:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=4000MB
#SBATCH --time=06:00:00
#SBATCH --job-name=myserialjob
#SBATCH --partition=medium
./mycode
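By default SLURM writes the job's standard output and error to a file named slurm-<jobid>.out in the directory the job was submitted from. If you prefer named files, the standard --output and --error options accept filename patterns; a variant of the script above (the %x and %j patterns expand to the job name and job ID):

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=4000MB
#SBATCH --time=06:00:00
#SBATCH --job-name=myserialjob
#SBATCH --partition=medium
# Write stdout and stderr to separate files named after the job,
# e.g. myserialjob-1234.out and myserialjob-1234.err.
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
./mycode
```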

For parallel programs which need to be run using MPI, you might use something like:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --mem=63000MB
#SBATCH --time=06:00:00
#SBATCH --job-name=exp2-run3
#SBATCH --partition=medium
mpiexec ./mycode
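Inside a running job, SLURM sets environment variables describing the allocation, which can be useful for logging which nodes a run used. A small sketch using standard SLURM variables (the resource requests are illustrative):

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --time=00:30:00
#SBATCH --partition=test
# Record where and how the job ran before launching the solver.
echo "Job ${SLURM_JOB_ID} (${SLURM_JOB_NAME}) on nodes: ${SLURM_JOB_NODELIST}"
echo "Running with ${SLURM_NTASKS} MPI tasks"
mpiexec ./mycode
```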

The lines which begin with #SBATCH describe the resources that your job requires: the number of nodes, the total number of processes to start across all nodes (ntasks), the amount of memory required (mem) and the maximum time the job will take to run. If you set the time limit too low, your job will be terminated when the limit is reached. You can give the job a meaningful job-name, which is useful for identifying it in the queue. The partition option selects which queue your job should be placed in. There are four queues (information is available using the sinfo command):

  Name     Max time   Nodes available                Comment
  ------   --------   ----------------------------   ------------------------------------------
  test     30 min     hurricane, spitfire, typhoon   Short test jobs, debugging
  short    6 hours    comet, linux[1-4]              Shorter 6-hour jobs
  medium   1 day      comet, linux[1-4]              Longer jobs
  long     16 days    comet, linux[1-2]              Use if necessary, but consider College HPC
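The queue limits above can be checked at any time with sinfo; for example:

```shell
# One-line summary per partition: availability, time limit and node list.
sinfo -s

# Restrict the listing to a single partition, e.g. medium.
sinfo -p medium
```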

For more demanding computations, please consider using the College HPC service. The Department has a dedicated queue on that service called pqaero.

To submit the job, run

sbatch myjob.slr
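sbatch prints the ID of the newly queued job ("Submitted batch job <number>"). If you need just the number, for instance in a wrapper script, the standard --parsable option prints it alone:

```shell
# Submit and keep the numeric job ID for later monitoring or cancellation.
jobid=$(sbatch --parsable myjob.slr)
echo "Submitted as job ${jobid}"
```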

You can monitor your jobs to see which are queued and which are running using

squeue
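By default squeue lists all users' jobs. A few common variations, all standard SLURM commands (the job ID 12345 below is only an example):

```shell
# Show only your own jobs.
squeue -u "$USER"

# Show the state of one specific job by its ID.
squeue -j 12345

# Cancel a job you no longer need.
scancel 12345
```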