Prospective students

If you're interested in studying this course, you can find more information on our prospectus:

Student handbook

Course overview

The MRes Machine Learning and Big Data in the Physical Sciences covers the methodologies and specific toolkits used in research involving large datasets. The course focuses on the use of machine learning and data science techniques in the acquisition, curation and analysis of the extremely large datasets that are commonplace in modern Physics research.

The particular challenges faced in Physics, combined with the very large datasets and data rates generated, continue to make the field a unique development ground for machine learning and, more generally, artificial intelligence.

The main component of this MRes is an extended (9-month) research project, starting in Term 2, in which you will carry out original research embedded in a research group. You will have the opportunity to work on cutting-edge research topics, using machine learning and data science technologies to enhance that research. The project forms two-thirds of the course, allowing you to engage fully with a research group within the Physics Department. You will be able to choose from a wide range of projects before one is allocated to you during Term 1.

In Term 1, you will take two core courses (see Module Specifications below): one on the theoretical aspects of data analysis, statistics and machine learning, and the other on the practical aspects of carrying out data analysis using commonly used packages. You will be provided with a personal laptop with the software needed for these taught modules already installed.

You will be assessed on various aspects of the course throughout the year. An indication of the timeline for these assessments can be found in this assessment timetable document.

Alongside these core aspects, you will choose two FHEQ Level 6 or 7 elective modules from the Department of Physics courses list. It may also be possible to choose some electives from other departments. One elective, 'Accelerated processing for Big Data analysis', has been designed specifically for students on this MRes; you can find it in the Module Specifications section below.

Course Directors

  • David Colling, Programme Co-Director

    +44 (0)20 7594 7816

    Location: Blackett 505

  • Nicholas Wardle, Programme Co-Director

    +44 (0)20 7594 3419

    Location: Blackett 531

Course structure

The table below gives a summary overview of the programme structure:

Module    Type    Term    ECTS
PHYS70021 Statistical methods for experimental physics Compulsory Autumn 7.5
 

This core module provides a foundational understanding of the statistics behind large-scale data analysis in the physical sciences. It covers key concepts in statistical modelling and statistical inference that are essential for understanding the methodologies behind machine learning applications in the experimental sciences. The module is intended to provide the statistical background for the applied core module taught in the same term.

Module spec can be found here.  

For further information see this page.

     
PHYS70022 Applied Machine Learning Compulsory Autumn 10
 

This core module provides hands-on experience of the techniques required to analyse large datasets. The course is taught in the Python programming language and uses standard packages such as NumPy, SciPy, Matplotlib, pandas, scikit-learn, Keras and TensorFlow. The course assumes no prior knowledge of Python or any of these packages. You will learn how to implement the statistical and machine learning techniques required to analyse data by working through examples and then analysing different datasets.

Module spec can be found here.

For further information see this page.

     
PHYS70071 Accelerated processing for Big Data analysis Elective Spring 5
 

This module will cover:
 - Introduction and overview of accelerators in computing (typical architectures such as GPUs and FPGAs, their features and differences)
 - Mapping applications to architecture (which architecture best suits the problem/algorithm)
 - Performance modelling (building a C++/Excel-based model of the algorithm to understand it in detail and to model its implementation on an accelerator)
 - Implementation (examples of implementing algorithms, including typical AI cases)

Module spec can be found here.

For further information see this page.

     
 

Elective modules Elective Autumn-Spring 12.5 or 15

Choose one of the following combinations:
 - Accelerated processing for Big Data analysis plus one 7.5 ECTS Level 6 or Level 7 module offered within Physics (or Maths/Engineering if appropriate)
 - one 7.5 ECTS and one 5 ECTS Level 6 or Level 7 module offered within Physics (or Maths/Engineering if appropriate)
 - two 7.5 ECTS Level 6 or Level 7 modules offered within Physics (or Maths/Engineering if appropriate)

Undergraduate Level 6 and 7 courses are listed here.

 

Projects

The project module specification is PHYS70023: Research Project.

The list of available projects won't be published until the course starts, but you can view previous years' projects for reference.

Electives from outside the Department of Physics 

The following modules hosted by the Department of Mathematics are available as elective modules for MLBD MRes students:

Module    Type    Term    ECTS
MATH70013 Advanced Simulation Methods Elective Spring 5
 

Modern problems in statistics require sampling from complicated probability distributions defined on a variety of spaces and setups. In this module we will cover popular advanced sampling techniques, such as Importance Sampling, Markov Chain Monte Carlo and Sequential Monte Carlo. We will consider the underlying principles of each method as well as practical aspects related to implementation, computational cost and efficiency. By the end of the module, students will be familiar with these sampling methods and will have applied them to popular models, such as Hidden Markov Models, which are ubiquitous across scientific disciplines.

     
MATH70079 Introduction to Statistical Finance Elective Spring 5
 

Introduction to Statistical Finance presents the fundamental concepts of quantitative finance and statistical methods that are widely used to analyse financial data. It starts off with a concise introduction to financial markets, proceeding then to the basic principles of derivatives pricing and risk measurement. Subsequently, it develops statistical models for financial time series. Finally, it explains how such models are estimated, how their goodness of fit can be assessed and how they can be used in forecasting.

     
MATH70081 Nonparametric Statistics Elective Spring 5
 

Nonparametric methods aim to provide inference under weaker assumptions than conventional parametric methods. In this module, students will apply modern techniques to a variety of problems, such as estimating distribution functions, density estimation and nonparametric regression.

     
MATH70083 Statistical Learning for high-dimensional data Elective Spring 5
 

In this module we will develop models and tools to analyse complex, high-dimensional datasets such as those arising in application fields like genetics. This will include statistical and machine learning techniques for multiple testing, penalised regression, clustering, dimensionality reduction and visualisation. The module will cover both Frequentist and Bayesian statistical approaches, as well as some modern unsupervised and supervised machine learning approaches.

     

 

There are limited places available on these modules, and any module taken from outside the Department of Physics must first be agreed by the relevant course directors.

Pre-course material

Take a look at our pre-course material, designed to help you prepare for starting the course.

Access the pre-course material.