Imperial College London

Winners announced for the DSI Seed Fund on Machine Learning


Data Science Institute

The DSI is delighted to announce the recipients of the Seed Fund linked to the Machine Learning Lab in Probabilistic Modelling.

Four projects have been awarded initial funding for five months to foster basic research in Machine Learning.

The first project awarded, “A fully probabilistic approach to infer intra-urban air quality from limited monitoring stations using Bayesian nonparametrics”, is led by Dr Ke Han from the Department of Civil and Environmental Engineering, in collaboration with Dr Shahram Heydari and Dr Audrey de Nazelle, both from the Centre for Environmental Policy. The aim of this work is to develop a transferable methodology to infer air quality in areas not covered by existing monitoring stations, using a rich set of multi-source data. Dr Shahram Heydari will investigate applications and extensions of Bayesian nonparametric methods as a powerful, fully probabilistic approach capable of accommodating spatiotemporal dependencies. The method will be applied to a unique dataset collected from 35 monitoring stations in Beijing for the years 2014-2016. Dr Han says that “to our knowledge this will be the first instance of such a model in the area of air quality modelling and it will be essential to identify factors associated with pollutants concentration and their effects on human health”.

Dr Marc Deisenroth has been awarded funding for his project on “Distributional robust adversarial training for natural language processing”.

The development of more robust natural language processing (NLP) is key to improving the effectiveness of machine learning models. With this work Dr Deisenroth, Lecturer in Statistical Machine Learning at the Department of Computing, aims to improve the accuracy of NLP modelling.

To improve the efficacy of Design of Experiments, Dr Ruth Misener (Senior Lecturer in the Department of Computing) has received funding for her work on “Dynamic Design of Experiments for Model Discrimination”. How do scientists identify the best model among many rival ones, and how can they justify their choice to regulatory bodies? Dr Misener explains: “How can we design and control an online experiment to identify the best model in a noisy environment? We propose to develop new model discrimination methods using probabilistic modelling that incorporates measurement error and uncertain parameters”. This exploratory work will be developed by Simon Olofsson, Early Stage Researcher in the Marie Curie ITN ModLife network.

The fourth project to receive funding is led by Prof Simon Schultz, from the Department of Bioengineering, working jointly with Dr Seth Flaxman (Department of Mathematics) and Dr Stephen Brickley (Department of Life Sciences). They will work on “Developing machine learning approaches to reveal changes in whole-brain connectivity during ageing and neurodegeneration”. In the words of the authors: “The Centre for Neurotechnology at Imperial College London has invested in a state-of-the-art 3D imaging system. However, the tools for analysing this rich source of information are far from established”. This project will use machine learning to develop statistical algorithms that will automatically identify and trace the connections revealed by these 3D images and help determine age-related changes in the underlying brain connectivity.


Professor Yi-Ke Guo, Director of the Data Science Institute, said: “The Data Science Institute is at the forefront in the search for solutions to some of the greatest challenges facing our society today. We are very excited to support our Fellows by contributing to these cutting-edge projects in the area of Machine Learning through the initial seed funding. The high quality of the proposals and their potential impact are proof of the excellence of our research in this area”.


Anna Cupani

Faculty of Engineering
