Work package 5

Lead Investigators: Prof Stephen Muggleton, Dr Samraat Pawar
Lead Postdoctoral Researcher: Dr Alireza Tamaddoni-Nezhad

The science of global warming depends on mathematical models that can accurately predict the effects of temperature increases on ecosystems. We aim to construct predictive models from data collected during the project (see WP 1-4) using a range of logic-based machine learning techniques and differential equation models. The techniques developed in this work package will also allow the integration of data across different work packages (from individual genes to entire ecosystems). For example, interaction networks will be learned from a mixture of mesocosm field experiments (WP1) and microcosm robot experiments in the laboratory (WP2), as well as gene expression data from across these datasets (WP4).

Machine learning of ecological networks is an important element of our modelling approach for predicting the effects of global warming. These networks will be constructed by augmenting predictive models based on the encoding of ecological and biological background knowledge, e.g. basic rules relating to metabolism and body size as well as models linking network properties to temperature. In these networks the probability of each link represents a degree of belief in the hypothesised link, which can be used to resolve complex networks into simpler forms. Validation of models will be conducted by active learning/experimentation, i.e. active selection of training and validation data to be tested using robotic and sequencing experiments in WP2 and WP4. This will initially verify hypothetical interactions and will then direct new data collection and experimentation.
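
As a rough illustration of how link probabilities can be used to resolve a complex network into a simpler one, the following minimal sketch keeps only those links whose degree of belief meets a chosen threshold. The species names, probabilities, threshold and the function name `prune_network` are all hypothetical and chosen for illustration; this is not the project's actual learning method.

```python
# Hypothetical link probabilities (degrees of belief), invented for illustration.
# Each key is a (consumer, resource) pair.
links = {
    ("bacterivore", "bacteria"): 0.92,
    ("predator", "bacterivore"): 0.75,
    ("predator", "bacteria"): 0.10,
    ("bacterivore", "algae"): 0.35,
}


def prune_network(links, threshold):
    """Keep only links whose degree of belief is at least `threshold`."""
    return {pair: p for pair, p in links.items() if p >= threshold}


# The threshold of 0.5 is an arbitrary assumption.
simplified = prune_network(links, threshold=0.5)
for (consumer, resource), p in sorted(simplified.items()):
    print(f"{consumer} -> {resource}: {p:.2f}")
```

Raising the threshold gives a sparser network containing only the best-supported links; lowering it retains more speculative links, which could then be candidates for experimental testing.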

At present we are adapting an existing method for the automatic construction of food webs from ecological census data so that it can learn interaction networks from microbial sequence data. We have used machine learning to construct different interaction networks associated with functional measures, and we have started to identify potentially important hub genera that have disproportionately strong influences on the system as a whole. These approaches will be refined iteratively throughout the project, as more data become available for testing, developing and improving our models.
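
One simple way to flag candidate hub genera in a learned interaction network is to sum the absolute interaction strengths attached to each genus and select those well above the average. The sketch below assumes this summed-strength measure; the genus names, weights, the `factor` cut-off and the function names are hypothetical, and the project may use a different criterion.

```python
from collections import defaultdict

# Hypothetical genera and interaction strengths, invented for illustration.
interactions = [
    ("GenusA", "GenusB", 0.8),
    ("GenusA", "GenusC", 0.7),
    ("GenusA", "GenusD", 0.9),
    ("GenusB", "GenusC", 0.2),
    ("GenusD", "GenusE", 0.1),
]


def node_strength(interactions):
    """Sum the absolute interaction strengths attached to each genus."""
    strength = defaultdict(float)
    for source, target, weight in interactions:
        strength[source] += abs(weight)
        strength[target] += abs(weight)
    return dict(strength)


def find_hubs(interactions, factor=1.5):
    """Return genera whose strength exceeds `factor` times the mean strength."""
    strength = node_strength(interactions)
    mean = sum(strength.values()) / len(strength)
    return sorted(name for name, s in strength.items() if s > factor * mean)


hubs = find_hubs(interactions)
print(hubs)
```

With these made-up weights only GenusA is flagged, because its total interaction strength is far above that of the other genera.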