You can also access our individual websites (via the Members page) for further information about our research and lists of our publications.

Search results

  • Conference paper
    Beaney T, Clarke J, Barahona M, Majeed A et al., 2020,

    A primary care network analysis: natural communities of general practices in London

    , Publisher: Royal College of General Practitioners, ISSN: 0960-1643

    BACKGROUND: Primary care networks (PCNs) are a new organisational hierarchy introduced in the NHS Long Term Plan with wide-ranging responsibilities. The vision is that they represent 'natural' communities of general practices with boundaries that make sense to practices, other healthcare providers, and local communities. AIM: Our study aims to identify natural communities of general practices based on patient registration patterns, using network analysis methods and unsupervised clustering to create catchments for these communities. METHOD: Patients resident in and attending GP practices in London were identified from Hospital Episode Statistics from 2017 to 2018. We used a series of novel methods for unsupervised graph clustering. A cosine similarity matrix was constructed representing similarities between each general practice to each other, based on registration of patients in each Lower Super Output Area (LSOA). Unsupervised graph partitioning using Markov Multiscale Community Detection was conducted to identify communities of general practices. Catchments were assigned to each PCN based on the majority attendance from an LSOA. RESULTS: In total 3 428 322 unique patients attended 1334 GPs in general practices LSOAs in London. The model grouped 1291 general practices (96.8%) and 4721 LSOAs (97.6%), into 165 mutually exclusive PCNs. The median PCN list size was 53 490 and a median of 70.1% of patients attended a general practice within their allocated PCN, ranging from 44.6% to 91.4%. CONCLUSION: With PCNs expected to take a role in population health management and with community providers expected to reconfigure around them, it is vital we recognise how PCNs represent their communities. This method may be used by policymakers to understand the populations and geography shared between networks.

  • Journal article
    Heaton LLM, Jones NS, Fricker MD, 2020,

    A mechanistic explanation of the transition to simple multicellularity in fungi.

    , Nature Communications, Vol: 11, ISSN: 2041-1723

    Development of multicellularity was one of the major transitions in evolution and occurred independently multiple times in algae, plants, animals, and fungi. However recent comparative genome analyses suggest that fungi followed a different route to other eukaryotic lineages. To understand the driving forces behind the transition from unicellular fungi to hyphal forms of growth, we develop a comparative model of osmotrophic resource acquisition. This predicts that whenever the local resource is immobile, hard-to-digest, and nutrient poor, hyphal osmotrophs outcompete motile or autolytic unicellular osmotrophs. This hyphal advantage arises because transporting nutrients via a contiguous cytoplasm enables continued exploitation of remaining resources after local depletion of essential nutrients, and more efficient use of costly exoenzymes. The model provides a mechanistic explanation for the origins of multicellular hyphal organisms, and explains why fungi, rather than unicellular bacteria, evolved to dominate decay of recalcitrant, nutrient poor substrates such as leaf litter or wood.

  • Journal article
    Gosztolai A, Barahona M, 2020,

    Cellular memory enhances bacterial chemotactic navigation in rugged environments

    , Communications Physics, Vol: 3, ISSN: 2399-3650

    The response of microbes to external signals is mediated by biochemical networks with intrinsic time scales. These time scales give rise to a memory that impacts cellular behaviour. Here we study theoretically the role of cellular memory in Escherichia coli chemotaxis. Using an agent-based model, we show that cells with memory navigating rugged chemoattractant landscapes can enhance their drift speed by extracting information from environmental correlations. Maximal advantage is achieved when the memory is comparable to the time scale of fluctuations as perceived during swimming. We derive an analytical approximation for the drift velocity in rugged landscapes that explains the enhanced velocity, and recovers standard Keller–Segel gradient-sensing results in the limits when memory and fluctuation time scales are well separated. Our numerics also show that cellular memory can induce bet-hedging at the population level resulting in long-lived, multi-modal distributions in heterogeneous landscapes.

  • Journal article
    Peach RL, Arnaudon A, Barahona M, 2020,

    Semi-supervised classification on graphs using explicit diffusion dynamics

    , Foundations of Data Science, Vol: 2, Pages: 19-33, ISSN: 2639-8001

    Classification tasks based on feature vectors can be significantly improved by including within deep learning a graph that summarises pairwise relationships between the samples. Intuitively, the graph acts as a conduit to channel and bias the inference of class labels. Here, we study classification methods that consider the graph as the originator of an explicit graph diffusion. We show that appending graph diffusion to feature-based learning as a posteriori refinement achieves state-of-the-art classification accuracy. This method, which we call Graph Diffusion Reclassification (GDR), uses overshooting events of a diffusive graph dynamics to reclassify individual nodes. The method uses intrinsic measures of node influence, which are distinct for each node, and allows the evaluation of the relationship and importance of features and graph for classification. We also present diff-GCN, a simple extension of Graph Convolutional Neural Network (GCN) architectures that leverages explicit diffusion dynamics, and allows the natural use of directed graphs. To showcase our methods, we use benchmark datasets of documents with associated citation data.

  • Journal article
    Tang W, Bertaux F, Thomas P, Stefanelli C, Saint M, Marguerat S, Shahrezaei V et al., 2020,

    bayNorm: Bayesian gene expression recovery, imputation and normalisation for single cell RNA-sequencing data

    , Bioinformatics, Vol: 36, Pages: 1174-1181, ISSN: 1367-4803

    Motivation: Normalisation of single cell RNA sequencing (scRNA-seq) data is a prerequisite to their interpretation. The marked technical variability, high amounts of missing observations and batch effect typical of scRNA-seq datasets make this task particularly challenging. There is a need for an efficient and unified approach for normalisation, imputation and batch effect correction. Results: Here, we introduce bayNorm, a novel Bayesian approach for scaling and inference of scRNA-seq counts. The method’s likelihood function follows a binomial model of mRNA capture, while priors are estimated from expression values across cells using an empirical Bayes approach. We first validate our assumptions by showing this model can reproduce different statistics observed in real scRNA-seq data. We demonstrate using publicly-available scRNA-seq datasets and simulated expression data that bayNorm allows robust imputation of missing values generating realistic transcript distributions that match single molecule FISH measurements. Moreover, by using priors informed by dataset structures, bayNorm improves accuracy and sensitivity of differential expression analysis and reduces batch effect compared to other existing methods. Altogether, bayNorm provides an efficient, integrated solution for global scaling normalisation, imputation and true count recovery of gene expression measurements from scRNA-seq data. Availability: The R package “bayNorm” is available at […]. The code for analysing data in this paper is available at […]. Supplementary information: Supplementary data are available at Bioinformatics online.

  • Journal article
    Hoffmann T, Peel L, Lambiotte R, Jones N et al., 2020,

    Community detection in networks without observing edges

    , Science Advances, Vol: 6, ISSN: 2375-2548

    We develop a Bayesian hierarchical model to identify communities of time series. Fitting the model provides an end-to-end community detection algorithm that does not extract information as a sequence of point estimates but propagates uncertainties from the raw data to the community labels. Our approach naturally supports multiscale community detection as well as the selection of an optimal scale using model comparison. We study the properties of the algorithm using synthetic data and apply it to daily returns of constituents of the S&P100 index as well as climate data from US cities.

  • Journal article
    Greenbury SF, Barahona M, Johnston IG, 2020,

    HyperTraPS: Inferring Probabilistic Patterns of Trait Acquisition in Evolutionary and Disease Progression Pathways.

    , Cell Syst, Vol: 10, Pages: 39-51.e10

    The explosion of data throughout the biomedical sciences provides unprecedented opportunities to learn about the dynamics of evolution and disease progression, but harnessing these large and diverse datasets remains challenging. Here, we describe a highly generalizable statistical platform to infer the dynamic pathways by which many, potentially interacting, traits are acquired or lost over time. We use HyperTraPS (hypercubic transition path sampling) to efficiently learn progression pathways from cross-sectional, longitudinal, or phylogenetically linked data, readily distinguishing multiple competing pathways, and identifying the most parsimonious mechanisms underlying given observations. This Bayesian approach allows inclusion of prior knowledge, quantifies uncertainty in pathway structure, and allows predictions, such as which symptom a patient will acquire next. We provide visualization tools for intuitive assessment of multiple, variable pathways. We apply the method to ovarian cancer progression and the evolution of multidrug resistance in tuberculosis, demonstrating its power to reveal previously undetected dynamic pathways.

  • Journal article
    Maes A, Barahona M, Clopath C, 2020,

    Learning spatiotemporal signals using a recurrent spiking network that discretizes time

    , PLoS Computational Biology, Vol: 16, Pages: 1-26, ISSN: 1553-734X

    Learning to produce spatiotemporal sequences is a common task that the brain has to solve. The same neural substrate may be used by the brain to produce different sequential behaviours. The way the brain learns and encodes such tasks remains unknown as current computational models do not typically use realistic biologically-plausible learning. Here, we propose a model where a spiking recurrent network of excitatory and inhibitory biophysical neurons drives a read-out layer: the dynamics of the driver recurrent network is trained to encode time which is then mapped through the read-out neurons to encode another dimension, such as space or a phase. Different spatiotemporal patterns can be learned and encoded through the synaptic weights to the read-out neurons that follow common Hebbian learning rules. We demonstrate that the model is able to learn spatiotemporal dynamics on time scales that are behaviourally relevant and we show that the learned sequences are robustly replayed during a regime of spontaneous activity.

  • Journal article
    Liu Z, Barahona M, 2020,

    Graph-based data clustering via multiscale community detection

    , Applied Network Science, Vol: 5, Pages: 1-20, ISSN: 2364-8228

    We present a graph-theoretical approach to data clustering, which combines the creation of a graph from the data with Markov Stability, a multiscale community detection framework. We show how the multiscale capabilities of the method allow the estimation of the number of clusters, as well as alleviating the sensitivity to the parameters in graph construction. We use both synthetic and benchmark real datasets to compare and evaluate several graph construction methods and clustering algorithms, and show that multiscale graph-based clustering achieves improved performance compared to popular clustering methods without the need to set externally the number of clusters.

  • Journal article
    Tonn MK, Thomas P, Barahona M, Oyarzún DA et al., 2020,

    Computation of Single-Cell Metabolite Distributions Using Mixture Models.

    , Front Cell Dev Biol, Vol: 8, ISSN: 2296-634X

    Metabolic heterogeneity is widely recognized as the next challenge in our understanding of non-genetic variation. A growing body of evidence suggests that metabolic heterogeneity may result from the inherent stochasticity of intracellular events. However, metabolism has been traditionally viewed as a purely deterministic process, on the basis that highly abundant metabolites tend to filter out stochastic phenomena. Here we bridge this gap with a general method for prediction of metabolite distributions across single cells. By exploiting the separation of time scales between enzyme expression and enzyme kinetics, our method produces estimates for metabolite distributions without the lengthy stochastic simulations that would be typically required for large metabolic models. The metabolite distributions take the form of Gaussian mixture models that are directly computable from single-cell expression data and standard deterministic models for metabolic pathways. The proposed mixture models provide a systematic method to predict the impact of biochemical parameters on metabolite distributions. Our method lays the groundwork for identifying the molecular processes that shape metabolic heterogeneity and its functional implications in disease.

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
