ProfessorMauricioBarahona

Faculty of Natural Sciences, Department of Mathematics

Director of Research, Chair in Biomathematics

Contact

m.barahona Website

Location

6M31Huxley BuildingSouth Kensington Campus

Summary

Publications

Yu YW, Delvenne J-C, Yaliraki SN, Barahona Met al., 2020, Severability of mesoscale components and local time scales in dynamical networks

A major goal of dynamical systems theory is the search for simplifieddescriptions of the dynamics of a large number of interacting states. Foroverwhelmingly complex dynamical systems, the derivation of a reduceddescription on the entire dynamics at once is computationally infeasible. Othercomplex systems are so expansive that despite the continual onslaught of newdata only partial information is available. To address this challenge, wedefine and optimise for a local quality function severability for measuring thedynamical coherency of a set of states over time. The theoretical underpinningsof severability lie in our local adaptation of the Simon-Ando-Fisher time-scaleseparation theorem, which formalises the intuition of local wells in the Markovlandscape of a dynamical process, or the separation between a microscopic and amacroscopic dynamics. Finally, we demonstrate the practical relevance ofseverability by applying it to examples drawn from power networks, imagesegmentation, social networks, metabolic networks, and word association.

Journal article

Beaney T, Clarke J, Barahona M, Majeed Aet al., 2020, A primary care network analysis: natural communities of general practices in London, Publisher: Royal College of General Practitioners, ISSN: 0960-1643

BACKGROUND: Primary care networks (PCNs) are a new organisational hierarchy introduced in the NHS Long Term Plan with wide-ranging responsibilities. The vision is that they represent 'natural' communities of general practices with boundaries that make sense to practices, other healthcare providers, and local communities. AIM: Our study aims to identify natural communities of general practices based on patient registration patterns, using network analysis methods and unsupervised clustering to create catchments for these communities. METHOD: Patients resident in and attending GP practices in London were identified from Hospital Episode Statistics from 2017 to 2018. We used a series of novel methods for unsupervised graph clustering. A cosine similarity matrix was constructed representing similarities between each general practice to each other, based on registration of patients in each Lower Super Output Area (LSOA). Unsupervised graph partitioning using Markov Multiscale Community Detection was conducted to identify communities of general practices. Catchments were assigned to each PCN based on the majority attendance from an LSOA. RESULTS: In total 3 428 322 unique patients attended 1334 GPs in general practices LSOAs in London. The model grouped 1291 general practices (96.8%) and 4721 LSOAs (97.6%), into 165 mutually exclusive PCNs. The median PCN list size was 53 490 and a median of 70.1% of patients attended a general practice within their allocated PCN, ranging from 44.6% to 91.4%. CONCLUSION: With PCNs expected to take a role in population health management and with community providers expected to reconfigure around them, it is vital we recognise how PCNs represent their communities. This method may be used by policymakers to understand the populations and geography shared between networks.

Conference paper

Gosztolai A, Barahona M, 2020, Cellular memory enhances bacterial chemotactic navigation in rugged environments, Communications Physics, Vol: 3, ISSN: 2399-3650

The response of microbes to external signals is mediated by biochemical networks with intrinsic time scales. These time scales give rise to a memory that impacts cellular behaviour. Here we study theoretically the role of cellular memory in Escherichia coli chemotaxis. Using an agent-based model, we show that cells with memory navigating rugged chemoattractant landscapes can enhance their drift speed by extracting information from environmental correlations. Maximal advantage is achieved when the memory is comparable to the time scale of fluctuations as perceived during swimming. We derive an analytical approximation for the drift velocity in rugged landscapes that explains the enhanced velocity, and recovers standard Keller–Segel gradient-sensing results in the limits when memory and fluctuation time scales are well separated. Our numerics also show that cellular memory can induce bet-hedging at the population level resulting in long-lived, multi-modal distributions in heterogeneous landscapes.

Journal article

Peach RL, Arnaudon A, Barahona M, 2020, SEMI-SUPERVISED CLASSIFICATION ON GRAPHS USING EXPLICIT DIFFUSION DYNAMICS, FOUNDATIONS OF DATA SCIENCE, Vol: 2, Pages: 19-33

Journal article

Laumann F, von Kuegelgen J, Barahona M, 2020, Non-linear interlinkages and key objectives amongst the Paris Agreement and the Sustainable Development Goals, ICLR 2020 - Workshop on Tackling Climate Change with Machine Learning

The United Nations' ambitions to combat climate change and prosper humandevelopment are manifested in the Paris Agreement and the SustainableDevelopment Goals (SDGs), respectively. These are inherently inter-linked asprogress towards some of these objectives may accelerate or hinder progresstowards others. We investigate how these two agendas influence each other bydefining networks of 18 nodes, consisting of the 17 SDGs and climate change,for various groupings of countries. We compute a non-linear measure ofconditional dependence, the partial distance correlation, given any subset ofthe remaining 16 variables. These correlations are treated as weights on edges,and weighted eigenvector centralities are calculated to determine the mostimportant nodes. We find that SDG 6, clean water and sanitation, and SDG 4,quality education, are most central across nearly all groupings of countries.In developing regions, SDG 17, partnerships for the goals, is stronglyconnected to the progress of other objectives in the two agendas whilst,somewhat surprisingly, SDG 8, decent work and economic growth, is not asimportant in terms of eigenvector centrality.

Conference paper

Greenbury S, Barahona M, Johnston I, 2020, HyperTraPS: Inferring probabilistic patterns of trait acquisition in evolutionary and disease progression pathways, Cell Systems, Vol: 10, Pages: 39-51, ISSN: 2405-4712

The explosion of data throughout the biomedical sciences provides unprecedented opportunities to learn about the dynamics of evolution and disease progression, but harnessing these large and diverse datasets remains challenging. Here, we describe a highly generalisable statistical platform to infer the dynamic pathways by which many, potentially interacting, discrete traits are acquired or lost over time in biomedical systems. The platform uses HyperTraPS (hypercubic transition path sampling) to learn progression pathways from cross-sectional, longitudinal, or phylogenetically-linked data with unprecedented efficiency, readily distinguishing multiple competing pathways, and identifying the most parsimonious mechanisms underlying given observations. Its Bayesian structure quantifies uncertainty in pathway structure and allows interpretable predictions of behaviours, such as which symptom a patient will acquire next. We exploit the model’s topology to provide visualisation tools for intuitive assessment of multiple, variable pathways. We apply the method to ovarian cancer progression and the evolution of multidrug resistance in tuberculosis, demonstrating its power to reveal previously undetected dynamic pathways.

Journal article

Liu Z, Barahona M, 2020, Graph-based data clustering via multiscale community detection, Applied Network Science, Vol: 5, Pages: 1-20, ISSN: 2364-8228

We present a graph-theoretical approach to data clustering, which combines the creation of a graph from the data with Markov Stability, a multiscale community detection framework. We show how the multiscale capabilities of the method allow the estimation of the number of clusters, as well as alleviating the sensitivity to the parameters in graph construction. We use both synthetic and benchmark real datasets to compare and evaluate several graph construction methods and clustering algorithms, and show that multiscale graph-based clustering achieves improved performance compared to popular clustering methods without the need to set externally the number of clusters.

Journal article

Tonn MK, Thomas P, Barahona M, Oyarzún DAet al., 2020, Computation of Single-Cell Metabolite Distributions Using Mixture Models., Front Cell Dev Biol, Vol: 8, ISSN: 2296-634X

Metabolic heterogeneity is widely recognized as the next challenge in our understanding of non-genetic variation. A growing body of evidence suggests that metabolic heterogeneity may result from the inherent stochasticity of intracellular events. However, metabolism has been traditionally viewed as a purely deterministic process, on the basis that highly abundant metabolites tend to filter out stochastic phenomena. Here we bridge this gap with a general method for prediction of metabolite distributions across single cells. By exploiting the separation of time scales between enzyme expression and enzyme kinetics, our method produces estimates for metabolite distributions without the lengthy stochastic simulations that would be typically required for large metabolic models. The metabolite distributions take the form of Gaussian mixture models that are directly computable from single-cell expression data and standard deterministic models for metabolic pathways. The proposed mixture models provide a systematic method to predict the impact of biochemical parameters on metabolite distributions. Our method lays the groundwork for identifying the molecular processes that shape metabolic heterogeneity and its functional implications in disease.

Journal article

Hodges M, Yaliraki SN, Barahona M, 2019, Edge-based formulation of elastic network models, Physical Review Research, Pages: 033211-033211

We present an edge-based framework for the study of geometric elastic networkmodels to model mechanical interactions in physical systems. We use aformulation in the edge space, instead of the usual node-centric approach, tocharacterise edge fluctuations of geometric networks defined in d- dimensionalspace and define the edge mechanical embeddedness, an edge mechanicalsusceptibility measuring the force felt on each edge given a force applied onthe whole system. We further show that this formulation can be directly relatedto the infinitesimal rigidity of the network, which additionally permits three-and four-centre forces to be included in the network description. We exemplifythe approach in protein systems, at both the residue and atomistic levels ofdescription.

Journal article

Schaub MT, Delvenne J-C, Lambiotte R, Barahona Met al., 2019, Structured networks and coarse-grained descriptions: a dynamical perspective, Advances in Network Clustering and Blockmodeling, Editors: Doreian, Batagelj, Ferligoj, Publisher: John Wiley and Sons, Ltd, Pages: 333-361, ISBN: 9781119224709

This chapter discusses the interplay between structure and dynamics in complex networks. Given a particular network with an endowed dynamics, our goal is to find partitions aligned with the dynamical process acting on top of the network. We thus aim to gain a reduced description of the system that takes into account both its structure and dynamics. In the first part, we introduce the general mathematical setup for the types of dynamics we consider throughout the chapter. We provide two guiding examples, namely consensus dynamics and diffusion processes (random walks), motivating their connection to social network analysis, and provide a brief discussion on the general dynamical framework and its possible extensions. In the second part, we focus on the influence of graph structure on the dynamics taking place on the network, focusing on three concepts that allow us to gain insight into this notion. First, we describe how time scale separation can appear in the dynamics on a network as a consequence of graph structure. Second, we discuss how the presence of particular symmetries in the network give rise to invariant dynamical subspaces that can be precisely described by graph partitions. Third, we show how this dynamical viewpoint can be extended to study dynamics on networks with signed edges, which allow us to discuss connections to concepts in social network analysis, such as structural balance. In the third part, we discuss how to use dynamical processes unfolding on the network to detect meaningful network substructures. We then show how such dynamical measures can be related to seemingly different algorithm for community detection and coarse-graining proposed in the literature. We conclude with a brief summary and highlight interesting open future directions.

Book chapter

Peach RL, Saman D, Yaliraki SN, Klug DR, Ying L, Willison KR, Barahona Met al., 2019, Unsupervised Graph-Based Learning Predicts Mutations That Alter Protein Dynamics

<jats:title>A<jats:sc>bstract</jats:sc></jats:title><jats:p>Proteins exhibit complex dynamics across a vast range of time and length scales, from the atomistic to the conformational. Adenylate kinase (ADK) showcases the biological relevance of such inherently coupled dynamics across scales: single mutations can affect large-scale protein motions and enzymatic activity. Here we present a combined computational and experimental study of multiscale structure and dynamics in proteins, using ADK as our system of choice. We show how a computationally efficient method for unsupervised graph partitioning can be applied to atomistic graphs derived from protein structures to reveal intrinsic, biochemically relevant substructures at all scales, without re-parameterisation or<jats:italic>a priori</jats:italic>coarse-graining. We subsequently perform full alanine and arginine<jats:italic>in silico</jats:italic>mutagenesis scans of the protein, and score all mutations according to the disruption they induce on the large-scale organisation. We use our calculations to guide Förster Resonance Energy Transfer (FRET) experiments on ADK, and show that mutating residue D152 to alanine or residue V164 to arginine induce a large dynamical shift of the protein structure towards a closed state, in accordance with our predictions. Our computations also predict a graded effect of different mutations at the D152 site as a result of increased coherence between the core and binding domains, an effect confirmed quantitatively through a high correlation (<jats:italic>R</jats:italic><jats:sup>2</jats:sup>= 0.93) with the FRET ratio between closed and open populations measured on six mutants.</jats:p>

Journal article

Saavedra-Garcia P, Al-Sadah HA, Penfold L, Xiong X, Lopez-Jimenez E, Parzych K, Caputo VS, Blighe K, Kaiser MF, Piazza P, Encheva V, Snijders AP, Keun HC, Oyarzun D, Thiel D, Liu Z, Barahona M, Auner HWet al., 2019, Integrated Systems Level Examination of Proteasome Inhibitor Stress Recovery in Myeloma Cells Reveals Druggable Vulnerabilities Linked to Multiple Metabolic Processes, 61st Annual Meeting and Exposition of the American-Society-of-Hematology (ASH), Publisher: AMER SOC HEMATOLOGY, ISSN: 0006-4971

Conference paper

Peach R, Yaliraki S, Lefevre D, Barahona Met al., 2019, Data-driven unsupervised clustering of online learner behaviour , npj Science of Learning, Vol: 4, ISSN: 2056-7936

The widespread adoption of online courses opens opportunities for analysing learner behaviour and optimising web-based learning adapted to observed usage. Here we introduce a mathematical framework for the analysis of time series of online learner engagement, which allows the identification of clusters of learners with similar online temporal behaviour directly from the raw data without prescribing a priori subjective reference behaviours. The method uses a dynamic time warping kernel to create a pairwise similarity between time series of learner actions, and combines it with an unsupervised multiscale graph clustering algorithm to identify groups of learners with similar temporal behaviour. To showcase our approach, we analyse task completion data from a cohort of learners taking an online post-graduate degree at Imperial Business School. Our analysis reveals clusters of learners with statistically distinct patterns of engagement, from distributed to massed learning, with different levels of regularity, adherence to pre-planned course structure and task completion. The approach also reveals outlier learners with highly sporadic behaviour. A posteriori comparison against student performance shows that, whereas high performing learners are spread across clusters with diverse temporal engagement, low performers are located significantly in the massed learning cluster, and our unsupervised clustering identifies low performers more accurately than common machine learning classification methods trained on temporal statistics of the data. Finally, we test the applicability of the method by analysing two additional datasets: a different cohort of the same course, and time series of different format from another university.

Journal article

Altuncu MT, Sorin E, Symons JD, Mayer E, Yaliraki SN, Toni F, Barahona Met al., 2019, Extracting information from free text through unsupervised graph-based clustering: an application to patient incident records

The large volume of text in electronic healthcare records often remainsunderused due to a lack of methodologies to extract interpretable content. Herewe present an unsupervised framework for the analysis of free text thatcombines text-embedding with paragraph vectors and graph-theoretical multiscalecommunity detection. We analyse text from a corpus of patient incident reportsfrom the National Health Service in England to find content-based clusters ofreports in an unsupervised manner and at different levels of resolution. Ourunsupervised method extracts groups with high intrinsic textual consistency andcompares well against categories hand-coded by healthcare personnel. We alsoshow how to use our content-driven clusters to improve the supervisedprediction of the degree of harm of the incident based on the text of thereport. Finally, we discuss future directions to monitor reports over time, andto detect emerging trends outside pre-existing categories.

Book chapter

Kuntz Nussio J, Thomas P, Stan GB, Barahona Met al., 2019, Bounding the stationary distributions of the chemical master equation via mathematical programming, Journal of Chemical Physics, Vol: 151, ISSN: 0021-9606

The stochastic dynamics of biochemical networks are usually modelled with the chemical master equation (CME). The stationary distributions of CMEs are seldom solvable analytically, and numerical methods typically produce estimates with uncontrolled errors. Here, we introduce mathematical programming approaches that yield approximations of these distributions with computable error bounds which enable the verification of their accuracy. First, we use semidefinite programming to compute increasingly tighter upper and lower bounds on the moments of the stationary distributions for networks with rational propensities. Second, we use these moment bounds to formulate linear programs that yield convergent upper and lower bounds on the stationary distributions themselves, their marginals and stationary averages. The bounds obtained also provide a computational test for the uniqueness of the distribution. In the unique case, the bounds form an approximation of the stationary distribution with a computable bound on its error. In the non unique case, our approach yields converging approximations of the ergodic distributions. We illustrate our methodology through several biochemical examples taken from the literature: Schl¨ogl’s model for a chemical bifurcation, a two-dimensional toggle switch, a model for bursty gene expression, and a dimerisation model with multiple stationary distributions.

Journal article

Johnston I, Hoffmann T, Greenbury S, Cominetti O, Jallow M, Kwiatkowski D, Barahona M, Jones N, Casals-Pascual Cet al., 2019, Precision identification of high-risk phenotypes and progression pathways in severe malaria without requiring longitudinal data, npj Digital Medicine, Vol: 2, ISSN: 2398-6352

More than 400,000 deaths from severe malaria (SM) are reported every year, mainly in African children. The diversity of clinical presentations associated with SM indicates important differences in disease pathogenesis that require specific treatment, and this clinical heterogeneity of SM remains poorly understood. Here, we apply tools from machine learning and model-based inference to harness large-scale data and dissect the heterogeneity in patterns of clinical features associated with SM in 2904 Gambian children admitted to hospital with malaria. This quantitative analysis reveals features predicting the severity of individual patient outcomes, and the dynamic pathways of SM progression, notably inferred without requiring longitudinal observations. Bayesian inference of these pathways allows us assign quantitative mortality risks to individual patients. By independently surveying expert practitioners, we show that this data-driven approach agrees with and expands the current state of knowledge on malaria progression, while simultaneously providing a data-supported framework for predicting clinical risk.

Journal article

Maes A, Barahona M, Clopath C, 2019, Learning spatiotemporal signals using a recurrent spiking network that discretizes time, PLOS Computational Biology, Vol: 16, Pages: e1007606-e1007606

<jats:title>Abstract</jats:title><jats:p>Learning to produce spatiotemporal sequences is a common task the brain has to solve. The same neural substrate may be used by the brain to produce different sequential behaviours. The way the brain learns and encodes such tasks remains unknown as current computational models do not typically use realistic biologically-plausible learning. Here, we propose a model where a spiking recurrent network of excitatory and inhibitory biophysical neurons drives a read-out layer: the dynamics of the recurrent network is constrained to encode time while the read-out neurons encode space. Space is then linked with time through plastic synapses that follow common Hebbian learning rules. We demonstrate that the model is able to learn spatiotemporal dynamics on a timescale that is behaviourally relevant. Learned sequences are robustly replayed during a regime of spontaneous activity.</jats:p><jats:sec><jats:title>Author summary</jats:title><jats:p>The brain has the ability to learn flexible behaviours on a wide range of time scales. Previous studies have successfully build spiking network models that learn a variety of computational tasks. However, often the learning involved is not local. Here, we investigate a model using biological-plausible plasticity rules for a specific computational task: spatiotemporal sequence learning. The architecture separates time and space into two different parts and this allows learning to bind space to time. Importantly, the time component is encoded into a recurrent network which exhibits sequential dynamics on a behavioural time scale. This network is then used as an engine to drive spatial read-out neurons. We demonstrate that the model can learn complicated spatiotemporal spiking dynamics, such as the song of a bird, and replay the song robustly.</jats:p></jats:sec>

Journal article

Prischi F, Chrysostomou S, Roy R, Chapman K, Mufti U, Peach R, Ding L, Mauri F, Bellezza G, Cagini L, Barbareschi M, Ferrero S, Abrahams J, Ottaviani S, Castellano L, Giamas G, Pascoe J, Moonamale D, Billingham L, Cullen M, Hrouda D, Winkler M, Klug D, Yaliraki S, Barahona M, Wang Y, Ali M, Seckl M, Pardo Oet al., 2019, Targeting RSK4 prevents both chemoresistance and metastasis in lung and bladder cancer, FEBS Open Bio, Publisher: WILEY, Pages: 330-330, ISSN: 2211-5463

Conference paper

Chrysostomou S, Roy R, Prischi F, Chapman K, Mufti U, Mauri F, Bellezza G, Abrahams J, Ottaviani S, Castellano L, Giamas G, Hrouda D, Winkler M, Klug D, Yaliraki S, Barahona M, Wang Y, Ali M, Seckl M, Pardo Oet al., 2019, Abstract 1775: Targeting RSK4 prevents both chemoresistance and metastasis in lung cancer, AACR Annual Meeting on Bioinformatics, Convergence Science, and Systems Biology, Publisher: American Association for Cancer Research, Pages: 1-2, ISSN: 0008-5472

Lung cancer is the commonest cause of cancer death worldwide with a five-year survival rate of less than five percent for metastatic tumors. Non-small cell lung cancer (NSCLC) accounts for 80% of lung cancer cases of which adenocarcinoma prevails. Patients almost invariably develop metastatic drug-resistant disease and this is responsible for our failure to provide curative therapy. Hence, a better understanding of the mechanisms underlying these biological processes is urgently required to improve clinical outcome.The 90-kDa ribosomal S6 kinases (RSKs) are downstream effectors of the RAS/MAPK cascade. RSKs are highly conserved serine/threonine protein kinases implicated in diverse cellular processes, including cell survival, proliferation, migration and invasion. Four isoforms exist in humans (RSK1-4) and are uniquely characterized by the presence of two non-identical N- and C-terminal kinase domains. RSK isoforms are 73-80% identical at protein level and this has been thought to suggest overlapping functions.However, through functional genomic kinome screens, we show that RSK4, contrary to RSK1, promotes both drug resistance and metastasis in lung cancer. This kinase is overexpressed in the majority (57%) of NSCLC biopsies and this correlates with poor overall survival in lung adenocarcinoma patients. Genetic silencing of RSK4 sensitizes lung cancer cells to chemotherapy and prevents their migration and invasiveness in vitro and in vivo. RSK4 downregulation decreases the anti-apoptotic proteins Bcl2 and cIAP1/2 which correlates with increased apoptotic signalling, whilst it also induces mesenchymal-epithelial transition (MET) through inhibition of NFκB activity. A small-molecule inhibitor screen identified several floxacins, including trovafloxacin, as potent allosteric inhibitors of RSK4 activation. Trovafloxacin reproduced all biological and molecular effects of RSK4 silencing in vitro and in vivo, and is predicted to bind a novel allosteric site revealed

Conference paper

Schaub MT, Delvenne JC, Lambiotte R, Barahona Met al., 2019, Multiscale dynamical embeddings of complex networks, Physical Review E, Vol: 99, Pages: 062308-1-062308-18, ISSN: 1539-3755

Complex systems and relational data are often abstracted as dynamical processes on networks. To understand, predict, and control their behavior, a crucial step is to extract reduced descriptions of such networks. Inspired by notions from control theory, we propose a time-dependent dynamical similarity measure between nodes, which quantifies the effect a node-input has on the network. This dynamical similarity induces an embedding that can be employed for several analysis tasks. Here we focus on (i) dimensionality reduction, i.e., projecting nodes onto a low-dimensional space that captures dynamic similarity at different timescales, and (ii) how to exploit our embeddings to uncover functional modules. We exemplify our ideas through case studies focusing on directed networks without strong connectivity and signed networks. We further highlight how certain ideas from community detection can be generalized and linked to control theory, by using the here developed dynamical perspective.

Journal article

Warren L, Clarke J, Arora S, Barahona M, Arebi N, Darzi Aet al., 2019, Transitions of care across hospital settings in patients with inflammatory bowel disease, World Journal of Gastroenterology, Vol: 25, Pages: 2122-2132, ISSN: 1007-9327

BACKGROUNDInflammatory bowel disease (IBD) is a chronic, inflammatory disorder characterised by both intestinal and extra-intestinal pathology. Patients may receive both emergency and elective care from several providers, often in different hospital settings. Poorly managed transitions of care between providers can lead to inefficiencies in care and patient safety issues. To ensure that the sharing of patient information between providers is appropriate, timely, accurate and secure, effective data-sharing infrastructure needs to be developed. To optimise inter-hospital data-sharing for IBD patients, we need to better understand patterns of hospital encounters in this group.AIMTo determine the type and location of hospital services accessed by IBD patients in England.METHODSThis was a retrospective observational study using Hospital Episode Statistics, a large administrative patient data set from the National Health Service in England. Adult patients with a diagnosis of IBD following admission to hospital were followed over a 2-year period to determine the proportion of care accessed at the same hospital providing their outpatient IBD care, defined as their ‘home provider’. Secondary outcome measures included the geographic distribution of patient-sharing, regional and age-related differences in accessing services, and type and frequency of outpatient encounters.RESULTSOf 95055 patients accessed hospital services on 1760156 occasions over a 2-year follow-up period. The proportion of these encounters with their identified IBD ‘home provider’ was 73.3%, 87.8% and 83.1% for accident and emergency, inpatient and outpatient encounters respectively. Patients living in metropolitan centres and younger patients were less likely to attend their ‘home provider’ for hospital services. The most commonly attended specialty services were gastroenterology, general surgery and ophthalmology.CONCLUSIONTransitions of care between secondary care sett

Journal article

Clarke JM, Barahona M, Darzi AW, 2019, Defining Hospital Catchment Areas Using Multiscale Community Detection: A Case Study for Planned Orthopaedic Care in England

<jats:title>Abstract</jats:title><jats:sec><jats:title>Background</jats:title><jats:p>The English National Health Service 5-year Forward View emphasises the importance of integration of hospital and community services. Understanding the population a hospital serves is critical to formulating strategies for community engagement and determining their accountability for populations. Existing methods to define catchment areas are unable to adapt to dilute health care markets in urban areas where populations may interact with several different hospitals. Formulating catchment areas which permit the inclusion of more than one hospital based upon patient behaviour allows for collaboration between hospitals to reach out into the communities they collectively share.</jats:p></jats:sec><jats:sec><jats:title>Method</jats:title><jats:p>The proportion of presentations from all census Middle Super Output Areas (MSOAs) to every hospital trust providing orthopaedic care in England were calculated. The cosine similarity of all MSOAs to one another was computed from these proportions. Multiscale community detection was applied to planned orthopaedic surgical admissions in England from 1st April 2011 to 31st March 2015. Stable community configurations were identified and the proportion of patients presenting to hospitals located within the catchment area in which they resided was calculated. The performance of these catchment areas was compared to conventional methods for assigning mutually exclusive catchment areas.</jats:p></jats:sec><jats:sec><jats:title>Results</jats:title><jats:p>2,602,066 planned orthopaedic surgical admissions were identified for patients resident in 6,791 MSOAs in England attending 140 different hospital trusts. Markov multiscale community detection revealed five stable catchment area configurations consisting of 127, 51, 26, 15 and 11 catchment areas. B

Journal article

Kuntz J, Thomas P, Stan G-B, Barahona Met al., 2019, The exit time finite state projection scheme: bounding exit distributions and occupation measures of continuous-time Markov chains, SIAM Journal on Scientific Computing, Vol: 41, Pages: A748-A769, ISSN: 1064-8275

We introduce the exit time finite state projection (ETFSP) scheme, a truncation- based method that yields approximations to the exit distribution and occupation measure associated with the time of exit from a domain (i.e., the time of first passage to the complement of the domain) of time-homogeneous continuous-time Markov chains. We prove that: (i) the computed approximations bound the measures from below; (ii) the total variation distances between the approximations and the measures decrease monotonically as states are added to the truncation; and (iii) the scheme converges, in the sense that, as the truncation tends to the entire state space, the total variation distances tend to zero. Furthermore, we give a computable bound on the total variation distance between the exit distribution and its approximation, and we delineate the cases in which the bound is sharp. We also revisit the related finite state projection scheme and give a comprehensive account of its theoretical properties. We demonstrate the use of the ETFSP scheme by applying it to two biological examples: the computation of the first passage time associated with the expression of a gene, and the fixation times of competing species subject to demographic noise.

Journal article

Tonn M, Thomas P, Barahona M, Oyarzun Det al., 2019, Stochastic modelling reveals mechanisms of metabolic heterogeneity, Communications Biology, Vol: 2, ISSN: 2399-3642

Phenotypic variation is a hallmark of cellular physiology. Metabolic heterogeneity, in particular, underpins single-cell phenomena such as microbial drug tolerance and growth variability. Much research has focussed on transcriptomic and proteomic heterogeneity, yet it remains unclear if such variation permeates to the metabolic state of a cell. Here we propose a stochastic model to show that complex forms of metabolic heterogeneity emerge from fluctuations in enzyme expression and catalysis. The analysis predicts clonal populations to split into two or more metabolically distinct subpopulations. We reveal mechanisms not seen in deterministic models, in which enzymes with unimodal expression distributions lead to metabolites with a bimodal or multimodal distribution across the population. Based on published data, the results suggest that metabolite heterogeneity may be more pervasive than previously thought. Our work casts light on links between gene expression and metabolism, and provides a theory to probe the sources of metabolite heterogeneity.

Journal article

Altuncu MT, Mayer E, Yaliraki SN, Barahona Met al., 2019, From free text to clusters of content in health records: An unsupervised graph partitioning approach, Applied Network Science, Vol: 4, ISSN: 2364-8228

Electronic Healthcare records contain large volumes of unstructured data in different forms. Free text constitutes a large portion of such data, yet this source of richly detailed information often remains under-used in practice because of a lack of suitable methodologies to extract interpretable contentin a timely manner. Here we apply network-theoretical tools to the analysis of free text in Hospital Patient Incident reports in the English National Health Service, to find clusters of reports in an unsupervised manner and at different levels of resolution based directly on the free text descriptions contained within them. To do so, we combine recently developed deep neural network text-embedding methodologies based on paragraph vectors with multi-scale Markov Stability community detection applied to a similarity graph of documents obtained from sparsified text vector similarities. We showcase the approach with the analysis of incident reports submitted in Imperial College Healthcare NHS Trust, London. The multiscale community structure reveals levels of meaning with different resolution in the topics of the dataset, as shown by relevant descriptive terms extracted from thegroups of records, as well as by comparing a posteriori against hand-coded categories assigned by healthcare personnel. Our content communities exhibit good correspondence with well-defined hand-coded categories, yet our results also provide further medical detail in certain areas as well asrevealing complementary descriptors of incidents beyond the external classification. We also discuss how the method can be used to monitor reports over time and across different healthcare providers, and to detect emerging trends that fall outside of pre-existing categories.

Journal article

Gosztolai A, Carrillo JA, Barahona M, 2019, Collective search with finite perception: Transient dynamics and search efficiency, Frontiers in Physics, Vol: 6, ISSN: 2296-424X

Motile organisms often use finite spatial perception of their surroundings to navigate and search their habitats. Yet standard models of search are usually based on purely local sensory information. To model how a finite perceptual horizon affects ecological search, we propose a framework for optimal navigation that combines concepts from random walks and optimal control theory. We show that, while local strategies are optimal on asymptotically long and short search times, finite perception yields faster convergence and increased search efficiency over transient time scales relevant in biological systems. The benefit of the finite horizon can be maintained by the searchers tuning their response sensitivity to the length scale of the stimulant in the environment, and is enhanced when the agents interact as a result of increased consensus within subpopulations. Our framework sheds light on the role of spatial perception and transients in search movement and collective sensing of the environment.

Journal article

Clarke JM, Warren LR, Arora S, Barahona M, Darzi AWet al., 2018, Guiding interoperable electronic health records through patient-sharing networks., NPJ digital medicine, Vol: 1, Pages: 65-65, ISSN: 2398-6352

Effective sharing of clinical information between care providers is a critical component of a safe, efficient health system. National data-sharing systems may be costly, politically contentious and do not reflect local patterns of care delivery. This study examines hospital attendances in England from 2013 to 2015 to identify instances of patient sharing between hospitals. Of 19.6 million patients receiving care from 155 hospital care providers, 130 million presentations were identified. On 14.7 million occasions (12%), patients attended a different hospital to the one they attended on their previous interaction. A network of hospitals was constructed based on the frequency of patient sharing between hospitals which was partitioned using the Louvain algorithm into ten distinct data-sharing communities, improving the continuity of data sharing in such instances from 0 to 65-95%. Locally implemented data-sharing communities of hospitals may achieve effective accessibility of clinical information without a large-scale national interoperable information system.

Journal article

O'Clery N, Yuan Y, Stan G-B, Barahona Met al., 2018, Global Network Prediction from Local Node Dynamics

The study of dynamical systems on networks, describing complex interactiveprocesses, provides insight into how network structure affects globalbehaviour. Yet many methods for network dynamics fail to cope with large orpartially-known networks, a ubiquitous situation in real-world applications.Here we propose a localised method, applicable to a broad class of dynamicalmodels on networks, whereby individual nodes monitor and store the evolution oftheir own state and use these values to approximate, via a simple computation,their own steady state solution. Hence the nodes predict their own final statewithout actually reaching it. Furthermore, the localised formulation enablesnodes to compute global network metrics without knowledge of the full networkstructure. The method can be used to compute global rankings in the networkfrom local information; to detect community detection from fast, localtransient dynamics; and to identify key nodes that compute global networkmetrics ahead of others. We illustrate some of the applications of thealgorithm by efficiently performing web-page ranking for a large internetnetwork and identifying the dynamic roles of inter-neurons in the C. Elegansneural network. The mathematical formulation is simple, widely applicable andeasily scalable to real-world datasets suggesting how local computation canprovide an approach to the study of large-scale network dynamics.

Journal article

Beguerisse M, Bosque G, Oyarzun DA, Pico J, Barahona Met al., 2018, Flux-dependent graphs for metabolic networks, npj Systems Biology and Applications, Vol: 4, ISSN: 2056-7189

Cells adapt their metabolic fluxes in response to changes in the environment. We present a framework for the systematic construction of flux-based graphs derived from organism-wide metabolic networks. Our graphs encode the directionality of metabolic flows via edges that represent the flow of metabolites from source to target reactions. The methodology can be applied in the absence of a specific biological context by modelling fluxes probabilistically, or can be tailored to different environmental conditions by incorporating flux distributions computed through constraint-based approaches such as Flux Balance Analysis. We illustrate our approach on the central carbon metabolism of Escherichia coli and on a metabolic model of human hepatocytes. The flux-dependent graphs under various environmental conditions and genetic perturbations exhibit systemic changes in their topological and community structure, which capture the re-routing of metabolic flows and the varying importance of specific reactions and pathways. By integrating constraint-based models and tools from network science, our framework allows the study of context-specific metabolic responses at a system level beyond standard pathway descriptions.

Journal article

Hodges M, Barahona M, Yaliraki SN, 2018, Allostery and cooperativity in multimeric proteins: bond-to-bond propensities in ATCase, SCIENTIFIC REPORTS, Vol: 8, ISSN: 2045-2322

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.

Request URL: http://wlsprd.imperial.ac.uk:80/respub/WEB-INF/jsp/search-html.jsp Request URI: /respub/WEB-INF/jsp/search-html.jsp Query String: id=00333972&limit=30&person=true&page=3&respub-action=search.html