Imperial College London

ProfessorMauricioBarahona

Faculty of Natural SciencesDepartment of Mathematics

Chair in Biomathematics
 
 
 
//

Contact

 

m.barahona Website

 
 
//

Location

 

6M31Huxley BuildingSouth Kensington Campus

//

Summary

 

Publications

Publication Type
Year
to

149 results found

Maes A, Barahona M, Clopath C, 2019, Learning spatiotemporal signals using a recurrent spiking network that discretizes time, PLoS Computational Biology, ISSN: 1553-734X

Learning to produce spatiotemporal sequences is a common task that the brain has to solve. The same neural substrate may be used by the brain to produce different sequential behaviours. The way the brain learns and encodes such tasks remains unknown as current computational models do not typically use realistic biologically-plausible learning. Here, we propose a model where a spiking recurrent network of excitatory and inhibitory biophysical neurons drives a read-out layer: the dynamics of the driver recurrent network is trained to encode time which is then mapped through the read-out neurons to encode another dimension, such as space or a phase. Different spatiotemporal patterns can be learned and encoded through the synaptic weights to the read-out neurons that follow common Hebbian learning rules. We demonstrate that the model is able to learn spatiotemporal dynamics on time scales that are behaviourally relevant and we show that the learned sequences are robustly replayed during a regime of spontaneous activity.

Journal article

Liu Z, Barahona M, 2019, Graph-based data clustering via multiscale community detection, Applied Network Science, ISSN: 2364-8228

We present a graph-theoretical approach to data clustering, which combinesthe creation of a graph from the data with Markov Stability, a multiscalecommunity detection framework. We show how the multiscale capabilities of themethod allow the estimation of the number of clusters, as well as alleviatingthe sensitivity to the parameters in graph construction. We use both syntheticand benchmark real datasets to compare and evaluate several graph constructionmethods and clustering algorithms, and show that multiscale graph-basedclustering achieves improved performance compared to popular clustering methodswithout the need to set externally the number of clusters.

Journal article

Greenbury S, Barahona M, Johnston I, 2019, HyperTraPS: Inferring probabilistic patterns of trait acquisition in evolutionary and disease progression pathways, Cell Systems, ISSN: 2405-4712

The explosion of data throughout the biomedical sciences provides unprecedented opportunities to learn about the dynamics of evolution and disease progression, but harnessing these large and diverse datasets remains challenging. Here, we describe a highly generalisable statistical platform to infer the dynamic pathways by which many, potentially interacting, discrete traits are acquired or lost over time in biomedical systems. The platform uses HyperTraPS (hypercubic transition path sampling) to learn progression pathways from cross-sectional, longitudinal, or phylogenetically-linked data with unprecedented efficiency, readily distinguishing multiple competing pathways, and identifying the most parsimonious mechanisms underlying given observations. Its Bayesian structure quantifies uncertainty in pathway structure and allows interpretable predictions of behaviours, such as which symptom a patient will acquire next. We exploit the model’s topology to provide visualisation tools for intuitive assessment of multiple, variable pathways. We apply the method to ovarian cancer progression and the evolution of multidrug resistance in tuberculosis, demonstrating its power to reveal previously undetected dynamic pathways.

Journal article

Peach RL, Saman D, Yaliraki SN, Klug DR, Ying L, Willison KR, Barahona Met al., 2019, Unsupervised Graph-Based Learning Predicts Mutations That Alter Protein Dynamics

<jats:title>A<jats:sc>bstract</jats:sc></jats:title><jats:p>Proteins exhibit complex dynamics across a vast range of time and length scales, from the atomistic to the conformational. Adenylate kinase (ADK) showcases the biological relevance of such inherently coupled dynamics across scales: single mutations can affect large-scale protein motions and enzymatic activity. Here we present a combined computational and experimental study of multiscale structure and dynamics in proteins, using ADK as our system of choice. We show how a computationally efficient method for unsupervised graph partitioning can be applied to atomistic graphs derived from protein structures to reveal intrinsic, biochemically relevant substructures at all scales, without re-parameterisation or <jats:italic>a priori</jats:italic> coarse-graining. We subsequently perform full alanine and arginine <jats:italic>in silico</jats:italic> mutagenesis scans of the protein, and score all mutations according to the disruption they induce on the large-scale organisation. We use our calculations to guide Förster Resonance Energy Transfer (FRET) experiments on ADK, and show that mutating residue D152 to alanine or residue V164 to arginine induce a large dynamical shift of the protein structure towards a closed state, in accordance with our predictions. Our computations also predict a graded effect of different mutations at the D152 site as a result of increased coherence between the core and binding domains, an effect confirmed quantitatively through a high correlation (<jats:italic>R</jats:italic><jats:sup>2</jats:sup> = 0.93) with the FRET ratio between closed and open populations measured on six mutants.</jats:p>

Journal article

Hodges M, Yaliraki SN, Barahona M, 2019, An edge-based formulation of elastic network models, Physical Review Research

We present an edge-based framework for the study of geometric elastic networkmodels to model mechanical interactions in physical systems. We use aformulation in the edge space, instead of the usual node-centric approach, tocharacterise edge fluctuations of geometric networks defined in d- dimensionalspace and define the edge mechanical embeddedness, an edge mechanicalsusceptibility measuring the force felt on each edge given a force applied onthe whole system. We further show that this formulation can be directly relatedto the infinitesimal rigidity of the network, which additionally permits three-and four-centre forces to be included in the network description. We exemplifythe approach in protein systems, at both the residue and atomistic levels ofdescription.

Journal article

Peach RL, Arnaudon A, Barahona M, 2019, Semi-supervised classification on graphs using explicit diffusion dynamics

Classification tasks based on feature vectors can be significantly improvedby including within deep learning a graph that summarises pairwiserelationships between the samples. Intuitively, the graph acts as a conduit tochannel and bias the inference of class labels. Here, we study classificationmethods that consider the graph as the originator of an explicit graphdiffusion. We show that appending graph diffusion to feature-based learning asan \textit{a posteriori} refinement achieves state-of-the-art classificationaccuracy. This method, which we call Graph Diffusion Reclassification (GDR),uses overshooting events of a diffusive graph dynamics to reclassify individualnodes. The method uses intrinsic measures of node influence, which are distinctfor each node, and allows the evaluation of the relationship and importance offeatures and graph for classification. We also present diff-GCN, a simpleextension of Graph Convolutional Neural Network (GCN) architectures thatleverages explicit diffusion dynamics, and allows the natural use of directedgraphs. To showcase our methods, we use benchmark datasets of documents withassociated citation data.

Journal article

Kuntz J, Thomas P, Stan G-B, Barahona Met al., 2019, Stationary distributions of continuous-time Markov chains: a review of theory and truncation-based approximations

Computing the stationary distributions of a continuous-time Markov chaininvolves solving a set of linear equations. In most cases of interest, thenumber of equations is infinite or too large, and cannot be solved analyticallyor numerically. Several approximation schemes overcome this issue by truncatingthe state space to a manageable size. In this review, we first give acomprehensive theoretical account of the stationary distributions and theirrelation to the long-term behaviour of the Markov chain, which is readilyaccessible to non-experts and free of irreducibility assumptions made instandard texts. We then review truncation-based approximation schemes payingparticular attention to their convergence and to the errors they introduce, andwe illustrate their performance with an example of a stochastic reactionnetwork of relevance in biology and chemistry. We conclude by elaborating oncomputational trade-offs associated with error control and some open questions.

Journal article

Peach R, Yaliraki S, Lefevre D, Barahona Met al., 2019, Data-driven unsupervised clustering of online learner behaviour , npj Science of Learning, Vol: 4, ISSN: 2056-7936

The widespread adoption of online courses opens opportunities for analysing learner behaviour and optimising web-based learning adapted to observed usage. Here we introduce a mathematical framework for the analysis of time series of online learner engagement, which allows the identification of clusters of learners with similar online temporal behaviour directly from the raw data without prescribing a priori subjective reference behaviours. The method uses a dynamic time warping kernel to create a pairwise similarity between time series of learner actions, and combines it with an unsupervised multiscale graph clustering algorithm to identify groups of learners with similar temporal behaviour. To showcase our approach, we analyse task completion data from a cohort of learners taking an online post-graduate degree at Imperial Business School. Our analysis reveals clusters of learners with statistically distinct patterns of engagement, from distributed to massed learning, with different levels of regularity, adherence to pre-planned course structure and task completion. The approach also reveals outlier learners with highly sporadic behaviour. A posteriori comparison against student performance shows that, whereas high performing learners are spread across clusters with diverse temporal engagement, low performers are located significantly in the massed learning cluster, and our unsupervised clustering identifies low performers more accurately than common machine learning classification methods trained on temporal statistics of the data. Finally, we test the applicability of the method by analysing two additional datasets: a different cohort of the same course, and time series of different format from another university.

Journal article

Altuncu MT, Sorin E, Symons JD, Mayer E, Yaliraki SN, Toni F, Barahona Met al., 2019, Extracting information from free text through unsupervised graph-based clustering: an application to patient incident records

The large volume of text in electronic healthcare records often remainsunderused due to a lack of methodologies to extract interpretable content. Herewe present an unsupervised framework for the analysis of free text thatcombines text-embedding with paragraph vectors and graph-theoretical multiscalecommunity detection. We analyse text from a corpus of patient incident reportsfrom the National Health Service in England to find content-based clusters ofreports in an unsupervised manner and at different levels of resolution. Ourunsupervised method extracts groups with high intrinsic textual consistency andcompares well against categories hand-coded by healthcare personnel. We alsoshow how to use our content-driven clusters to improve the supervisedprediction of the degree of harm of the incident based on the text of thereport. Finally, we discuss future directions to monitor reports over time, andto detect emerging trends outside pre-existing categories.

Book chapter

Gosztolai A, Barahona M, 2019, Beyond local gradient-sensing: memory effects in bacterial chemotactic navigation

<jats:p>The response of microbes to external signals is mediated by biochemical networks with intrinsic timescales. These timescales give rise to a cellular memory that plays a pivotal role in controlling cellular behaviour. Here we study the role of cellular memory in <jats:italic>Escherichia coli</jats:italic> chemotaxis. Using an agent-based model, we show that cells with memory navigating rugged chemoattractant landscapes can enhance their drift speed by extracting information from the correlations in their environment beyond local gradients. The improvement is maximal when the cellular memory is comparable to the timescale of the fluctuations perceived during swimming. We then extend coarse-grained population models and derive an analytical approximation for the drift velocity in rugged landscapes that includes the relevant time and length scales. Our model explains the improved velocity, and recovers standard Keller-Segel gradient-sensing results in the limits of no cellular memory, no ruggedness of the landscape, and when memory and fluctuation timescales are well separated. Our numerics also show that cellular memory can be used to induce bet-hedging at the population level by developing distinct sub-populations in response to heterogeneous chemoattractant landscapes.</jats:p>

Journal article

Kuntz Nussio J, Thomas P, Stan GB, Barahona Met al., 2019, Bounding the stationary distributions of the chemical master equation via mathematical programming, Journal of Chemical Physics, Vol: 151, ISSN: 0021-9606

The stochastic dynamics of biochemical networks are usually modelled with the chemical master equation (CME). The stationary distributions of CMEs are seldom solvable analytically, and numerical methods typically produce estimates with uncontrolled errors. Here, we introduce mathematical programming approaches that yield approximations of these distributions with computable error bounds which enable the verification of their accuracy. First, we use semidefinite programming to compute increasingly tighter upper and lower bounds on the moments of the stationary distributions for networks with rational propensities. Second, we use these moment bounds to formulate linear programs that yield convergent upper and lower bounds on the stationary distributions themselves, their marginals and stationary averages. The bounds obtained also provide a computational test for the uniqueness of the distribution. In the unique case, the bounds form an approximation of the stationary distribution with a computable bound on its error. In the non unique case, our approach yields converging approximations of the ergodic distributions. We illustrate our methodology through several biochemical examples taken from the literature: Schl¨ogl’s model for a chemical bifurcation, a two-dimensional toggle switch, a model for bursty gene expression, and a dimerisation model with multiple stationary distributions.

Journal article

Arnaudon A, Peach RL, Barahona M, 2019, Graph centrality is a question of scale

Classic measures of graph centrality capture distinct aspects of nodeimportance, from the local (e.g., degree) to the global (e.g., closeness). Herewe exploit the connection between diffusion and geometry to introduce amultiscale centrality measure. A node is defined to be central if it breaks themetricity of the diffusion as a consequence of the effective boundaries andinhomogeneities in the graph. Our measure is naturally multiscale, as it iscomputed relative to graph neighbourhoods within the varying time horizon ofthe diffusion. We find that the centrality of nodes can differ widely atdifferent scales. In particular, our measure correlates with degree (i.e.,hubs) at small scales and with closeness (i.e., bridges) at large scales, andalso reveals the existence of multi-centric structures in complex networks. Byexamining centrality across scales, our measure thus provides an evaluation ofnode importance relative to local and global processes on the network.

Working paper

Johnston I, Hoffmann T, Greenbury S, Cominetti O, Jallow M, Kwiatkowski D, Barahona M, Jones N, Casals-Pascual Cet al., 2019, Precision identification of high-risk phenotypes and progression pathways in severe malaria without requiring longitudinal data, npj Digital Medicine, Vol: 2, ISSN: 2398-6352

More than 400,000 deaths from severe malaria (SM) are reported every year, mainly in African children. The diversity of clinical presentations associated with SM indicates important differences in disease pathogenesis that require specific treatment, and this clinical heterogeneity of SM remains poorly understood. Here, we apply tools from machine learning and model-based inference to harness large-scale data and dissect the heterogeneity in patterns of clinical features associated with SM in 2904 Gambian children admitted to hospital with malaria. This quantitative analysis reveals features predicting the severity of individual patient outcomes, and the dynamic pathways of SM progression, notably inferred without requiring longitudinal observations. Bayesian inference of these pathways allows us assign quantitative mortality risks to individual patients. By independently surveying expert practitioners, we show that this data-driven approach agrees with and expands the current state of knowledge on malaria progression, while simultaneously providing a data-supported framework for predicting clinical risk.

Journal article

Maes A, Barahona M, Clopath C, 2019, Learning spatiotemporal signals using a recurrent spiking network that discretizes time, Publisher: Cold Spring Harbor Laboratory

<jats:title>Abstract</jats:title><jats:p>Learning to produce spatiotemporal sequences is a common task the brain has to solve. The same neural substrate may be used by the brain to produce different sequential behaviours. The way the brain learns and encodes such tasks remains unknown as current computational models do not typically use realistic biologically-plausible learning. Here, we propose a model where a spiking recurrent network of excitatory and inhibitory biophysical neurons drives a read-out layer: the dynamics of the recurrent network is constrained to encode time while the read-out neurons encode space. Space is then linked with time through plastic synapses that follow common Hebbian learning rules. We demonstrate that the model is able to learn spatiotemporal dynamics on a timescale that is behaviourally relevant. Learned sequences are robustly replayed during a regime of spontaneous activity.</jats:p><jats:sec><jats:title>Author summary</jats:title><jats:p>The brain has the ability to learn flexible behaviours on a wide range of time scales. Previous studies have successfully build spiking network models that learn a variety of computational tasks. However, often the learning involved is not local. Here, we investigate a model using biological-plausible plasticity rules for a specific computational task: spatiotemporal sequence learning. The architecture separates time and space into two different parts and this allows learning to bind space to time. Importantly, the time component is encoded into a recurrent network which exhibits sequential dynamics on a behavioural time scale. This network is then used as an engine to drive spatial read-out neurons. We demonstrate that the model can learn complicated spatiotemporal spiking dynamics, such as the song of a bird, and replay the song robustly.</jats:p></jats:sec>

Working paper

Chrysostomou S, Roy R, Prischi F, Chapman K, Mufti U, Mauri F, Bellezza G, Abrahams J, Ottaviani S, Castellano L, Giamas G, Hrouda D, Winkler M, Klug D, Yaliraki S, Barahona M, Wang Y, Ali M, Seckl M, Pardo Oet al., 2019, Targeting RSK4 prevents both chemoresistance and metastasis in lung cancer, AACR Annual Meeting on Bioinformatics, Convergence Science, and Systems Biology, Publisher: AMER ASSOC CANCER RESEARCH, ISSN: 0008-5472

Conference paper

Prischi F, Chrysostomou S, Roy R, Chapman K, Mufti U, Peach R, Ding L, Mauri F, Bellezza G, Cagini L, Barbareschi M, Ferrero S, Abrahams J, Ottaviani S, Castellano L, Giamas G, Pascoe J, Moonamale D, Billingham L, Cullen M, Hrouda D, Winkler M, Klug D, Yaliraki S, Barahona M, Wang Y, Ali M, Seckl M, Pardo Oet al., 2019, Targeting RSK4 prevents both chemoresistance and metastasis in lung and bladder cancer, FEBS Open Bio, Publisher: WILEY, Pages: 330-330, ISSN: 2211-5463

Conference paper

Schaub MT, Delvenne JC, Lambiotte R, Barahona Met al., 2019, Multiscale dynamical embeddings of complex networks, Physical Review E, Vol: 99, Pages: 062308-1-062308-18, ISSN: 1539-3755

Complex systems and relational data are often abstracted as dynamical processes on networks. To understand, predict, and control their behavior, a crucial step is to extract reduced descriptions of such networks. Inspired by notions from control theory, we propose a time-dependent dynamical similarity measure between nodes, which quantifies the effect a node-input has on the network. This dynamical similarity induces an embedding that can be employed for several analysis tasks. Here we focus on (i) dimensionality reduction, i.e., projecting nodes onto a low-dimensional space that captures dynamic similarity at different timescales, and (ii) how to exploit our embeddings to uncover functional modules. We exemplify our ideas through case studies focusing on directed networks without strong connectivity and signed networks. We further highlight how certain ideas from community detection can be generalized and linked to control theory, by using the here developed dynamical perspective.

Journal article

Kuntz J, Thomas P, Stan G-B, Barahona Met al., 2019, Approximations of countably-infinite linear programs over bounded measure spaces

We study a class of countably-infinite-dimensional linear programs (CILPs)whose feasible sets are bounded subsets of appropriately defined weightedspaces of measures. We show how to approximate the optimal value, optimalpoints, and minimal points of these CILPs by solving finite-dimensional linearprograms. The errors of our approximations converge to zero as the size of thefinite-dimensional program approaches that of the original problem and are easyto bound in practice. We discuss the use of our methods in the computation ofthe stationary distributions, occupation measures, and exit distributions ofMarkov~chains.

Journal article

Qian Y, Expert P, Rieu T, Panzarasa P, Barahona Met al., 2019, Quantifying the alignment of graph and features in deep learning

We show that the classification performance of Graph Convolutional Networksis related to the alignment between features, graph and ground truth, which wequantify using a subspace alignment measure corresponding to the Frobenius normof the matrix of pairwise chordal distances between three subspaces associatedwith features, graph and ground truth. The proposed measure is based on theprincipal angles between subspaces and has both spectral and geometricalinterpretations. We showcase the relationship between the subspace alignmentmeasure and the classification performance through the study of limiting casesof Graph Convolutional Networks as well as systematic randomizations of bothfeatures and graph structure applied to a constructive example and severalexamples of citation networks of different origin. The analysis also revealsthe relative importance of the graph and features for classification purposes.

Working paper

Warren L, Clarke J, Arora S, Barahona M, Arebi N, Darzi Aet al., 2019, Transitions of care across hospital settings in patients with inflammatory bowel disease, World Journal of Gastroenterology, Vol: 25, Pages: 2122-2132, ISSN: 1007-9327

BACKGROUNDInflammatory bowel disease (IBD) is a chronic, inflammatory disorder characterised by both intestinal and extra-intestinal pathology. Patients may receive both emergency and elective care from several providers, often in different hospital settings. Poorly managed transitions of care between providers can lead to inefficiencies in care and patient safety issues. To ensure that the sharing of patient information between providers is appropriate, timely, accurate and secure, effective data-sharing infrastructure needs to be developed. To optimise inter-hospital data-sharing for IBD patients, we need to better understand patterns of hospital encounters in this group.AIMTo determine the type and location of hospital services accessed by IBD patients in England.METHODSThis was a retrospective observational study using Hospital Episode Statistics, a large administrative patient data set from the National Health Service in England. Adult patients with a diagnosis of IBD following admission to hospital were followed over a 2-year period to determine the proportion of care accessed at the same hospital providing their outpatient IBD care, defined as their ‘home provider’. Secondary outcome measures included the geographic distribution of patient-sharing, regional and age-related differences in accessing services, and type and frequency of outpatient encounters.RESULTSOf 95055 patients accessed hospital services on 1760156 occasions over a 2-year follow-up period. The proportion of these encounters with their identified IBD ‘home provider’ was 73.3%, 87.8% and 83.1% for accident and emergency, inpatient and outpatient encounters respectively. Patients living in metropolitan centres and younger patients were less likely to attend their ‘home provider’ for hospital services. The most commonly attended specialty services were gastroenterology, general surgery and ophthalmology.CONCLUSIONTransitions of care between secondary care sett

Journal article

Kuntz J, Thomas P, Stan G-B, Barahona Met al., 2019, The exit time finite state projection scheme: bounding exit distributions and occupation measures of continuous-time Markov chains, SIAM Journal on Scientific Computing, Vol: 41, Pages: A748-A769, ISSN: 1064-8275

We introduce the exit time finite state projection (ETFSP) scheme, a truncation- based method that yields approximations to the exit distribution and occupation measure associated with the time of exit from a domain (i.e., the time of first passage to the complement of the domain) of time-homogeneous continuous-time Markov chains. We prove that: (i) the computed approximations bound the measures from below; (ii) the total variation distances between the approximations and the measures decrease monotonically as states are added to the truncation; and (iii) the scheme converges, in the sense that, as the truncation tends to the entire state space, the total variation distances tend to zero. Furthermore, we give a computable bound on the total variation distance between the exit distribution and its approximation, and we delineate the cases in which the bound is sharp. We also revisit the related finite state projection scheme and give a comprehensive account of its theoretical properties. We demonstrate the use of the ETFSP scheme by applying it to two biological examples: the computation of the first passage time associated with the expression of a gene, and the fixation times of competing species subject to demographic noise.

Journal article

Tonn M, Thomas P, Barahona M, Oyarzun Det al., 2019, Stochastic modelling reveals mechanisms of metabolic heterogeneity, Communications Biology, Vol: 2, ISSN: 2399-3642

Phenotypic variation is a hallmark of cellular physiology. Metabolic heterogeneity, in particular, underpins single-cell phenomena such as microbial drug tolerance and growth variability. Much research has focussed on transcriptomic and proteomic heterogeneity, yet it remains unclear if such variation permeates to the metabolic state of a cell. Here we propose a stochastic model to show that complex forms of metabolic heterogeneity emerge from fluctuations in enzyme expression and catalysis. The analysis predicts clonal populations to split into two or more metabolically distinct subpopulations. We reveal mechanisms not seen in deterministic models, in which enzymes with unimodal expression distributions lead to metabolites with a bimodal or multimodal distribution across the population. Based on published data, the results suggest that metabolite heterogeneity may be more pervasive than previously thought. Our work casts light on links between gene expression and metabolism, and provides a theory to probe the sources of metabolite heterogeneity.

Journal article

Altuncu MT, Mayer E, Yaliraki SN, Barahona Met al., 2019, From free text to clusters of content in health records: An unsupervised graph partitioning approach, Applied Network Science, Vol: 4, ISSN: 2364-8228

Electronic Healthcare records contain large volumes of unstructured data in different forms. Free text constitutes a large portion of such data, yet this source of richly detailed information often remains under-used in practice because of a lack of suitable methodologies to extract interpretable contentin a timely manner. Here we apply network-theoretical tools to the analysis of free text in Hospital Patient Incident reports in the English National Health Service, to find clusters of reports in an unsupervised manner and at different levels of resolution based directly on the free text descriptions contained within them. To do so, we combine recently developed deep neural network text-embedding methodologies based on paragraph vectors with multi-scale Markov Stability community detection applied to a similarity graph of documents obtained from sparsified text vector similarities. We showcase the approach with the analysis of incident reports submitted in Imperial College Healthcare NHS Trust, London. The multiscale community structure reveals levels of meaning with different resolution in the topics of the dataset, as shown by relevant descriptive terms extracted from thegroups of records, as well as by comparing a posteriori against hand-coded categories assigned by healthcare personnel. Our content communities exhibit good correspondence with well-defined hand-coded categories, yet our results also provide further medical detail in certain areas as well asrevealing complementary descriptors of incidents beyond the external classification. We also discuss how the method can be used to monitor reports over time and across different healthcare providers, and to detect emerging trends that fall outside of pre-existing categories.

Journal article

Gosztolai A, Carrillo JA, Barahona M, 2019, Collective search with finite perception: Transient dynamics and search efficiency, Frontiers in Physics, Vol: 6, ISSN: 2296-424X

Motile organisms often use finite spatial perception of their surroundings to navigate and search their habitats. Yet standard models of search are usually based on purely local sensory information. To model how a finite perceptual horizon affects ecological search, we propose a framework for optimal navigation that combines concepts from random walks and optimal control theory. We show that, while local strategies are optimal on asymptotically long and short search times, finite perception yields faster convergence and increased search efficiency over transient time scales relevant in biological systems. The benefit of the finite horizon can be maintained by the searchers tuning their response sensitivity to the length scale of the stimulant in the environment, and is enhanced when the agents interact as a result of increased consensus within subpopulations. Our framework sheds light on the role of spatial perception and transients in search movement and collective sensing of the environment.

Journal article

Clarke JM, Warren LR, Arora S, Barahona M, Darzi AWet al., 2018, Guiding interoperable electronic health records through patient-sharing networks., NPJ digital medicine, Vol: 1, Pages: 65-65, ISSN: 2398-6352

Effective sharing of clinical information between care providers is a critical component of a safe, efficient health system. National data-sharing systems may be costly, politically contentious and do not reflect local patterns of care delivery. This study examines hospital attendances in England from 2013 to 2015 to identify instances of patient sharing between hospitals. Of 19.6 million patients receiving care from 155 hospital care providers, 130 million presentations were identified. On 14.7 million occasions (12%), patients attended a different hospital to the one they attended on their previous interaction. A network of hospitals was constructed based on the frequency of patient sharing between hospitals which was partitioned using the Louvain algorithm into ten distinct data-sharing communities, improving the continuity of data sharing in such instances from 0 to 65-95%. Locally implemented data-sharing communities of hospitals may achieve effective accessibility of clinical information without a large-scale national interoperable information system.

Journal article

O'Clery N, Yuan Y, Stan G-B, Barahona Met al., 2018, Global Network Prediction from Local Node Dynamics

The study of dynamical systems on networks, describing complex interactiveprocesses, provides insight into how network structure affects globalbehaviour. Yet many methods for network dynamics fail to cope with large orpartially-known networks, a ubiquitous situation in real-world applications.Here we propose a localised method, applicable to a broad class of dynamicalmodels on networks, whereby individual nodes monitor and store the evolution oftheir own state and use these values to approximate, via a simple computation,their own steady state solution. Hence the nodes predict their own final statewithout actually reaching it. Furthermore, the localised formulation enablesnodes to compute global network metrics without knowledge of the full networkstructure. The method can be used to compute global rankings in the networkfrom local information; to detect community detection from fast, localtransient dynamics; and to identify key nodes that compute global networkmetrics ahead of others. We illustrate some of the applications of thealgorithm by efficiently performing web-page ranking for a large internetnetwork and identifying the dynamic roles of inter-neurons in the C. Elegansneural network. The mathematical formulation is simple, widely applicable andeasily scalable to real-world datasets suggesting how local computation canprovide an approach to the study of large-scale network dynamics.

Journal article

Beguerisse M, Bosque G, Oyarzun DA, Pico J, Barahona Met al., 2018, Flux-dependent graphs for metabolic networks, npj Systems Biology and Applications, Vol: 4, ISSN: 2056-7189

Cells adapt their metabolic fluxes in response to changes in the environment. We present a framework for the systematic construction of flux-based graphs derived from organism-wide metabolic networks. Our graphs encode the directionality of metabolic flows via edges that represent the flow of metabolites from source to target reactions. The methodology can be applied in the absence of a specific biological context by modelling fluxes probabilistically, or can be tailored to different environmental conditions by incorporating flux distributions computed through constraint-based approaches such as Flux Balance Analysis. We illustrate our approach on the central carbon metabolism of Escherichia coli and on a metabolic model of human hepatocytes. The flux-dependent graphs under various environmental conditions and genetic perturbations exhibit systemic changes in their topological and community structure, which capture the re-routing of metabolic flows and the varying importance of specific reactions and pathways. By integrating constraint-based models and tools from network science, our framework allows the study of context-specific metabolic responses at a system level beyond standard pathway descriptions.

Journal article

Hodges M, Barahona M, Yaliraki SN, 2018, Allostery and cooperativity in multimeric proteins: bond-to-bond propensities in ATCase, SCIENTIFIC REPORTS, Vol: 8, ISSN: 2045-2322

Journal article

Altuncu MT, Mayer E, Yaliraki SN, Barahona Met al., 2018, From Text to Topics in Healthcare Records: An Unsupervised Graph Partitioning Methodology, 2018 KDD Conference Proceedings - MLMH: Machine Learning for Medicine and Healthcare

Electronic Healthcare Records contain large volumes of unstructured data,including extensive free text. Yet this source of detailed information oftenremains under-used because of a lack of methodologies to extract interpretablecontent in a timely manner. Here we apply network-theoretical tools to analysefree text in Hospital Patient Incident reports from the National HealthService, to find clusters of documents with similar content in an unsupervisedmanner at different levels of resolution. We combine deep neural networkparagraph vector text-embedding with multiscale Markov Stability communitydetection applied to a sparsified similarity graph of document vectors, andshowcase the approach on incident reports from Imperial College Healthcare NHSTrust, London. The multiscale community structure reveals different levels ofmeaning in the topics of the dataset, as shown by descriptive terms extractedfrom the clusters of records. We also compare a posteriori against hand-codedcategories assigned by healthcare personnel, and show that our approachoutperforms LDA-based models. Our content clusters exhibit good correspondencewith two levels of hand-coded categories, yet they also provide further medicaldetail in certain areas and reveal complementary descriptors of incidentsbeyond the external classification taxonomy.

Conference paper

Altuncu MT, Yaliraki SN, Barahona M, 2018, Content-driven, unsupervised clustering of news articles through multiscale graph partitioning, KDD 2018 - Workshop on Data Science Journalism and Media (DSJM)

The explosion in the amount of news and journalistic content being generatedacross the globe, coupled with extended and instantaneous access to informationthrough online media, makes it difficult and time-consuming to monitor newsdevelopments and opinion formation in real time. There is an increasing needfor tools that can pre-process, analyse and classify raw text to extractinterpretable content; specifically, identifying topics and content-drivengroupings of articles. We present here such a methodology that brings togetherpowerful vector embeddings from Natural Language Processing with tools fromGraph Theory that exploit diffusive dynamics on graphs to reveal naturalpartitions across scales. Our framework uses a recent deep neural network textanalysis methodology (Doc2vec) to represent text in vector form and thenapplies a multi-scale community detection method (Markov Stability) topartition a similarity graph of document vectors. The method allows us toobtain clusters of documents with similar content, at different levels ofresolution, in an unsupervised manner. We showcase our approach with theanalysis of a corpus of 9,000 news articles published by Vox Media over oneyear. Our results show consistent groupings of documents according to contentwithout a priori assumptions about the number or type of clusters to be found.The multilevel clustering reveals a quasi-hierarchy of topics and subtopicswith increased intelligibility and improved topic coherence as compared toexternal taxonomy services and standard topic detection methods.

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.

Request URL: http://wlsprd.imperial.ac.uk:80/respub/WEB-INF/jsp/search-html.jsp Request URI: /respub/WEB-INF/jsp/search-html.jsp Query String: respub-action=search.html&id=00333972&limit=30&person=true