Imperial College London

Professor Mauricio Barahona

Faculty of Natural Sciences, Department of Mathematics

Chair in Biomathematics
 
 
 

Contact

 

m.barahona

 
 

Location

 

6M31, Huxley Building, South Kensington Campus



 

Publications


202 results found

Maes A, Barahona M, Clopath C, 2023, Long- and short-term history effects in a spiking network model of statistical learning, Scientific Reports, Vol: 13, Pages: 1-14, ISSN: 2045-2322

The statistical structure of the environment is often important when making decisions. There are multiple theories of how the brain represents statistical structure. One such theory states that neural activity spontaneously samples from probability distributions. In other words, the network spends more time in states which encode high-probability stimuli. Starting from the neural assembly, increasingly thought to be the building block for computation in the brain, we focus on how arbitrary prior knowledge about the external world can both be learned and spontaneously recollected. We present a model based upon learning the inverse of the cumulative distribution function. Learning is entirely unsupervised using biophysical neurons and biologically plausible learning rules. We show how this prior knowledge can then be accessed to compute expectations and signal surprise in downstream networks. Sensory history effects emerge from the model as a consequence of ongoing learning.

Journal article

Lamprinakou S, Barahona M, Flaxman S, Filippi S, Gandy A, McCoy EJ et al., 2023, BART-based inference for Poisson processes, Computational Statistics and Data Analysis, Vol: 180, Pages: 1-25, ISSN: 0167-9473

The effectiveness of Bayesian Additive Regression Trees (BART) has been demonstrated in a variety of contexts including non-parametric regression and classification. A BART scheme for estimating the intensity of inhomogeneous Poisson processes is introduced. Poisson intensity estimation is a vital task in various applications including medical imaging, astrophysics and network traffic analysis. The new approach enables full posterior inference of the intensity in a non-parametric regression setting. The performance of the novel scheme is demonstrated through simulation studies on synthetic and real datasets up to five dimensions, and the new scheme is compared with alternative approaches.

Journal article

August E, Barahona M, 2022, Finding positively invariant sets and proving exponential stability of limit cycles using sum-of-squares decompositions, Journal of Computational Dynamics, Vol: 10, Pages: 105-126, ISSN: 2158-2505

The dynamics of many systems from physics, economics, chemistry, and biology can be modelled through polynomial functions. In this paper, we provide a computational means to find positively invariant sets of polynomial dynamical systems by using semidefinite programming to solve sum-of-squares (SOS) programmes. With the emergence of SOS programmes, it is possible to efficiently search for Lyapunov functions that guarantee stability of polynomial systems. Yet, SOS computations often fail to find functions such that the conditions hold in the entire state space. We show here that restricting the SOS optimisation to specific domains enables us to obtain positively invariant sets, thus facilitating the analysis of the dynamics by considering separately each positively invariant set. In addition, we go beyond classical Lyapunov stability analysis and use SOS decompositions to computationally implement sufficient positivity conditions that guarantee existence, uniqueness, and exponential stability of a limit cycle. Importantly, this approach is applicable to systems of any dimension and, thus, goes beyond classical methods that are restricted to two-dimensional phase space. We illustrate our different results with applications to classical systems, such as the van der Pol oscillator, the FitzHugh-Nagumo neuronal equation, and the Lorenz system.

Journal article

Sapienza R, Barahona M, Saxena D, alexis A, Yaliraki S et al., 2022, Sensitivity and spectral control of network lasers, Nature Communications, Vol: 13, Pages: 1-7, ISSN: 2041-1723

Recently, random lasing in complex networks has shown efficient lasing over more than 50 localised modes, promoted by multiple scattering over the underlying graph. If controlled, these network lasers can lead to fast-switching multifunctional light sources with synthesised spectrum. Here, we observe both in experiment and theory high sensitivity of the network laser spectrum to the spatial shape of the pump profile, with some modes for example increasing in intensity by 280% when switching off 7% of the pump beam. We solve the nonlinear equations within the steady state ab-initio laser theory (SALT) approximation over a graph and we show selective lasing of around 90% of the strongest intensity modes, effectively programming the spectrum of the lasing networks. In our experiments with polymer networks, this high sensitivity enables control of the lasing spectrum through non-uniform pump patterns. We propose the underlying complexity of the network modes as the key element behind efficient spectral control opening the way for the development of optical devices with wide impact for on-chip photonics for communication, sensing, and computation.

Journal article

Strömich L, Wu N, Barahona M, Yaliraki SN et al., 2022, Allosteric Hotspots in the Main Protease of SARS-CoV-2, J Mol Biol, Vol: 434

Inhibiting the main protease of SARS-CoV-2 is of great interest in tackling the COVID-19 pandemic caused by the virus. Most efforts have been centred on inhibiting the binding site of the enzyme. However, considering allosteric sites, distant from the active or orthosteric site, broadens the search space for drug candidates and confers the advantages of allosteric drug targeting. Here, we report the allosteric communication pathways in the main protease dimer by using two novel fully atomistic graph-theoretical methods: Bond-to-bond propensity, which has been previously successful in identifying allosteric sites in extensive benchmark data sets without a priori knowledge, and Markov transient analysis, which has previously aided in finding novel drug targets in catalytic protein families. Using statistical bootstrapping, we score the highest ranking sites against random sites at similar distances, and we identify four statistically significant putative allosteric sites as good candidates for alternative drug targeting.

Journal article

Wu N, Yaliraki SN, Barahona M, 2022, Prediction of Protein Allosteric Signalling Pathways and Functional Residues Through Paths of Optimised Propensity, Journal of Molecular Biology, Vol: 434, Pages: 167749-167749, ISSN: 0022-2836

Journal article

Freischem LJ, Barahona M, Oyarzún DA, 2022, Prediction of gene essentiality using machine learning and genome-scale metabolic models, 9th IFAC Conference on Foundations of Systems Biology in Engineering FOSBE 2022, Publisher: Elsevier BV, Pages: 13-18, ISSN: 2405-8963

The identification of essential genes, i.e. those that impair cell survival when deleted, requires large growth assays of knock-out strains. The complexity and cost of such experiments have triggered a growing interest in computational methods for prediction of gene essentiality. In the case of metabolic genes, Flux Balance Analysis (FBA) is widely employed to predict essentiality under the assumption that cells maximize their growth rate. However, this approach assumes that knock-out strains optimize the same objectives as the wild-type, which excludes cases in which deletions cause large physiological changes to meet other objectives for survival. Here, we resolve this limitation with a novel machine learning approach that predicts essentiality directly from wild-type flux distributions. We first project the wild-type FBA solution onto a mass flow graph, a digraph with reactions as nodes and edge weights proportional to the mass transfer between reactions, and then train binary classifiers on the connectivity of graph nodes. We demonstrate the efficacy of this approach using the most complete metabolic model of Escherichia coli, achieving near state-of-the-art prediction accuracy for essential genes. Our approach suggests that wild-type FBA solutions contain enough information to predict essentiality, without the need to assume optimality of deletion strains.

Conference paper

Alaa A, Mayer E, Barahona M, 2022, ICE-NODE: Integration of Clinical Embeddings with Neural Ordinary Differential Equations, Machine Learning for Healthcare Conference, Pages: 537-564, ISSN: 2640-3498

Early diagnosis of disease can lead to improved health outcomes, including higher survival rates and lower treatment costs. With the massive amount of information available in electronic health records (EHRs), there is great potential to use machine learning (ML) methods to model disease progression aimed at early prediction of disease onset and other outcomes. In this work, we employ recent innovations in neural ODEs combined with rich semantic embeddings of clinical codes to harness the full temporal information of EHRs. We propose ICE-NODE (Integration of Clinical Embeddings with Neural Ordinary Differential Equations), an architecture that temporally integrates embeddings of clinical codes and neural ODEs to learn and predict patient trajectories in EHRs. We apply our method to the publicly available MIMIC-III and MIMIC-IV datasets, and we find improved prediction results compared to state-of-the-art methods, specifically for clinical codes that are not frequently observed in EHRs. We also show that ICE-NODE is more competent at predicting certain medical conditions, like acute renal failure, pulmonary heart disease and birth-related problems, where the full temporal information could provide important information. Furthermore, ICE-NODE is also able to produce patient risk trajectories over time that can be exploited for further detailed predictions of disease evolution.

Conference paper

Myall A, Price J, Peach R, Abbas M, Mookerjee S, Zhu N, Ahmad I, Ming D, Ramzan F, Teixeira D, Graf C, Weisse A, Harbarth S, Holmes A, Barahona M et al., 2022, Predicting hospital-onset COVID-19 infections using dynamic networks of patient contact: an international retrospective cohort study, The Lancet Digital Health, Vol: 4, Pages: e573-e583, ISSN: 2589-7500

Background: Real-time prediction is key to prevention and control of infections associated with health-care settings. Contacts enable spread of many infections, yet most risk prediction frameworks fail to account for their dynamics. We developed, tested, and internationally validated a real-time machine-learning framework, incorporating dynamic patient-contact networks to predict hospital-onset COVID-19 infections (HOCIs) at the individual level. Methods: We report an international retrospective cohort study of our framework, which extracted patient-contact networks from routine hospital data and combined network-derived variables with clinical and contextual information to predict individual infection risk. We trained and tested the framework on HOCIs using the data from 51 157 hospital inpatients admitted to a UK National Health Service hospital group (Imperial College Healthcare NHS Trust) between April 1, 2020, and April 1, 2021, intersecting the first two COVID-19 surges. We validated the framework using data from a Swiss hospital group (Department of Rehabilitation, Geneva University Hospitals) during a COVID-19 surge (from March 1 to May 31, 2020; 40 057 inpatients) and from the same UK group after COVID-19 surges (from April 2 to Aug 13, 2021; 43 375 inpatients). All inpatients with a bed allocation during the study periods were included in the computation of network-derived and contextual variables. In predicting patient-level HOCI risk, only inpatients spending 3 or more days in hospital during the study period were examined for HOCI acquisition risk. Findings: The framework was highly predictive across test data with all variable types (area under the curve [AUC]-receiver operating characteristic curve [ROC] 0·89 [95% CI 0·88–0·90]) and similarly predictive using only contact-network variables (0·88 [0·86–0·90]). Prediction was reduced when using only hospital contextual (AUC-ROC 0·82 [95% CI 0·

Journal article

Rodrigues D, Kreif N, Lawrence-Jones A, Barahona M, Mayer E et al., 2022, Reflection on modern methods: constructing directed acyclic graphs (DAGs) with domain experts for health services research, International Journal of Epidemiology, Vol: 51, ISSN: 0300-5771

Directed acyclic graphs (DAGs) are a useful tool to represent, in a graphical format, researchers’ assumptions about the causal structure among variables while providing a rationale for the choice of confounding variables to adjust for. With origins in the field of probabilistic graphical modelling, DAGs are yet to be widely adopted in applied health research, where causal assumptions are frequently made for the purpose of evaluating health services initiatives. In this context, there is still limited practical guidance on how to construct and use DAGs. Some progress has recently been made in terms of building DAGs based on studies from the literature, but an area that has received less attention is how to create DAGs from information provided by domain experts, an approach of particular importance when there is limited published information about the intervention under study. This approach offers the opportunity for findings to be more robust and relevant to patients, carers and the public, and more likely to inform policy and clinical practice. This article draws lessons from a stakeholder workshop involving patients, health care professionals, researchers, commissioners and representatives from industry, whose objective was to draw DAGs for a complex intervention—online consultation, i.e. written exchange between the patient and health care professional using an online system—in the context of the English National Health Service. We provide some initial, practical guidance to those interested in engaging with domain experts to develop DAGs.

Journal article

Peach R, Arnaudon A, Barahona M, 2022, Relative, local and global dimension in complex networks, Nat Commun, Vol: 13

Dimension is a fundamental property of objects and the space in which they are embedded. Yet ideal notions of dimension, as in Euclidean spaces, do not always translate to physical spaces, which can be constrained by boundaries and distorted by inhomogeneities, or to intrinsically discrete systems such as networks. To take into account locality, finiteness and discreteness, dynamical processes can be used to probe the space geometry and define its dimension. Here we show that each point in space can be assigned a relative dimension with respect to the source of a diffusive process, a concept that provides a scale-dependent definition for local and global dimension also applicable to networks. To showcase its application to physical systems, we demonstrate that the local dimension of structural protein graphs correlates with structural flexibility, and the relative dimension with respect to the active site uncovers regions involved in allosteric communication. In simple models of epidemics on networks, the relative dimension is predictive of the spreading capability of nodes, and identifies scales at which the graph structure is predictive of infectivity. We further apply our dimension measures to neuronal networks, economic trade, social networks, ocean flows, and to the comparison of random graphs.

Journal article

Rodrigues D, Kreif N, Saravanakumar K, Delaney B, Barahona M, Mayer E et al., 2022, Formalising triage in general practice towards a more equitable, safe, and efficient allocation of resources, BMJ: British Medical Journal, Vol: 377, ISSN: 0959-535X

Journal article

Sivan M, Greenhalgh T, Darbyshire JL, Mir G, O'Connor RJ, Dawes H, Greenwood D, O'Connor D, Horton M, Petrou S, de Lusignan S, Curcin V, Mayer E, Casson A, Milne R, Rayner C, Smith N, Parkin A, Preston N, Delaney B et al., 2022, LOng COvid Multidisciplinary consortium Optimising Treatments and services acrOss the NHS (LOCOMOTION): protocol for a mixed-methods study in the UK, BMJ OPEN, Vol: 12, ISSN: 2044-6055

Journal article

Chrysostomou S, Roy R, Prischi F, Thamlikitkul L, Chapman KL, Mufti U, Peach R, Ding L, Hancock D, Moore C, Molina-Arcas M, Mauri F, Pinato DJ, Abrahams JM, Ottaviani S, Castellano L, Giamas G, Pascoe J, Moonamale D, Pirrie S, Gaunt C, Billingham L, Steven NM, Cullen M, Hrouda D, Winkler M, Post J, Cohen P, Salpeter SJ, Bar V, Zundelevich A, Golan S, Leibovici D, Lara R, Klug DR, Yaliraki SN, Barahona M, Wang Y, Downward J, Skehel JM, Ali MMU, Seckl MJ, Pardo E et al., 2022, Re: Repurposed Floxacins Targeting RSK4 Prevent Chemoresistance and Metastasis in Lung and Bladder Cancer, JOURNAL OF UROLOGY, Vol: 207, Pages: 919-920, ISSN: 0022-5347

Journal article

Qian Y, Expert P, Rieu T, Panzarasa P, Barahona M et al., 2022, Quantifying the alignment of graph and features in deep learning, IEEE Transactions on Neural Networks and Learning Systems, Vol: 33, Pages: 1663-1672, ISSN: 1045-9227

We show that the classification performance of graph convolutional networks (GCNs) is related to the alignment between features, graph, and ground truth, which we quantify using a subspace alignment measure (SAM) corresponding to the Frobenius norm of the matrix of pairwise chordal distances between three subspaces associated with features, graph, and ground truth. The proposed measure is based on the principal angles between subspaces and has both spectral and geometrical interpretations. We showcase the relationship between the SAM and the classification performance through the study of limiting cases of GCNs and systematic randomizations of both features and graph structure applied to a constructive example and several examples of citation networks of different origins. The analysis also reveals the relative importance of the graph and features for classification purposes.

Journal article

Laumann F, von Kuegelgen J, Kanashiro Uehara TH, Barahona M et al., 2022, Quantitative assessment of complex interlinkages, key objectives and nexuses amongst the Sustainable Development Goals and climate change, The Lancet Planetary Health, Vol: 6, ISSN: 2542-5196

Background. Global sustainability is an enmeshed system of complex socio-economic, climatological and ecological interactions. The numerous objectives of the United Nations’ Sustainable Development Goals (SDGs) and the Paris Agreement have various levels of interdependence, making it difficult to ascertain the influence of changes in particular indicators across the whole system. Methods. We present a method to find interlinkages amongst the 17 SDGs and climate change, including non-linear and non-monotonic dependences, by using 400 indicators that track their temporal changes. The method detects statistically significant dependencies amongst the time evolution of the objectives by using partial distance correlations, a non-linear measure of conditional dependence that also discounts spurious correlations originating from lurking variables. We then employ a network representation to identify the most important objectives (using network centrality) and to obtain nexuses of objectives (defined as highly interconnected clusters in the network). Findings. Using temporal data from 181 countries spanning 20 years, we analyse dependencies amongst SDGs and climate for 35 country groupings based on region, development and income level. Our results show that the significant interlinkages, central objectives, and nexuses identified vary greatly across country groupings, yet partnerships for the goals (SDG 17) and climate change rank as highly important across many country groupings. Temperature rise is strongly linked to urbanisation, air pollution, and slum expansion (SDG 11), especially in country groupings likely to be worst affected by climate breakdown such as Africa. In several groupings encompassing the developing countries, a consistent nexus of strongly interconnected objectives is formed by poverty reduction (SDG 1), education (SDG 4), and economic growth (SDG 8), sometimes incorporating gender equality (SDG 5), and peace and justice (SDG 16). Interpretation. The

Journal article

Myall A, Price J, Peach R, Abbas M, Mookerjee S, Ahmad I, Ming D, Zhu NJ, Ramzan F, Weisse A, Holmes AH, Barahona M et al., 2022, Prediction of hospital-onset COVID-19 using networks of patient contact: an observational study, IMED conference, Publisher: ELSEVIER SCI LTD, Pages: S109-S110, ISSN: 1201-9712

Conference paper

Myall A, Peach R, Wan Y, Mookerjee S, Jauneikaite E, Bolt F, Price J, Davies F, Weisse A, Holmes AH, Barahona M et al., 2022, Improved contact tracing using network analysis and spatial-temporal proximity, iMED conference, Publisher: Elsevier, Pages: S20-S20, ISSN: 1201-9712

Conference paper

Jha S, Mayer E, Barahona M, 2022, Improving information fusion on multimodal clinical data in classification settings, Pages: 154-159

Clinical data often exists in different forms across the lifetime of a patient's interaction with the healthcare system: structured, unstructured or semi-structured data in the form of laboratory readings, clinical notes, diagnostic codes, imaging and audio data of various kinds, and other observational data. Formulating a representation model that aggregates information from these heterogeneous sources may allow us to jointly model on data with more predictive signal than noise and help inform our model with useful constraints learned from better data. Multimodal fusion approaches help produce representations combined from heterogeneous modalities, which can be used for clinical prediction tasks. Representations produced through different fusion techniques require different training strategies. We investigate the advantage of adding narrative clinical text to structured modalities for classification tasks in the clinical domain. We show that while there is a competitive advantage in combined representations of clinical data, the approach can be helped by training guidance customized to each modality. We show empirical results across binary/multiclass settings, single/multitask settings and unified/multimodal learning rate settings for early and late information fusion of clinical data.

Conference paper

Beaney T, Clarke J, Woodcock T, McCarthy R, Saravanakumar K, Barahona M, Blair M, Hargreaves D et al., 2021, Patterns of healthcare utilisation in children and young people: a retrospective cohort study using routinely collected healthcare data in Northwest London, BMJ Open, Vol: 11, Pages: 1-14, ISSN: 2044-6055

Objectives: With a growing role for health services in managing population health, there is a need for early identification of populations with high need. Segmentation approaches partition the population based on demographics, long-term conditions (LTCs) or healthcare utilisation but have mostly been applied to adults. Our study uses segmentation methods to distinguish patterns of healthcare utilisation in children and young people (CYP) and to explore predictors of segment membership. Design: Retrospective cohort study. Setting: Routinely collected primary and secondary healthcare data in Northwest London from the Discover database. Participants: 378,309 CYP aged 0-15 years registered to a general practice in Northwest London with one full year of follow-up. Primary and secondary outcome measures: Assignment of each participant to a segment defined by seven healthcare variables representing primary and secondary care attendances, and description of utilisation patterns by segment. Predictors of segment membership described by age, sex, ethnicity, deprivation and LTCs. Results: Participants were grouped into six segments based on healthcare utilisation. Three segments predominantly used primary care; two moderate utilisation segments differed in use of emergency or elective care, and a high utilisation segment, representing 16,632 (4.4%) children accounted for the highest mean presentations across all service types. The two smallest segments, representing 13.3% of the population, accounted for 62.5% of total costs. Younger age, residence in areas of higher deprivation, and presence of one or more LTCs were associated with membership of higher utilisation segments, but 75.0% of those in the highest utilisation segment had no LTC. Conclusions: This article identifies six segments of healthcare utilisation in CYP and predictors of segment membership. Demographics and LTCs may not explain utilisation patterns as strongly as in adults which may limit the use of routine data in predicting ut

Journal article

Liu Z, Peach R, Lawrance E, Noble A, Ungless M, Barahona M et al., 2021, Listening to mental health crisis needs at scale: using Natural Language Processing to understand and evaluate a mental health crisis text messaging service, Frontiers in Digital Health, Vol: 3, Pages: 1-14, ISSN: 2673-253X

The current mental health crisis is a growing public health issue requiring a large-scale response that cannot be met with traditional services alone. Digital support tools are proliferating, yet most are not systematically evaluated, and we know little about their users and their needs. Shout is a free mental health text messaging service run by the charity Mental Health Innovations, which provides support for individuals in the UK experiencing mental or emotional distress and seeking help. Here we study a large data set of anonymised text message conversations and post-conversation surveys compiled through Shout. This data provides an opportunity to hear at scale from those experiencing distress; to better understand mental health needs for people not using traditional mental health services; and to evaluate the impact of a novel form of crisis support. We use natural language processing (NLP) to assess the adherence of volunteers to conversation techniques and formats, and to gain insight into demographic user groups and their behavioural expressions of distress. Our textual analyses achieve accurate classification of conversation stages (weighted accuracy = 88%), behaviours (1-hamming loss = 95%) and texter demographics (weighted accuracy = 96%), exemplifying how the application of NLP to frontline mental health data sets can aid with post-hoc analysis and evaluation of quality of service provision in digital mental health services.

Journal article

Liu Z, Barahona M, 2021, Similarity measure for sparse time course data based on Gaussian processes, Uncertainty in Artificial Intelligence 2021, Publisher: PMLR, Pages: 1332-1341

We propose a similarity measure for sparsely sampled time course data in the form of a log-likelihood ratio of Gaussian processes (GP). The proposed GP similarity is similar to a Bayes factor and provides enhanced robustness to noise in sparse time series, such as those found in various biological settings, e.g., gene transcriptomics. We show that the GP measure is equivalent to the Euclidean distance when the noise variance in the GP is negligible compared to the noise variance of the signal. Our numerical experiments on both synthetic and real data show improved performance of the GP similarity when used in conjunction with two distance-based clustering methods.

Conference paper

Ming DK, Myall AC, Hernandez B, Weiße AY, Peach RL, Barahona M, Rawson TM, Holmes AH et al., 2021, Informing antimicrobial management in the context of COVID-19: understanding the longitudinal dynamics of C-reactive protein and procalcitonin, BMC Infectious Diseases, Vol: 21

Background: To characterise the longitudinal dynamics of C-reactive protein (CRP) and Procalcitonin (PCT) in a cohort of hospitalised patients with COVID-19 and support antimicrobial decision-making. Methods: Longitudinal CRP and PCT concentrations and trajectories of 237 hospitalised patients with COVID-19 were modelled. The dataset comprised 2,021 data points for CRP and 284 points for PCT. Pairwise comparisons were performed between: (i) those with or without significant bacterial growth from cultures, and (ii) those who survived or died in hospital. Results: CRP concentrations were higher over time in COVID-19 patients with positive microbiology (day 9: 236 vs 123 mg/L, p < 0.0001) and in those who died (day 8: 226 vs 152 mg/L, p < 0.0001) but only after day 7 of COVID-related symptom onset. Failure of CRP to fall in the first week of hospital admission was associated with significantly higher odds of death. PCT concentrations were higher in patients with COVID-19 and positive microbiology or in those who died, although these differences were not statistically significant. Conclusions: Both the absolute CRP concentration and the trajectory during the first week of hospital admission are important factors predicting microbiology culture positivity and outcome in patients hospitalised with COVID-19. Further work is needed to describe the role of PCT for co-infection. Understanding relationships of these biomarkers can support development of risk models and inform optimal antimicrobial strategies.

Journal article

Wan Y, Myall AC, Boonyasiri A, Bolt F, Ledda A, Mookerjee S, Weiße AY, Getino M, Turton JF, Abbas H, Prakapaite R, Sabnis A, Abdolrasoulia A, Malpartida-Cardenas K, Miglietta L, Donaldson H, Gilchrist M, Hopkins KL, Ellington MJ, Otter JA, Larrouy-Maumus G, Edwards AM, Rodriguez-Manzano J, Didelot X, Barahona M, Holmes AH, Jauneikaite E, Davies F et al., 2021, Integrated analysis of patient networks and plasmid genomes reveals a regional, multi-species outbreak of carbapenemase-producing Enterobacterales carrying both blaIMP and mcr-9 genes

Background: Carbapenemase-producing Enterobacterales (CPE) are challenging in the healthcare setting, with resistance to multiple classes of antibiotics and a high associated mortality. The incidence of CPE is rising globally, despite enhanced awareness and control efforts. This study describes an investigation of the emergence of IMP-encoding CPE amongst diverse Enterobacterales species between 2016 and 2019 in patients across a London regional hospital network. Methods: We carried out a network analysis of patient pathways, using electronic health records, to identify contacts between IMP-encoding CPE positive patients. Genomes of IMP-encoding CPE isolates were analysed and overlaid with patient contacts to imply potential transmission events. Results: Genomic analysis of 84 Enterobacterales isolates revealed diverse species (predominantly Klebsiella spp, Enterobacter spp, E. coli), of which 86% (72/84) harboured an IncHI2 plasmid, which carried both blaIMP and the mobile colistin resistance gene mcr-9 (68/72). Phylogenetic analysis of IncHI2 plasmids identified three lineages which showed significant association with patient contact and movements between four hospital sites and across medical specialities, which had been missed on initial investigations. Conclusions: Combined, our patient network and plasmid analyses demonstrate an interspecies, plasmid-med

Journal article

Wong T, Barahona M, 2021, Dimensionality reduction and data integration for scRNA-seq data based on integrative hierarchical Poisson factorisation

Single-cell RNA sequencing (scRNA-seq) data sets consist of high-dimensional, sparse and noisy feature vectors, and pose a challenge for classic methods for dimensionality reduction. Such problems are compounded when dealing with composite data sets formed by different batches. We introduce Integrative Hierarchical Poisson Factorisation (IHPF), an extension of HPF that makes use of a noise ratio hyper-parameter to tune the variability attributed to batches vs. biological sources (cell phenotypes). We exemplify the application of IHPF under different data integration scenarios with varying alignments of batches and cell diversity, and show that IHPF produces latent factors that can be advantageously applied for cell clustering and visualisation. In addition, the extracted factors have a dual block structure in both cell and gene spaces with enhanced biological interpretability.

Journal article

Mersmann S, Stromich L, Song F, Wu N, Vianello F, Barahona M, Yaliraki S et al., 2021, ProteinLens: a web-based application for the analysis of allosteric signalling on atomistic graphs of biomolecules, Nucleic Acids Research, Vol: 49, Pages: W551-W558, ISSN: 0305-1048

The investigation of allosteric effects in biomolecular structures is of great current interest in diverse areas, from fundamental biological enquiry to drug discovery. Here we present ProteinLens, a user-friendly and interactive web application for the investigation of allosteric signalling based on atomistic graph-theoretical methods. Starting from the PDB file of a biomolecule (or a biomolecular complex) ProteinLens obtains an atomistic, energy-weighted graph description of the structure of the biomolecule, and subsequently provides a systematic analysis of allosteric signalling and communication across the structure using two computationally efficient methods: Markov Transients and bond-to-bond propensities. ProteinLens scores and ranks every bond and residue according to the speed and magnitude of the propagation of fluctuations emanating from any site of choice (e.g. the active site). The results are presented through statistical quantile scores visualised with interactive plots and adjustable 3D structure viewers, which can also be downloaded. ProteinLens thus allows the investigation of signalling in biomolecular structures of interest to aid the detection of allosteric sites and pathways. ProteinLens is implemented in Python/SQL and freely available to use at: www.proteinlens.io.

Journal article

Chrysostomou S, Roy R, Prischi F, Thamlikitkul L, Chapman KL, Mufti U, Peach R, Ding L, Hancock D, Moore C, Molina-Arcas M, Mauri F, Pinato DJ, Abrahams JM, Ottaviani S, Castellano L, Giamas G, Pascoe J, Moonamale D, Pirrie S, Gaunt C, Billingham L, Steven NM, Cullen M, Hrouda D, Winkler M, Post J, Cohen P, Salpeter SJ, Bar V, Zundelevich A, Golan S, Leibovici D, Lara R, Klug DR, Yaliraki SN, Barahona M, Wang Y, Downward J, Skehel JM, Ali MMU, Seckl MJ, Pardo OE et al., 2021, Repurposed floxacins targeting RSK4 prevent chemoresistance and metastasis in lung and bladder cancer, Science Translational Medicine, Vol: 13, ISSN: 1946-6234

Lung and bladder cancers are mostly incurable because of the early development of drug resistance and metastatic dissemination. Hence, improved therapies that tackle these two processes are urgently needed to improve clinical outcome. We have identified RSK4 as a promoter of drug resistance and metastasis in lung and bladder cancer cells. Silencing this kinase, through either RNA interference or CRISPR, sensitized tumor cells to chemotherapy and hindered metastasis in vitro and in vivo in a tail vein injection model. Drug screening revealed several floxacin antibiotics as potent RSK4 activation inhibitors, and trovafloxacin reproduced all effects of RSK4 silencing in vitro and in/ex vivo using lung cancer xenograft and genetically engineered mouse models and bladder tumor explants. Through x-ray structure determination and Markov transient and Deuterium exchange analyses, we identified the allosteric binding site and revealed how this compound blocks RSK4 kinase activation through binding to an allosteric site and mimicking a kinase autoinhibitory mechanism involving the RSK4's hydrophobic motif. Last, we show that patients undergoing chemotherapy and adhering to prophylactic levofloxacin in the large placebo-controlled randomized phase 3 SIGNIFICANT trial had significantly increased (P = 0.048) long-term overall survival times. Hence, we suggest that RSK4 inhibition may represent an effective therapeutic strategy for treating lung and bladder cancer.

Journal article

Laumann F, von Kuegelgen J, Barahona M, 2021, Kernel two-sample and independence tests for non-stationary random processes, ITISE 2021 (7th International conference on Time Series and Forecasting), Publisher: https://www.mdpi.com/2673-4591/5/1/31, Pages: 1-13

Two-sample and independence tests with the kernel-based MMD and HSIC have shown remarkable results on i.i.d. data and stationary random processes. However, these statistics are not directly applicable to non-stationary random processes, a prevalent form of data in many scientific disciplines. In this work, we extend the application of MMD and HSIC to non-stationary settings by assuming access to independent realisations of the underlying random process. These realisations - in the form of non-stationary time-series measured on the same temporal grid - can then be viewed as i.i.d. samples from a multivariate probability distribution, to which MMD and HSIC can be applied. We further show how to choose suitable kernels over these high-dimensional spaces by maximising the estimated test power with respect to the kernel hyper-parameters. In experiments on synthetic data, we demonstrate superior performance of our proposed approaches in terms of test power when compared to current state-of-the-art functional or multivariate two-sample and independence tests. Finally, we employ our methods on a real socio-economic dataset as an example application.

Conference paper

Clarke JM, Beaney T, Majeed A, Darzi A, Barahona M et al., 2021, Defining Integrated Care Systems Through Patient Data From Referral Networks in the English National Health Service: A Graph-Based Clustering Study.

Background: Integrated Care Systems (ICSs) are being introduced into the National Health Service (NHS) in England to replace Sustainability and Transformation Partnerships (STPs). They aim to improve care through place-based collaboration between primary, secondary and community providers. It is important that new organisational configurations adequately reflect existing patterns of patient care to minimise disruption resulting from patients crossing between ICSs. Methods: All planned outpatient hospital clinic appointments from 1st April 2017 to 31st March 2018 for patients resident in England to NHS hospitals in England were identified from Hospital Episode Statistics. Markov Multiscale Community Detection (MMCD), an unsupervised network clustering technique, was used to identify natural communities of GP practices, hospitals and geographic regions according to patterns of GP practice registration and outpatient clinic referral. Two primary measures of care coverage were calculated; the proportion of patients registered to a GP practice in a different community than they reside, and the proportion of outpatient clinic appointments to hospitals in a different community to the referring GP practice. Results: 109,830,647 outpatient clinic appointments were identified for 20,992,695 patients. A configuration of 42 ICSs was identified from MMCD to match the 42 STPs of the current configuration. In the current STP configuration, 534,946 patients (2.6%) were registered to a GP practice in a different STP than their residence, compared to 334,192 (1.6%) in the optimal MMCD configuration. 16,110,267 hospital clinic appointments (14.7%) occurred in a different STP to the referring GP practice, compared to 11,518,735 (10.5%) in the MMCD c

Journal article

Myall AC, Peach RL, Weiße AY, Davies F, Mookerjee S, Holmes A, Barahona M et al., 2021, Network memory in the movement of hospital patients carrying drug-resistant bacteria, Applied Network Science, Vol: 6, ISSN: 2364-8228

Hospitals constitute highly interconnected systems that bring into contact an abundance of infectious pathogens and susceptible individuals, thus making infection outbreaks both common and challenging. In recent years, there has been a sharp incidence of antimicrobial-resistance amongst healthcare-associated infections, a situation now considered endemic in many countries. Here we present network-based analyses of a data set capturing the movement of patients harbouring drug-resistant bacteria across three large London hospitals. We show that there are substantial memory effects in the movement of hospital patients colonised with drug-resistant bacteria. Such memory effects break first-order Markovian transitive assumptions and substantially alter the conclusions from the analysis, specifically on node rankings and the evolution of diffusive processes. We capture variable length memory effects by constructing a lumped-state memory network, which we then use to identify overlapping communities of wards. We find that these communities of wards display a quasi-hierarchical structure at different levels of granularity which is consistent with different aspects of patient flows related to hospital locations and medical specialties.

Journal article

