Strömich L, Wu N, Barahona M, et al., 2022, Allosteric Hotspots in the Main Protease of SARS-CoV-2., J Mol Biol, Vol: 434
Inhibiting the main protease of SARS-CoV-2 is of great interest in tackling the COVID-19 pandemic caused by the virus. Most efforts have been centred on inhibiting the binding site of the enzyme. However, considering allosteric sites, distant from the active or orthosteric site, broadens the search space for drug candidates and confers the advantages of allosteric drug targeting. Here, we report the allosteric communication pathways in the main protease dimer by using two novel fully atomistic graph-theoretical methods: Bond-to-bond propensity, which has been previously successful in identifying allosteric sites in extensive benchmark data sets without a priori knowledge, and Markov transient analysis, which has previously aided in finding novel drug targets in catalytic protein families. Using statistical bootstrapping, we score the highest ranking sites against random sites at similar distances, and we identify four statistically significant putative allosteric sites as good candidates for alternative drug targeting.
Wu N, Yaliraki SN, Barahona M, 2022, Prediction of Protein Allosteric Signalling Pathways and Functional Residues Through Paths of Optimised Propensity, Journal of Molecular Biology, Vol: 434, Pages: 167749-167749, ISSN: 0022-2836
Myall A, Price JR, Peach RL, et al., 2022, Prediction of hospital-onset COVID-19 infections using dynamic networks of patient contact: an international retrospective cohort study., Lancet Digit Health, Vol: 4, Pages: e573-e583
BACKGROUND: Real-time prediction is key to prevention and control of infections associated with health-care settings. Contacts enable spread of many infections, yet most risk prediction frameworks fail to account for their dynamics. We developed, tested, and internationally validated a real-time machine-learning framework, incorporating dynamic patient-contact networks to predict hospital-onset COVID-19 infections (HOCIs) at the individual level. METHODS: We report an international retrospective cohort study of our framework, which extracted patient-contact networks from routine hospital data and combined network-derived variables with clinical and contextual information to predict individual infection risk. We trained and tested the framework on HOCIs using the data from 51 157 hospital inpatients admitted to a UK National Health Service hospital group (Imperial College Healthcare NHS Trust) between April 1, 2020, and April 1, 2021, intersecting the first two COVID-19 surges. We validated the framework using data from a Swiss hospital group (Department of Rehabilitation, Geneva University Hospitals) during a COVID-19 surge (from March 1 to May 31, 2020; 40 057 inpatients) and from the same UK group after COVID-19 surges (from April 2 to Aug 13, 2021; 43 375 inpatients). All inpatients with a bed allocation during the study periods were included in the computation of network-derived and contextual variables. In predicting patient-level HOCI risk, only inpatients spending 3 or more days in hospital during the study period were examined for HOCI acquisition risk. FINDINGS: The framework was highly predictive across test data with all variable types (area under the curve [AUC]-receiver operating characteristic curve [ROC] 0·89 [95% CI 0·88-0·90]) and similarly predictive using only contact-network variables (0·88 [0·86-0·90]). Prediction was reduced when using only hospital contextual (AUC-ROC 0·82 [95% CI 0·80-0&
August E, Barahona M, 2022, Finding positively invariant sets and proving exponential stability of limit cycles using sum-of-squares decompositions, Journal of Computational Dynamics, ISSN: 2158-2505
The dynamics of many systems from physics, economics, chemistry, and biology can be modelled through polynomial functions. In this paper, we provide a computational means to find positively invariant sets of polynomial dynamical systems by using semidefinite programming to solve sum-of-squares (SOS) programmes. With the emergence of SOS programmes, it is possible to efficiently search for Lyapunov functions that guarantee stability of polynomial systems. Yet, SOS computations often fail to find functions, such that the conditions hold in the entire state space. We show here that restricting the SOS optimisation to specific domains enables us to obtain positively invariant sets, thus facilitating the analysis of the dynamics by considering separately each positively invariant set. In addition, we go beyond classical Lyapunov stability analysis and use SOS decompositions to computationally implement sufficient positivity conditions that guarantee existence, uniqueness, and exponential stability of a limit cycle. Importantly, this approach is applicable to systems of any dimension and, thus, goes beyond classical methods that are restricted to two-dimensional phase space. We illustrate our different results with applications to classical systems, such as the van der Pol oscillator, the Fitzhugh-Nagumo neuronal equation, and the Lorenz system.
Alaa A, Mayer E, Barahona M, 2022, ICE-NODE: Integration of Clinical Embeddings with Neural Ordinary Differential Equations
Early diagnosis of disease can lead to improved health outcomes, including higher survival rates and lower treatment costs. With the massive amount of information available in electronic health records (EHRs), there is great potential to use machine learning (ML) methods to model disease progression aimed at early prediction of disease onset and other outcomes. In this work, we employ recent innovations in neural ODEs combined with rich semantic embeddings of clinical codes to harness the full temporal information of EHRs. We propose ICE-NODE (Integration of Clinical Embeddings with Neural Ordinary Differential Equations), an architecture that temporally integrates embeddings of clinical codes and neural ODEs to learn and predict patient trajectories in EHRs. We apply our method to the publicly available MIMIC-III and MIMIC-IV datasets, and we find improved prediction results compared to state-of-the-art methods, specifically for clinical codes that are not frequently observed in EHRs. We also show that ICE-NODE is more competent at predicting certain medical conditions, like acute renal failure, pulmonary heart disease and birth-related problems, where the full temporal information could provide important information. Furthermore, ICE-NODE is also able to produce patient risk trajectories over time that can be exploited for further detailed predictions of disease evolution.
Rodrigues D, Kreif N, Lawrence-Jones A, et al., 2022, Reflection on modern methods: constructing directed acyclic graphs (DAGs) with domain experts for health services research, International Journal of Epidemiology, Vol: 51, ISSN: 0300-5771
Directed acyclic graphs (DAGs) are a useful tool to represent, in a graphical format, researchers’ assumptions about the causal structure among variables while providing a rationale for the choice of confounding variables to adjust for. With origins in the field of probabilistic graphical modelling, DAGs are yet to be widely adopted in applied health research, where causal assumptions are frequently made for the purpose of evaluating health services initiatives. In this context, there is still limited practical guidance on how to construct and use DAGs. Some progress has recently been made in terms of building DAGs based on studies from the literature, but an area that has received less attention is how to create DAGs from information provided by domain experts, an approach of particular importance when there is limited published information about the intervention under study. This approach offers the opportunity for findings to be more robust and relevant to patients, carers and the public, and more likely to inform policy and clinical practice. This article draws lessons from a stakeholder workshop involving patients, health care professionals, researchers, commissioners and representatives from industry, whose objective was to draw DAGs for a complex intervention—online consultation, i.e. written exchange between the patient and health care professional using an online system—in the context of the English National Health Service. We provide some initial, practical guidance to those interested in engaging with domain experts to develop DAGs.
Peach R, Arnaudon A, Barahona M, 2022, Relative, local and global dimension in complex networks., Nat Commun, Vol: 13
Dimension is a fundamental property of objects and the space in which they are embedded. Yet ideal notions of dimension, as in Euclidean spaces, do not always translate to physical spaces, which can be constrained by boundaries and distorted by inhomogeneities, or to intrinsically discrete systems such as networks. To take into account locality, finiteness and discreteness, dynamical processes can be used to probe the space geometry and define its dimension. Here we show that each point in space can be assigned a relative dimension with respect to the source of a diffusive process, a concept that provides a scale-dependent definition for local and global dimension also applicable to networks. To showcase its application to physical systems, we demonstrate that the local dimension of structural protein graphs correlates with structural flexibility, and the relative dimension with respect to the active site uncovers regions involved in allosteric communication. In simple models of epidemics on networks, the relative dimension is predictive of the spreading capability of nodes, and identifies scales at which the graph structure is predictive of infectivity. We further apply our dimension measures to neuronal networks, economic trade, social networks, ocean flows, and to the comparison of random graphs.
Rodrigues D, Kreif N, Saravanakumar K, et al., 2022, Formalising triage in general practice towards a more equitable, safe, and efficient allocation of resources, BMJ: British Medical Journal, Vol: 377, ISSN: 0959-535X
Sivan M, Greenhalgh T, Darbyshire JL, et al., 2022, LOng COvid Multidisciplinary consortium Optimising Treatments and services acrOss the NHS (LOCOMOTION): protocol for a mixed-methods study in the UK, BMJ OPEN, Vol: 12, ISSN: 2044-6055
Myall A, Price J, Peach R, et al., 2022, Predicting hospital-onset COVID-19 infections using dynamic networks of patient contact: an international retrospective cohort study, The Lancet Digital Health, ISSN: 2589-7500
Background. Real-time prediction is key to prevention and control of healthcare-associated infections. Contacts drive many infections, yet most risk prediction frameworks fail to account for their dynamics. We develop, test and internationally validate a real-time machine learning framework, incorporating dynamic patient contact-networks to predict individual-level hospital-onset COVID-19 infections (HOCIs). Methods. Our framework extracts patient contact-networks from routine hospital data and combines network-derived variables with clinical and contextual information to predict individual infection risk. We train and test the framework on HOCIs using 51,157 hospital inpatients admitted to a UK National Health Service hospital group across the first two COVID-19 surges. We validate the framework using data from a Swiss hospital group during a COVID-19 surge (40,057 inpatients), and from the same UK group post COVID-19 surges (43,375 inpatients). Findings. The framework was highly predictive across test data using all variable types (AUC-ROC 0·89 [0·88-0·90]) and similarly predictive using only contact-network variables (AUC-ROC 0·88 [0·86-0·90]). Prediction was reduced when using only hospital contextual (AUC-ROC 0·82 [0·80-0·84]) or patient clinical (AUC-ROC 0·64 [0·62-0·66]) variables. A model with only three variables (network closeness, direct contacts with infectious patients (network-derived), plus hospital COVID-19 prevalence (hospital-contextual)) achieved AUC-ROC 0·85 [0·82-0·88]. Incorporating contact-network variables improved performance across both validation datasets (Geneva: AUC-ROC increased from 0·84 [0·82–0·86] to 0·88 [0·86–0·90]; UK-post-surges: AUC-ROC increased from 0·52 [0·49–0·53] to 0·68 [0·64-0·70]). Interpretation. Dynamic contact-networks ar
Saxena D, Arnaudon A, Cipolato O, et al., 2022, Sensitivity and spectral control of network lasers
Recently, random lasing in complex networks has shown efficient lasing over more than 50 localised modes, promoted by multiple scattering over the underlying graph. If controlled, these network lasers can lead to fast-switching multifunctional light sources with synthesised spectrum. Here, we observe both in experiment and theory high sensitivity of the network laser to the spatial shape of the pump profile, with mode intensity variation of up to 280% for a non-homogeneous 7% pump decrease. We solve the nonlinear equations within the steady state ab-initio laser theory (SALT) approximation over a graph and we show selective lasing of around 90% of the top modes, effectively programming the spectrum of the lasing networks. In our experiments with polymer networks, this high sensitivity enables control of the lasing spectrum through non-uniform pump patterns. We propose the underlying complexity of the network modes as the key element behind efficient spectral control opening the way for the development of optical devices with wide impact for on-chip photonics for communication, sensing and computation.
Freischem LJ, Barahona M, Oyarzún DA, 2022, Prediction of gene essentiality using machine learning and genome-scale metabolic models
The identification of essential genes, i.e. those that impair cell survival when deleted, requires large growth assays of knock-out strains. The complexity and cost of such experiments has triggered a growing interest in computational methods for gene essentiality prediction. In the case of metabolic genes, Flux Balance Analysis (FBA) is widely employed to predict essentiality under the assumption that cells maximize their growth rate. However, this approach implicitly assumes that knock-out strains optimize the same objectives as the wild-type, which excludes cases in which deletions cause large changes in cell physiology to meet other objectives for survival. Here we resolve this limitation with a novel machine learning approach that predicts essentiality directly from wild-type flux distributions. We first project the wild-type FBA solution onto a mass flow graph, a digraph with reactions as nodes and edge weights proportional to the mass transfer between reactions, and then train binary classifiers on the connectivity of graph nodes. We demonstrate the efficacy of this approach using the most complete metabolic model of Escherichia coli, achieving near state-of-the-art prediction accuracy for essential genes. Our approach suggests that wild-type FBA solutions contain enough information to predict essentiality, without the need to assume optimality of deletion strains.
Chrysostomou S, Roy R, Prischi F, et al., 2022, Re: Repurposed Floxacins Targeting RSK4 Prevent Chemoresistance and Metastasis in Lung and Bladder Cancer, JOURNAL OF UROLOGY, Vol: 207, Pages: 919-920, ISSN: 0022-5347
Qian Y, Expert P, Rieu T, et al., 2022, Quantifying the alignment of graph and features in deep learning, IEEE Transactions on Neural Networks and Learning Systems, Vol: 33, Pages: 1663-1672, ISSN: 1045-9227
We show that the classification performance of graph convolutional networks (GCNs) is related to the alignment between features, graph, and ground truth, which we quantify using a subspace alignment measure (SAM) corresponding to the Frobenius norm of the matrix of pairwise chordal distances between three subspaces associated with features, graph, and ground truth. The proposed measure is based on the principal angles between subspaces and has both spectral and geometrical interpretations. We showcase the relationship between the SAM and the classification performance through the study of limiting cases of GCNs and systematic randomizations of both features and graph structure applied to a constructive example and several examples of citation networks of different origins. The analysis also reveals the relative importance of the graph and features for classification purposes.
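The quantity described in this abstract (a Frobenius norm of pairwise chordal distances between three subspaces) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the three input matrices are assumed to be full-column-rank bases for the feature, graph and ground-truth subspaces (how those bases are built is not specified here), and the chordal distance is taken in its common form, the square root of the sum of squared sines of the principal angles.

```python
import numpy as np


def chordal_distance(A, B):
    """Chordal distance between the column spaces of A and B.

    Assumes A and B have full column rank (illustrative assumption).
    """
    # Orthonormal bases for the two column spaces.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against round-off above 1
    # sqrt(sum sin^2(theta_i)), using sin^2 = 1 - cos^2.
    return float(np.sqrt(np.sum(1.0 - cosines**2)))


def subspace_alignment_measure(X, G, Y):
    """Frobenius norm of the 3x3 matrix of pairwise chordal distances.

    X, G, Y are hypothetical names for bases of the feature, graph and
    ground-truth subspaces respectively.
    """
    subspaces = [X, G, Y]
    D = np.array([[chordal_distance(P, Q) for Q in subspaces]
                  for P in subspaces])
    return float(np.linalg.norm(D, "fro"))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))
    G = rng.normal(size=(20, 3))
    Y = rng.normal(size=(20, 3))
    print("SAM for random subspaces:", subspace_alignment_measure(X, G, Y))
    print("SAM for identical subspaces:", subspace_alignment_measure(X, X, X))
```

In this sketch a smaller value means the three subspaces are closer to one another; three copies of the same subspace give a value of zero up to round-off.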
Laumann F, von Kuegelgen J, Kanashiro Uehara TH, et al., 2022, Quantitative assessment of complex interlinkages, key objectives and nexuses amongst the Sustainable Development Goals and climate change, The Lancet Planetary Health, Vol: 6, ISSN: 2542-5196
Background. Global sustainability is an enmeshed system of complex socio-economic, climatological and ecological interactions. The numerous objectives of the United Nations’ Sustainable Development Goals (SDGs) and the Paris Agreement have various levels of interdependence, making it difficult to ascertain the influence of changes in particular indicators across the whole system. Methods. We present a method to find interlinkages amongst the 17 SDGs and climate change, including non-linear and non-monotonic dependences, by using 400 indicators that track their temporal changes. The method detects statistically significant dependencies amongst the time evolution of the objectives by using partial distance correlations, a non-linear measure of conditional dependence that also discounts spurious correlations originating from lurking variables. We then employ a network representation to identify the most important objectives (using network centrality) and to obtain nexuses of objectives (defined as highly interconnected clusters in the network). Findings. Using temporal data from 181 countries spanning 20 years, we analyse dependencies amongst SDGs and climate for 35 country groupings based on region, development and income level. Our results show that the significant interlinkages, central objectives, and nexuses identified vary greatly across country groupings, yet partnerships for the goals (SDG 17) and climate change rank as highly important across many country groupings. Temperature rise is strongly linked to urbanisation, air pollution, and slum expansion (SDG 11), especially in country groupings likely to be worst affected by climate breakdown such as Africa. In several groupings encompassing the developing countries, a consistent nexus of strongly interconnected objectives is formed by poverty reduction (SDG 1), education (SDG 4), and economic growth (SDG 8), sometimes incorporating gender equality (SDG 5), and peace and justice (SDG 16). Interpretation. The
Myall A, Price J, Peach R, et al., 2022, Prediction of hospital-onset COVID-19 using networks of patient contact: an observational study, IMED conference, Publisher: ELSEVIER SCI LTD, Pages: S109-S110, ISSN: 1201-9712
Myall A, Peach R, Wan Y, et al., 2022, Improved contact tracing using network analysis and spatial-temporal proximity, IMED conference, Publisher: ELSEVIER SCI LTD, Pages: S20-S20, ISSN: 1201-9712
Schindler D, Clarke J, Barahona M, 2022, Multiscale mobility patterns and the restriction of human mobility under lockdown
Strict lockdown measures have been put in place in many countries around the world to constrain human mobility in response to the unparalleled challenges posed by the COVID-19 pandemic. Here we apply network-theoretic tools to analyse a geolocalised dataset of human mobility of 16 million UK Facebook users from March to July 2020. Special emphasis lies on dynamical perspectives of network analysis, and multi-scale community detection with Markov Stability analysis is performed to identify signatures of the mobility contraction in the UK. To this end, a new quantitative criterion for scale selection in Markov Stability analysis is proposed, which reveals different scales of mobility in a semi-automated manner. The analysis of the UK mobility network reveals a pronounced decline of human mobility under COVID-19 and suggests that local community structure has been strengthened under lockdown. In particular, human mobility does not follow purely geographic and administrative lines, but the flow-based approach allows for the identification of intrinsic mobility patterns that may inform future interventions to prevent COVID-19 transmission.
Beaney T, Clarke J, Woodcock T, et al., 2021, Patterns of healthcare utilisation in children and young people: a retrospective cohort study using routinely collected healthcare data in Northwest London, BMJ Open, Vol: 11, Pages: 1-14, ISSN: 2044-6055
Objectives: With a growing role for health services in managing population health, there is a need for early identification of populations with high need. Segmentation approaches partition the population based on demographics, long-term conditions (LTCs) or healthcare utilisation but have mostly been applied to adults. Our study uses segmentation methods to distinguish patterns of healthcare utilisation in children and young people (CYP) and to explore predictors of segment membership. Design: Retrospective cohort study. Setting: Routinely collected primary and secondary healthcare data in Northwest London from the Discover database. Participants: 378,309 CYP aged 0-15 years registered to a general practice in Northwest London with one full year of follow-up. Primary and secondary outcome measures: Assignment of each participant to a segment defined by seven healthcare variables representing primary and secondary care attendances, and description of utilisation patterns by segment. Predictors of segment membership described by age, sex, ethnicity, deprivation and LTCs. Results: Participants were grouped into six segments based on healthcare utilisation. Three segments predominantly used primary care; two moderate utilisation segments differed in use of emergency or elective care, and a high utilisation segment, representing 16,632 (4.4%) children accounted for the highest mean presentations across all service types. The two smallest segments, representing 13.3% of the population, accounted for 62.5% of total costs. Younger age, residence in areas of higher deprivation, and presence of one or more LTCs were associated with membership of higher utilisation segments, but 75.0% of those in the highest utilisation segment had no LTC. Conclusions: This article identifies six segments of healthcare utilisation in CYP and predictors of segment membership. Demographics and LTCs may not explain utilisation patterns as strongly as in adults which may limit the use of routine data in predicting ut
Liu Z, Peach R, Lawrance E, et al., 2021, Listening to mental health crisis needs at scale: using Natural Language Processing to understand and evaluate a mental health crisis text messaging service, Frontiers in Digital Health, Vol: 3, Pages: 1-14, ISSN: 2673-253X
The current mental health crisis is a growing public health issue requiring a large-scale response that cannot be met with traditional services alone. Digital support tools are proliferating, yet most are not systematically evaluated, and we know little about their users and their needs. Shout is a free mental health text messaging service run by the charity Mental Health Innovations, which provides support for individuals in the UK experiencing mental or emotional distress and seeking help. Here we study a large data set of anonymised text message conversations and post-conversation surveys compiled through Shout. This data provides an opportunity to hear at scale from those experiencing distress; to better understand mental health needs for people not using traditional mental health services; and to evaluate the impact of a novel form of crisis support. We use natural language processing (NLP) to assess the adherence of volunteers to conversation techniques and formats, and to gain insight into demographic user groups and their behavioural expressions of distress. Our textual analyses achieve accurate classification of conversation stages (weighted accuracy = 88%), behaviours (1-hamming loss = 95%) and texter demographics (weighted accuracy = 96%), exemplifying how the application of NLP to frontline mental health data sets can aid with post-hoc analysis and evaluation of quality of service provision in digital mental health services.
Liu Z, Barahona M, 2021, Similarity measure for sparse time course data based on Gaussian processes, Uncertainty in Artificial Intelligence 2021, Publisher: PMLR, Pages: 1332-1341
We propose a similarity measure for sparsely sampled time course data in the form of a log-likelihood ratio of Gaussian processes (GP). The proposed GP similarity is similar to a Bayes factor and provides enhanced robustness to noise in sparse time series, such as those found in various biological settings, e.g., gene transcriptomics. We show that the GP measure is equivalent to the Euclidean distance when the noise variance in the GP is negligible compared to the noise variance of the signal. Our numerical experiments on both synthetic and real data show improved performance of the GP similarity when used in conjunction with two distance-based clustering methods.
Ming DK, Myall AC, Hernandez B, et al., 2021, Informing antimicrobial management in the context of COVID-19: understanding the longitudinal dynamics of C-reactive protein and procalcitonin, BMC Infectious Diseases, Vol: 21
Background: To characterise the longitudinal dynamics of C-reactive protein (CRP) and Procalcitonin (PCT) in a cohort of hospitalised patients with COVID-19 and support antimicrobial decision-making. Methods: Longitudinal CRP and PCT concentrations and trajectories of 237 hospitalised patients with COVID-19 were modelled. The dataset comprised 2,021 data points for CRP and 284 points for PCT. Pairwise comparisons were performed between: (i) those with or without significant bacterial growth from cultures, and (ii) those who survived or died in hospital. Results: CRP concentrations were higher over time in COVID-19 patients with positive microbiology (day 9: 236 vs 123 mg/L, p < 0.0001) and in those who died (day 8: 226 vs 152 mg/L, p < 0.0001) but only after day 7 of COVID-related symptom onset. Failure of CRP to reduce in the first week of hospital admission was associated with significantly higher odds of death. PCT concentrations were higher in patients with COVID-19 and positive microbiology or in those who died, although these differences were not statistically significant. Conclusions: Both the absolute CRP concentration and the trajectory during the first week of hospital admission are important factors predicting microbiology culture positivity and outcome in patients hospitalised with COVID-19. Further work is needed to describe the role of PCT for co-infection. Understanding relationships of these biomarkers can support development of risk models and inform optimal antimicrobial strategies.
Boonyasiri A, Myall AC, Wan Y, et al., 2021, Integrated patient network and genomic plasmid analysis reveal a regional, multi-species outbreak of carbapenemase-producing Enterobacterales carrying both blaIMP and mcr-9 genes
The incidence of carbapenemase-producing Enterobacterales (CPE) is rising globally, yet Imipenemase (IMP) carbapenemases remain relatively rare. This study describes an investigation of the emergence of IMP-encoding CPE amongst diverse Enterobacterales species between 2016 and 2019 in patients across a London regional hospital network. A network analysis approach to patient pathways, using routinely collected electronic health records, identified previously unrecognised contacts between patients who were IMP CPE positive on screening, implying potential bacterial transmission events. Whole genome sequencing of 85 Enterobacterales isolates from these patients revealed that 86% (73/85) were diverse species (predominantly Klebsiella spp, Enterobacter spp, E. coli) and harboured an IncHI2 plasmid, which carried both blaIMP and the putative mobile colistin resistance gene mcr-9. Detailed phylogenetic analysis identified two distinct IncHI2 plasmid lineages, A and B, both of which showed significant association with patient movements between four hospital sites and across medical specialities. Combined, our patient network and plasmid analyses demonstrate an interspecies, plasmid-mediated outbreak of blaIMP CPE, which remained unidentified during standard microbiology and infection control investigations. With whole genome sequencing (WGS) technologies and large-data incorporation, the outbreak investigation approach proposed here provides a framework for real-time identification of key factors causing pathogen spread. Analysing outbreaks at the plasmid level reveal
Maes A, Barahona M, Clopath C, 2021, Long- and short-term history effects in a spiking network model of statistical learning
The statistical structure of the environment is often important when making decisions. There are multiple theories of how the brain represents statistical structure. One such theory states that neural activity spontaneously samples from probability distributions. In other words, the network spends more time in states which encode high-probability stimuli. Existing spiking network models implementing sampling lack the ability to learn the statistical structure from observed stimuli and instead often hard-code a dynamics. Here, we focus on how arbitrary prior knowledge about the external world can both be learned and spontaneously recollected. We present a model based upon learning the inverse of the cumulative distribution function. Learning is entirely unsupervised using biophysical neurons and biologically plausible learning rules. We show how this prior knowledge can then be accessed to compute expectations and signal surprise in downstream networks. Sensory history effects emerge from the model as a consequence of ongoing learning.
Single-cell RNA sequencing (scRNA-seq) data sets consist of high-dimensional, sparse and noisy feature vectors, and pose a challenge for classic methods for dimensionality reduction. Such problems are compounded when dealing with composite data sets formed by different batches. We introduce Integrative Hierarchical Poisson Factorisation (IHPF), an extension of HPF that makes use of a noise ratio hyper-parameter to tune the variability attributed to batches vs. biological sources (cell phenotypes). We exemplify the application of IHPF under different data integration scenarios with varying alignments of batches and cell diversity, and show that IHPF produces latent factors that can be advantageously applied for cell clustering and visualisation. In addition, the extracted factors have a dual block structure in both cell and gene spaces with enhanced biological interpretability.
Mersmann S, Stromich L, Song F, et al., 2021, ProteinLens: a web-based application for the analysis of allosteric signalling on atomistic graphs of biomolecules, Nucleic Acids Research, Vol: 49, Pages: W551-W558, ISSN: 0305-1048
The investigation of allosteric effects in biomolecular structures is of great current interest in diverse areas, from fundamental biological enquiry to drug discovery. Here we present ProteinLens, a user-friendly and interactive web application for the investigation of allosteric signalling based on atomistic graph-theoretical methods. Starting from the PDB file of a biomolecule (or a biomolecular complex) ProteinLens obtains an atomistic, energy-weighted graph description of the structure of the biomolecule, and subsequently provides a systematic analysis of allosteric signalling and communication across the structure using two computationally efficient methods: Markov Transients and bond-to-bond propensities. ProteinLens scores and ranks every bond and residue according to the speed and magnitude of the propagation of fluctuations emanating from any site of choice (e.g. the active site). The results are presented through statistical quantile scores visualised with interactive plots and adjustable 3D structure viewers, which can also be downloaded. ProteinLens thus allows the investigation of signalling in biomolecular structures of interest to aid the detection of allosteric sites and pathways. ProteinLens is implemented in Python/SQL and freely available to use at: www.proteinlens.io.
Chrysostomou S, Roy R, Prischi F, et al., 2021, Repurposed floxacins targeting RSK4 prevent chemoresistance and metastasis in lung and bladder cancer., Science Translational Medicine, Vol: 13, ISSN: 1946-6234
Lung and bladder cancers are mostly incurable because of the early development of drug resistance and metastatic dissemination. Hence, improved therapies that tackle these two processes are urgently needed to improve clinical outcome. We have identified RSK4 as a promoter of drug resistance and metastasis in lung and bladder cancer cells. Silencing this kinase, through either RNA interference or CRISPR, sensitized tumor cells to chemotherapy and hindered metastasis in vitro and in vivo in a tail vein injection model. Drug screening revealed several floxacin antibiotics as potent RSK4 activation inhibitors, and trovafloxacin reproduced all effects of RSK4 silencing in vitro and in/ex vivo using lung cancer xenograft and genetically engineered mouse models and bladder tumor explants. Through x-ray structure determination and Markov transient and Deuterium exchange analyses, we identified the allosteric binding site and revealed how this compound blocks RSK4 kinase activation through binding to an allosteric site and mimicking a kinase autoinhibitory mechanism involving the RSK4's hydrophobic motif. Last, we show that patients undergoing chemotherapy and adhering to prophylactic levofloxacin in the large placebo-controlled randomized phase 3 SIGNIFICANT trial had significantly increased (P = 0.048) long-term overall survival times. Hence, we suggest that RSK4 inhibition may represent an effective therapeutic strategy for treating lung and bladder cancer.
Laumann F, von Kuegelgen J, Barahona M, 2021, Kernel two-sample and independence tests for non-stationary random processes, ITISE 2021 (7th International conference on Time Series and Forecasting), Publisher: https://www.mdpi.com/2673-4591/5/1/31, Pages: 1-13
Two-sample and independence tests with the kernel-based MMD and HSIC have shown remarkable results on i.i.d. data and stationary random processes. However, these statistics are not directly applicable to non-stationary random processes, a prevalent form of data in many scientific disciplines. In this work, we extend the application of MMD and HSIC to non-stationary settings by assuming access to independent realisations of the underlying random process. These realisations - in the form of non-stationary time-series measured on the same temporal grid - can then be viewed as i.i.d. samples from a multivariate probability distribution, to which MMD and HSIC can be applied. We further show how to choose suitable kernels over these high-dimensional spaces by maximising the estimated test power with respect to the kernel hyper-parameters. In experiments on synthetic data, we demonstrate superior performance of our proposed approaches in terms of test power when compared to current state-of-the-art functional or multivariate two-sample and independence tests. Finally, we employ our methods on a real socio-economic dataset as an example application.
Myall AC, Peach RL, Weiße AY, et al., 2021, Network memory in the movement of hospital patients carrying drug-resistant bacteria, Applied Network Science, Vol: 6, ISSN: 2364-8228
Hospitals constitute highly interconnected systems that bring into contact an abundance of infectious pathogens and susceptible individuals, thus making infection outbreaks both common and challenging. In recent years, there has been a sharp incidence of antimicrobial-resistance amongst healthcare-associated infections, a situation now considered endemic in many countries. Here we present network-based analyses of a data set capturing the movement of patients harbouring drug-resistant bacteria across three large London hospitals. We show that there are substantial memory effects in the movement of hospital patients colonised with drug-resistant bacteria. Such memory effects break first-order Markovian transitive assumptions and substantially alter the conclusions from the analysis, specifically on node rankings and the evolution of diffusive processes. We capture variable length memory effects by constructing a lumped-state memory network, which we then use to identify overlapping communities of wards. We find that these communities of wards display a quasi-hierarchical structure at different levels of granularity which is consistent with different aspects of patient flows related to hospital locations and medical specialties.
Saavedra-Garcia P, Roman-Trufero M, Al-Sadah HA, et al., 2021, Systems level profiling of chemotherapy-induced stress resolution in cancer cells reveals druggable trade-offs, Proceedings of the National Academy of Sciences of USA, Vol: 118, ISSN: 0027-8424
Cancer cells can survive chemotherapy-induced stress, but how they recover from it is not known. Using a temporal multiomics approach, we delineate the global mechanisms of proteotoxic stress resolution in multiple myeloma cells recovering from proteasome inhibition. Our observations define layered and protracted programmes for stress resolution that encompass extensive changes across the transcriptome, proteome, and metabolome. Cellular recovery from proteasome inhibition involved protracted and dynamic changes of glucose and lipid metabolism and suppression of mitochondrial function. We demonstrate that recovering cells are more vulnerable to specific insults than acutely stressed cells and identify the general control nonderepressable 2 (GCN2)-driven cellular response to amino acid scarcity as a key recovery-associated vulnerability. Using a transcriptome analysis pipeline, we further show that GCN2 is also a stress-independent bona fide target in transcriptional signature-defined subsets of solid cancers that share molecular characteristics. Thus, identifying cellular trade-offs tied to the resolution of chemotherapy-induced stress in tumour cells may reveal new therapeutic targets and routes for cancer therapy optimisation.