Ming DKY, Myall A, Hernandez B, et al., 2021, Informing antimicrobial management in the context of COVID-19: understanding the longitudinal dynamics of C-reactive protein and procalcitonin, BMC Infectious Diseases, Vol: 21, ISSN: 1471-2334
Background: To characterise the longitudinal dynamics of C-reactive protein (CRP) and Procalcitonin (PCT) in a cohort of hospitalised patients with COVID-19 and support antimicrobial decision-making. Methods: Longitudinal CRP and PCT concentrations and trajectories of 237 hospitalised patients with COVID-19 were modelled. The dataset comprised 2,021 data points for CRP and 284 points for PCT. Pairwise comparisons were performed between: (i) those with or without significant bacterial growth from cultures, and (ii) those who survived or died in hospital. Results: CRP concentrations were higher over time in COVID-19 patients with positive microbiology (day 9: 236 vs 123 mg/L, p < 0.0001) and in those who died (day 8: 226 vs 152 mg/L, p < 0.0001) but only after day 7 of COVID-related symptom onset. Failure of CRP to fall in the first week of hospital admission was associated with significantly higher odds of death. PCT concentrations were higher in patients with COVID-19 and positive microbiology or in those who died, although these differences were not statistically significant. Conclusions: Both the absolute CRP concentration and the trajectory during the first week of hospital admission are important factors predicting microbiology culture positivity and outcome in patients hospitalised with COVID-19. Further work is needed to describe the role of PCT for co-infection. Understanding relationships of these biomarkers can support development of risk models and inform optimal antimicrobial strategies.
Single-cell RNA sequencing (scRNA-seq) data sets consist of high-dimensional, sparse and noisy feature vectors, and pose a challenge for classic methods for dimensionality reduction. We show that application of Hierarchical Poisson Factorisation (HPF) to scRNA-seq data produces robust factors, and outperforms other popular methods. To account for batch variability in composite data sets, we introduce Integrative Hierarchical Poisson Factorisation (IHPF), an extension of HPF that makes use of a noise ratio hyper-parameter to tune the variability attributed to technical (batches) vs. biological (cell phenotypes) sources. We exemplify the advantageous application of IHPF under data integration scenarios with varying alignments of technical noise and cell diversity, and show that IHPF produces latent factors with a dual block structure in both cell and gene spaces for enhanced biological interpretability.
Mersmann S, Stromich L, Song F, et al., 2021, ProteinLens: a web-based application for the analysis of allosteric signalling on atomistic graphs of biomolecules, Nucleic Acids Research, Vol: 49, Pages: W551-W558, ISSN: 0305-1048
The investigation of allosteric effects in biomolecular structures is of great current interest in diverse areas, from fundamental biological enquiry to drug discovery. Here we present ProteinLens, a user-friendly and interactive web application for the investigation of allosteric signalling based on atomistic graph-theoretical methods. Starting from the PDB file of a biomolecule (or a biomolecular complex) ProteinLens obtains an atomistic, energy-weighted graph description of the structure of the biomolecule, and subsequently provides a systematic analysis of allosteric signalling and communication across the structure using two computationally efficient methods: Markov Transients and bond-to-bond propensities. ProteinLens scores and ranks every bond and residue according to the speed and magnitude of the propagation of fluctuations emanating from any site of choice (e.g. the active site). The results are presented through statistical quantile scores visualised with interactive plots and adjustable 3D structure viewers, which can also be downloaded. ProteinLens thus allows the investigation of signalling in biomolecular structures of interest to aid the detection of allosteric sites and pathways. ProteinLens is implemented in Python/SQL and freely available to use at: www.proteinlens.io.
Chrysostomou S, Roy R, Prischi F, et al., 2021, Repurposed floxacins targeting RSK4 prevent chemoresistance and metastasis in lung and bladder cancer, Science Translational Medicine, Vol: 13, ISSN: 1946-6234
Lung and bladder cancers are mostly incurable because of the early development of drug resistance and metastatic dissemination. Hence, improved therapies that tackle these two processes are urgently needed to improve clinical outcome. We have identified RSK4 as a promoter of drug resistance and metastasis in lung and bladder cancer cells. Silencing this kinase, through either RNA interference or CRISPR, sensitized tumor cells to chemotherapy and hindered metastasis in vitro and in vivo in a tail vein injection model. Drug screening revealed several floxacin antibiotics as potent RSK4 activation inhibitors, and trovafloxacin reproduced all effects of RSK4 silencing in vitro and in/ex vivo using lung cancer xenograft and genetically engineered mouse models and bladder tumor explants. Through x-ray structure determination and Markov transient and deuterium exchange analyses, we identified the allosteric binding site and revealed how this compound blocks RSK4 kinase activation by mimicking a kinase autoinhibitory mechanism involving the RSK4 hydrophobic motif. Last, we show that patients undergoing chemotherapy and adhering to prophylactic levofloxacin in the large placebo-controlled randomized phase 3 SIGNIFICANT trial had significantly increased (P = 0.048) long-term overall survival times. Hence, we suggest that RSK4 inhibition may represent an effective therapeutic strategy for treating lung and bladder cancer.
Laumann F, Kügelgen JV, Barahona M, 2021, Kernel two-sample and independence tests for non-stationary random processes, ITISE 2021 (7th International conference on Time Series and Forecasting), Publisher: https://www.mdpi.com/2673-4591/5/1/31, Pages: 1-13
Two-sample and independence tests with the kernel-based MMD and HSIC have shown remarkable results on i.i.d. data and stationary random processes. However, these statistics are not directly applicable to non-stationary random processes, a prevalent form of data in many scientific disciplines. In this work, we extend the application of MMD and HSIC to non-stationary settings by assuming access to independent realisations of the underlying random process. These realisations - in the form of non-stationary time-series measured on the same temporal grid - can then be viewed as i.i.d. samples from a multivariate probability distribution, to which MMD and HSIC can be applied. We further show how to choose suitable kernels over these high-dimensional spaces by maximising the estimated test power with respect to the kernel hyper-parameters. In experiments on synthetic data, we demonstrate superior performance of our proposed approaches in terms of test power when compared to current state-of-the-art functional or multivariate two-sample and independence tests. Finally, we employ our methods on a real socio-economic dataset as an example application.
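The core idea of the abstract above (treating each realisation of a time series on a shared grid as one multivariate sample and applying MMD) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel, the fixed `bandwidth` argument, the permutation test, and all function names are assumptions made for illustration. In particular, the abstract selects kernel hyper-parameters by maximising estimated test power, which this sketch does not attempt.

```python
import numpy as np


def gaussian_gram(x, y, bandwidth):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Guard against tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(x, y, bandwidth):
    """Unbiased estimate of squared MMD.

    Each row of x and y is one realisation of the process, i.e. a whole
    time series observed on the same temporal grid.
    """
    m, n = len(x), len(y)
    kxx = gaussian_gram(x, x, bandwidth)
    kyy = gaussian_gram(y, y, bandwidth)
    kxy = gaussian_gram(x, y, bandwidth)
    # Exclude the diagonal terms k(x_i, x_i) for the unbiased estimator.
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return float(term_x + term_y - 2.0 * kxy.mean())


def permutation_p_value(x, y, bandwidth, n_permutations=200, seed=0):
    """Permutation p-value for the null hypothesis of equal distributions."""
    rng = np.random.default_rng(seed)
    observed = mmd2_unbiased(x, y, bandwidth)
    pooled = np.vstack([x, y])
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        x_perm = pooled[perm[: len(x)]]
        y_perm = pooled[perm[len(x):]]
        if mmd2_unbiased(x_perm, y_perm, bandwidth) >= observed:
            exceed += 1
    return (exceed + 1) / (n_permutations + 1)
```

Here `x` and `y` would be arrays of shape (number of realisations, number of time points); the bandwidth is left to the caller because the sketch makes no claim about how it should be tuned.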
Peach R, Arnaudon A, Barahona M, 2021, Relative, local and global dimension in complex networks
Dimension is a fundamental property of objects and the space in which they are embedded. Yet ideal notions of dimension, as in Euclidean spaces, do not always translate to physical spaces, which can be constrained by boundaries and distorted by inhomogeneities, or to intrinsically discrete systems such as networks. To take into account locality, finiteness and discreteness, dynamical processes can be used to probe the space geometry and define its dimension. Here we show that each point in space can be assigned a relative dimension with respect to the source of a diffusive process, a concept that provides a scale-dependent definition for local and global dimension also applicable to networks. To showcase its application to physical systems, we demonstrate that the local dimension of structural protein graphs correlates with structural flexibility, and the relative dimension with respect to the active site uncovers regions involved in allosteric communication. In simple models of epidemics on networks, the relative dimension is predictive of the spreading capability of nodes, and identifies scales at which the graph structure is predictive of infectivity.
Myall AC, Peach RL, Weiße AY, et al., 2021, Network memory in the movement of hospital patients carrying drug-resistant bacteria, Applied Network Science, Vol: 6, ISSN: 2364-8228
Hospitals constitute highly interconnected systems that bring into contact an abundance of infectious pathogens and susceptible individuals, thus making infection outbreaks both common and challenging. In recent years, there has been a sharp rise in the incidence of antimicrobial resistance amongst healthcare-associated infections, a situation now considered endemic in many countries. Here we present network-based analyses of a data set capturing the movement of patients harbouring drug-resistant bacteria across three large London hospitals. We show that there are substantial memory effects in the movement of hospital patients colonised with drug-resistant bacteria. Such memory effects break first-order Markovian transitive assumptions and substantially alter the conclusions from the analysis, specifically on node rankings and the evolution of diffusive processes. We capture variable length memory effects by constructing a lumped-state memory network, which we then use to identify overlapping communities of wards. We find that these communities of wards display a quasi-hierarchical structure at different levels of granularity which is consistent with different aspects of patient flows related to hospital locations and medical specialties.
Saavedra-Garcia P, Roman-Trufero M, Al-Sadah HA, et al., 2021, Systems level profiling of chemotherapy-induced stress resolution in cancer cells reveals druggable trade-offs, Proceedings of the National Academy of Sciences of USA, Vol: 118, ISSN: 0027-8424
Cancer cells can survive chemotherapy-induced stress, but how they recover from it is not known. Using a temporal multiomics approach, we delineate the global mechanisms of proteotoxic stress resolution in multiple myeloma cells recovering from proteasome inhibition. Our observations define layered and protracted programmes for stress resolution that encompass extensive changes across the transcriptome, proteome, and metabolome. Cellular recovery from proteasome inhibition involved protracted and dynamic changes of glucose and lipid metabolism and suppression of mitochondrial function. We demonstrate that recovering cells are more vulnerable to specific insults than acutely stressed cells and identify the general control nonderepressible 2 (GCN2)-driven cellular response to amino acid scarcity as a key recovery-associated vulnerability. Using a transcriptome analysis pipeline, we further show that GCN2 is also a stress-independent bona fide target in transcriptional signature-defined subsets of solid cancers that share molecular characteristics. Thus, identifying cellular trade-offs tied to the resolution of chemotherapy-induced stress in tumour cells may reveal new therapeutic targets and routes for cancer therapy optimisation.
Myall A, Peach RL, Wan Y, et al., 2021, Characterising contact in disease outbreaks via a network model of spatial-temporal proximity
Contact tracing is a key tool in epidemiology to identify and control outbreaks of infectious diseases. Existing contact tracing methodologies produce contact maps of individuals based on a binary definition of contact which can be hampered by missing data and indirect contacts. Here, we present a Spatial-temporal Epidemiological Proximity (StEP) model to recover contact maps in disease outbreaks based on movement data. The StEP model accounts for imperfect data by considering probabilistic contacts between individuals based on spatial-temporal proximity of their movement trajectories, creating a robust movement network despite possible missing data and unseen transmission routes. Using real-world data we showcase the potential of StEP for contact tracing with outbreaks of multidrug-resistant bacteria and COVID-19 in a large hospital group in London, UK. In addition to the core structure of contacts that can be recovered using traditional methods of contact tracing, the StEP model reveals missing contacts that connect seemingly separate outbreaks. Comparison with genomic data further confirmed that these recovered contacts indeed improve characterisation of disease transmission and so highlights how the StEP framework can inform effective strategies of infection control and prevention.
Qian Y, Expert P, Panzarasa P, et al., 2021, Geometric graphs from data to aid classification tasks with Graph Convolutional Networks, Patterns, Vol: 2, Pages: 100237-100237, ISSN: 2666-3899
Peach RL, Arnaudon A, Schmidt JA, et al., 2021, HCGA: Highly comparative graph analysis for network phenotyping, Patterns, Vol: 2, Pages: 100227-100227, ISSN: 2666-3899
Networks are widely used as mathematical models of complex systems across many scientific disciplines, not only in biology and medicine but also in the social sciences, physics, computing and engineering. Decades of work have produced a vast corpus of research characterising the topological, combinatorial, statistical and spectral properties of graphs. Each graph property can be thought of as a feature that captures important (and sometimes overlapping) characteristics of a network. In the analysis of real-world graphs, it is crucial to integrate systematically a large number of diverse graph features in order to characterise and classify networks, as well as to aid network-based scientific discovery. In this paper, we introduce HCGA, a framework for highly comparative analysis of graph data sets that computes several thousands of graph features from any given network. HCGA also offers a suite of statistical learning and data analysis tools for automated identification and selection of important and interpretable features underpinning the characterisation of graph data sets. We show that HCGA outperforms other methodologies on supervised classification tasks on benchmark data sets whilst retaining the interpretability of network features. We also illustrate how HCGA can be used for network-based discovery through two examples where data is naturally represented as graphs: the clustering of a data set of images of neuronal morphologies, and a regression problem to predict charge transfer in organic semiconductors based on their structure. HCGA is an open platform that can be expanded to include further graph properties and statistical learning tools to allow researchers to leverage the wide breadth of graph-theoretical research to quantitatively analyse and draw insights from network data.
Maes A, Barahona M, Clopath C, 2021, Learning compositional sequences with multiple time scales through a hierarchical network of spiking neurons, PLoS Computational Biology, Vol: 17, ISSN: 1553-734X
Sequential behaviour is often compositional and organised across multiple time scales: a set of individual elements developing on short time scales (motifs) are combined to form longer functional sequences (syntax). Such organisation leads to a natural hierarchy that can be used advantageously for learning, since the motifs and the syntax can be acquired independently. Despite mounting experimental evidence for hierarchical structures in neuroscience, models for temporal learning based on neuronal networks have mostly focused on serial methods. Here, we introduce a network model of spiking neurons with a hierarchical organisation aimed at sequence learning on multiple time scales. Using biophysically motivated neuron dynamics and local plasticity rules, the model can learn motifs and syntax independently. Furthermore, the model can relearn sequences efficiently and store multiple sequences. Compared to serial learning, the hierarchical model displays faster learning, more flexible relearning, increased capacity, and higher robustness to perturbations. The hierarchical model redistributes the variability: it achieves high motif fidelity at the cost of higher variability in the between-motif timings.
Liu Z, Barahona M, 2021, Similarity Measure for Sparse Time Course Data Based on Gaussian Processes
We propose a similarity measure for sparsely sampled time course data in the form of a loglikelihood ratio of Gaussian processes (GP). The proposed GP similarity is similar to a Bayes factor and provides enhanced robustness to noise in sparse time series, such as those found in various biological settings, e.g., gene transcriptomics. We show that the GP measure is equivalent to the Euclidean distance when the noise variance in the GP is negligible compared to the noise variance of the signal. Our numerical experiments on both synthetic and real data show improved performance of the GP similarity when used in conjunction with two distance-based clustering methods.
Kuntz Nussio J, Thomas P, Stan G, et al., 2021, Approximations of countably-infinite linear programs over bounded measure spaces, SIAM Journal on Optimization, Vol: 31, Pages: 604-625, ISSN: 1052-6234
We study a class of countably-infinite-dimensional linear programs (CILPs) whose feasible sets are bounded subsets of appropriately defined spaces of measures. The optimal value, optimal points, and minimal points of these CILPs can be approximated by solving finite-dimensional linear programs. We show how to construct finite-dimensional programs that lead to approximations with easy-to-evaluate error bounds, and we prove that the errors converge to zero as the size of the finite-dimensional programs approaches that of the original problem. We discuss the use of our methods in the computation of the stationary distributions, occupation measures, and exit distributions of Markov chains.
Peach R, Greenbury S, Johnston I, et al., 2021, Understanding learner behaviour in online courses with Bayesian modelling and time series characterisation, Scientific Reports, Vol: 11, ISSN: 2045-2322
The intrinsic temporality of learning demands the adoption of methodologies capable of exploiting time-series information. In this study we leverage the sequence data framework and show how data-driven analysis of temporal sequences of task completion in online courses can be used to characterise personal and group learners’ behaviors, and to identify critical tasks and course sessions in a given course design. We also introduce a recently developed probabilistic Bayesian model to learn sequential behaviours of students and predict student performance. The application of our data-driven sequence-based analyses to data from learners undertaking an on-line Business Management course reveals distinct behaviors within the cohort of learners, identifying learners or groups of learners that deviate from the nominal order expected in the course. Using course grades a posteriori, we explore differences in behavior between high and low performing learners. We find that high performing learners follow the progression between weekly sessions more regularly than low performing learners, yet within each weekly session high performing learners are less tied to the nominal task order. We then model the sequences of high and low performance students using the probabilistic Bayesian model and show that we can learn engagement behaviors associated with performance. We also show that the data sequence framework can be used for task-centric analysis; we identify critical junctures and differences among types of tasks within the course design. We find that non-rote learning tasks, such as interactive tasks or discussion posts, are correlated with higher performance. We discuss the application of such analytical techniques as an aid to course design, intervention, and student supervision.
Dusad V, Thiel D, Barahona M, et al., 2021, Opportunities at the interface of network science and metabolic modelling, Frontiers in Bioengineering and Biotechnology, Vol: 8, ISSN: 2296-4185
Metabolism plays a central role in cell physiology because it provides the molecular machinery for growth. At the genome-scale, metabolism is made up of thousands of reactions interacting with one another. Untangling this complexity is key to understand how cells respond to genetic, environmental, or therapeutic perturbations. Here we discuss the roles of two complementary strategies for the analysis of genome-scale metabolic models: Flux Balance Analysis (FBA) and network science. While FBA estimates metabolic flux on the basis of an optimization principle, network approaches reveal emergent properties of the global metabolic connectivity. We highlight how the integration of both approaches promises to deliver insights on the structure and function of metabolic systems with wide-ranging implications in discovery science, precision medicine and industrial biotechnology.
Altuncu T, Yaliraki S, Barahona M, 2021, Graph-based topic extraction from vector embeddings of text documents: application to a corpus of news articles, Complex Networks & Their Applications IX, Editors: Benito, Cherifi, Cherifi, Moro, Rocha, Sales-Pardo, Publisher: Springer International Publishing, Pages: 154-166, ISBN: 978-3-030-65351-4
Production of news content is growing at an astonishing rate. To help manage and monitor the sheer amount of text, there is an increasing need to develop efficient methods that can provide insights into emerging content areas, and stratify unstructured corpora of text into ‘topics’ that stem intrinsically from content similarity. Here we present an unsupervised framework that brings together powerful vector embeddings from natural language processing with tools from multiscale graph partitioning that can reveal natural partitions at different resolutions without making a priori assumptions about the number of clusters in the corpus. We show the advantages of graph-based clustering through end-to-end comparisons with other popular clustering and topic modelling methods, and also evaluate different text vector embeddings, from classic Bag-of-Words to Doc2Vec to the recent transformer-based model BERT. This comparative work is showcased through an analysis of a corpus of US news coverage during the presidential election year of 2016.
Schreglmann SR, Wang D, Peach RL, et al., 2021, Non-invasive suppression of essential tremor via phase-locked disruption of its temporal coherence, Nature Communications, Vol: 12, ISSN: 2041-1723
Qian Y, Expert P, Rieu T, et al., 2021, Quantifying the alignment of graph and features in deep learning, IEEE Transactions on Neural Networks and Learning Systems, Pages: 1-10, ISSN: 1045-9227
We show that the classification performance of graph convolutional networks (GCNs) is related to the alignment between features, graph, and ground truth, which we quantify using a subspace alignment measure (SAM) corresponding to the Frobenius norm of the matrix of pairwise chordal distances between three subspaces associated with features, graph, and ground truth. The proposed measure is based on the principal angles between subspaces and has both spectral and geometrical interpretations. We showcase the relationship between the SAM and the classification performance through the study of limiting cases of GCNs and systematic randomizations of both features and graph structure applied to a constructive example and several examples of citation networks of different origins. The analysis also reveals the relative importance of the graph and features for classification purposes.
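The quantity described in the abstract above (the Frobenius norm of the matrix of pairwise chordal distances between three subspaces, computed from principal angles) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the abstract does not say how the subspaces are extracted from features, graph, and ground truth, so each input here is simply assumed to be a matrix with full column rank whose columns span the relevant subspace, and the function names are hypothetical.

```python
import numpy as np


def orthonormal_basis(matrix):
    """Orthonormal basis of the column space (assumes full column rank)."""
    q, _ = np.linalg.qr(matrix)
    return q


def chordal_distance(a, b):
    """Chordal distance between the column spaces of a and b.

    Computed as sqrt(sum_i sin^2(theta_i)), where theta_i are the
    principal angles between the two subspaces.
    """
    qa, qb = orthonormal_basis(a), orthonormal_basis(b)
    # Singular values of qa^T qb are the cosines of the principal angles.
    cosines = np.linalg.svd(qa.T @ qb, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)
    return float(np.sqrt(np.sum(1.0 - cosines**2)))


def subspace_alignment_measure(features, graph, truth):
    """Frobenius norm of the 3x3 matrix of pairwise chordal distances."""
    subspaces = [features, graph, truth]
    distances = np.array(
        [[chordal_distance(x, y) for y in subspaces] for x in subspaces]
    )
    return float(np.linalg.norm(distances, "fro"))
```

With this definition, a smaller value indicates that the three subspaces are closer to one another; the value is zero when all three coincide.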
Price JR, Mookerjee S, Dyakova E, et al., 2021, Development and delivery of a real-time hospital-onset COVID-19 surveillance system using network analysis, Clinical Infectious Diseases, Vol: 72, Pages: 82-89, ISSN: 1058-4838
Background: Understanding nosocomial acquisition, outbreaks and transmission chains in real-time will be fundamental to ensuring infection prevention measures are effective in controlling COVID-19 in healthcare. We report the design and implementation of a hospital-onset COVID-19 infection (HOCI) surveillance system for an acute healthcare setting to target prevention interventions. Methods: The study took place in a large teaching hospital group in London, UK. All patients tested for SARS-CoV-2 between 4th March and 14th April 2020 were included. Utilising data routinely collected through electronic healthcare systems we developed a novel surveillance system for determining and reporting HOCI incidence and providing real-time network analysis. We provided daily reports on incidence and trends over time to support HOCI investigation, and generated geo-temporal reports using network analysis to interrogate admission pathways for common epidemiological links to infer transmission chains. By working with stakeholders the reports were co-designed for end users. Results: Real-time surveillance reports revealed: changing rates of HOCI throughout the course of the COVID-19 epidemic; key wards fuelling probable transmission events; HOCIs over-represented in particular specialities managing high-risk patients; the importance of integrating analysis of individual prior pathways; and the value of co-design in producing data visualisation. Our surveillance system can effectively support national surveillance. Conclusions: Through early analysis of the novel surveillance system we have provided a description of HOCI rates and trends over time using real-time shifting denominator data. We demonstrate the importance of including the analysis of patient pathways and networks in characterising risk of transmission and targeting infection control interventions.
Kuntz J, Thomas P, Stan G-B, et al., 2021, Stationary Distributions of Continuous-Time Markov Chains: A Review of Theory and Truncation-Based Approximations, SIAM Review, Vol: 63, Pages: 3-64, ISSN: 0036-1445
Strömich L, Wu N, Barahona M, et al., 2020, Allosteric hotspots in the main protease of SARS-CoV-2
Inhibiting the main protease of SARS-CoV-2 is of great interest in tackling the COVID-19 pandemic caused by the virus. Most efforts have been centred on inhibiting the binding site of the enzyme. However, considering allosteric sites, distant from the active or orthosteric site, broadens the search space for drug candidates and confers the advantages of allosteric drug targeting. Here, we report the allosteric communication pathways in the main protease dimer by using two novel fully atomistic graph theoretical methods: bond-to-bond propensity analysis, which has been previously successful in identifying allosteric sites without a priori knowledge in benchmark data sets, and Markov transient analysis, which has previously aided in finding novel drug targets in catalytic protein families. We further score the highest-ranking sites against random sites at similar distances through statistical bootstrapping and identify four statistically significant putative allosteric sites as good candidates for alternative drug targeting.
Clarke J, Murray A, Markar S, et al., 2020, A new geographic model of care to manage the post-COVID-19 elective surgery aftershock in England: a retrospective observational study, BMJ Open, Vol: 10, Pages: 1-9, ISSN: 2044-6055
Objectives: The suspension of elective surgery during the COVID pandemic is unprecedented and has resulted in record volumes of patients waiting for operations. Novel approaches that maximise capacity and efficiency of surgical care are urgently required. This study applies Markov Multiscale Community Detection (MMCD), an unsupervised graph-based clustering framework, to identify new surgical care models based on pooled waiting lists delivered across an expanded network of surgical providers. Design: Retrospective observational study using Hospital Episode Statistics. Setting: Public and private hospitals providing surgical care to National Health Service (NHS) patients in England. Participants: All adult patients resident in England undergoing NHS-funded planned surgical procedures between 1st April 2017 and 31st March 2018. Main outcome measures: The identification of the most common planned surgical procedures in England (High Volume Procedures – HVP) and proportion of low, medium and high-risk patients undergoing each HVP. The mapping of hospitals providing surgical care onto optimised groupings based on patient usage data. Results: A total of 7,811,891 planned operations were identified in 4,284,925 adults during the one-year period of our study. The 28 most common surgical procedures accounted for a combined 3,907,474 operations (50.0% of the total). 2,412,613 (61.7%) of these most common procedures involved ‘low risk’ patients. Patients travelled an average of 11.3 km for these procedures. Based on the data, MMCD partitioned England into 45, 16 and 7 mutually exclusive and collectively exhaustive natural surgical communities of increasing coarseness. The coarser partitions into 16 and 7 surgical communities were shown to be associated with balanced supply and demand for surgical care within communities. Conclusions: Pooled waiting lists for low risk elective procedures and patients across integrated, expanded natural surgical community networks have the pot
Clarke J, Beaney T, Majeed A, et al., 2020, Identifying Naturally Occurring Communities of Primary Care Providers in the English National Health Service in London, AcademyHealth Annual Research Meeting (ARM), Publisher: WILEY, Pages: 107-108, ISSN: 0017-9124
Clarke J, Beaney T, Majeed A, et al., 2020, Identifying naturally occurring communities of primary care providers in the English National Health Service in London, BMJ Open, Vol: 10, Pages: 1-7, ISSN: 2044-6055
Objectives - Primary Care Networks (PCNs) are a new organisational hierarchy with wide-ranging responsibilities introduced in the National Health Service (NHS) Long Term Plan. The vision is that they represent ‘natural’ communities of general practices (GP practices) working together at scale and covering a geography that makes sense to practices, other healthcare providers and local communities. Our study aims to identify natural communities of GP practices based on patient registration patterns using Markov Multiscale Community Detection, an unsupervised network-based clustering technique, to create catchments for these communities. Design - Retrospective observational study using Hospital Episode Statistics – patient-level administrative records of inpatient, outpatient and emergency department attendances to hospital. Setting - General practices in the 32 Clinical Commissioning Groups of Greater London. Participants - All adult patients resident in and registered to a GP practice in Greater London that had one or more outpatient encounters at NHS hospital trusts between 1st April 2017 and 31st March 2018. Main outcome measures - The allocation of GP practices in Greater London to PCNs based on the registrations of patients resident in each Lower Super Output Area (LSOA) of Greater London. The population size and coverage of each proposed PCN. Results - 3,428,322 unique patients attended 1,334 GPs in 4,835 LSOAs in Greater London. Our model grouped 1,291 GPs (96.8%) and 4,721 LSOAs (97.6%), into 165 mutually exclusive PCNs. The median PCN list size was 53,490, with a lower quartile of 38,079 patients and an upper quartile of 72,982 patients. A median of 70.1% of patients attended a GP within their allocated PCN, ranging from 44.6% to 91.4%. Conclusions - With PCNs expected to take a role in population health management and with community providers expected to reconfigure around them, it is vital we recognise how PCNs represent their communities. O
Arnaudon A, Peach R, Barahona M, 2020, Scale-dependent measure of network centrality from diffusion dynamics, Physical Review Research, Vol: 2, ISSN: 2643-1564
Classic measures of graph centrality capture distinct aspects of node importance, from the local (e.g., degree) to the global (e.g., closeness). Here we exploit the connection between diffusion and geometry to introduce a multiscale centrality measure. A node is defined to be central if it breaks the metricity of the diffusion as a consequence of the effective boundaries and inhomogeneities in the graph. Our measure is naturally multiscale, as it is computed relative to graph neighbourhoods within the varying time horizon of the diffusion. We find that the centrality of nodes can differ widely at different scales. In particular, our measure correlates with degree (i.e., hubs) at small scales and with closeness (i.e., bridges) at large scales, and also reveals the existence of multi-centric structures in complex networks. By examining centrality across scales, our measure thus provides an evaluation of node importance relative to local and global processes on the network.
Yu YW, Delvenne J-C, Yaliraki SN, et al., 2020, Severability of mesoscale components and local time scales in dynamical networks
A major goal of dynamical systems theory is the search for simplified descriptions of the dynamics of a large number of interacting states. For overwhelmingly complex dynamical systems, the derivation of a reduced description on the entire dynamics at once is computationally infeasible. Other complex systems are so expansive that despite the continual onslaught of new data only partial information is available. To address this challenge, we define and optimise for a local quality function severability for measuring the dynamical coherency of a set of states over time. The theoretical underpinnings of severability lie in our local adaptation of the Simon-Ando-Fisher time-scale separation theorem, which formalises the intuition of local wells in the Markov landscape of a dynamical process, or the separation between a microscopic and a macroscopic dynamics. Finally, we demonstrate the practical relevance of severability by applying it to examples drawn from power networks, image segmentation, social networks, metabolic networks, and word association.
Beaney T, Clarke J, Barahona M, et al., 2020, A primary care network analysis: natural communities of general practices in London, Publisher: Royal College of General Practitioners, ISSN: 0960-1643
BACKGROUND: Primary care networks (PCNs) are a new organisational hierarchy introduced in the NHS Long Term Plan with wide-ranging responsibilities. The vision is that they represent 'natural' communities of general practices with boundaries that make sense to practices, other healthcare providers, and local communities. AIM: Our study aims to identify natural communities of general practices based on patient registration patterns, using network analysis methods and unsupervised clustering to create catchments for these communities. METHOD: Patients resident in and attending GP practices in London were identified from Hospital Episode Statistics from 2017 to 2018. We used a series of novel methods for unsupervised graph clustering. A cosine similarity matrix was constructed representing the similarity between each pair of general practices, based on registration of patients in each Lower Super Output Area (LSOA). Unsupervised graph partitioning using Markov Multiscale Community Detection was conducted to identify communities of general practices. Catchments were assigned to each PCN based on the majority attendance from an LSOA. RESULTS: In total 3 428 322 unique patients attended 1334 general practices across 4835 LSOAs in London. The model grouped 1291 general practices (96.8%) and 4721 LSOAs (97.6%), into 165 mutually exclusive PCNs. The median PCN list size was 53 490 and a median of 70.1% of patients attended a general practice within their allocated PCN, ranging from 44.6% to 91.4%. CONCLUSION: With PCNs expected to take a role in population health management and with community providers expected to reconfigure around them, it is vital we recognise how PCNs represent their communities. This method may be used by policymakers to understand the populations and geography shared between networks.
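As a rough illustration of two steps named in the abstract above, the sketch below builds a cosine similarity matrix between practices from per-LSOA patient counts and then assigns each LSOA to the PCN that receives the largest share of its patients. It is not the authors' code: the toy counts, the function names, and the `practice_to_pcn` labels are hypothetical, and the community detection step itself (Markov Multiscale Community Detection) is not implemented here, so the practice-to-PCN grouping is simply supplied by hand.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(counts):
    """Pairwise cosine similarity between practices (rows of `counts`)."""
    n = len(counts)
    return [[cosine_similarity(counts[i], counts[j]) for j in range(n)]
            for i in range(n)]


def assign_catchments(counts, practice_to_pcn):
    """For each LSOA (column), return the PCN receiving the most patients.

    Ties are broken by first occurrence, which is an assumption of this sketch.
    """
    n_lsoas = len(counts[0])
    catchments = []
    for lsoa in range(n_lsoas):
        totals = {}
        for practice, row in enumerate(counts):
            pcn = practice_to_pcn[practice]
            totals[pcn] = totals.get(pcn, 0) + row[lsoa]
        catchments.append(max(totals, key=totals.get))
    return catchments


# Hypothetical toy data: rows are practices, columns are LSOAs.
counts = [
    [40, 10, 0],
    [35, 15, 0],
    [0, 5, 50],
]
# Hypothetical grouping, standing in for the output of community detection.
practice_to_pcn = ["A", "A", "B"]

sim = similarity_matrix(counts)
catchments = assign_catchments(counts, practice_to_pcn)
```

In this toy example the first two practices draw patients from the same LSOAs and so score a much higher similarity with each other than with the third; a graph clustering method would then operate on `sim` as a weighted adjacency matrix.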
Lamprinakou S, McCoy E, Barahona M, et al., 2020, BART-based inference for Poisson processes
The effectiveness of Bayesian Additive Regression Trees (BART) has been demonstrated in a variety of contexts including nonparametric regression and classification. Here we introduce a BART scheme for estimating the intensity of inhomogeneous Poisson Processes. Poisson intensity estimation is a vital task in various applications including medical imaging, astrophysics and network traffic analysis. Our approach enables full posterior inference of the intensity in a nonparametric regression setting. We demonstrate the performance of our scheme through simulation studies on synthetic and real datasets in one and two dimensions, and compare our approach to alternative approaches.
Laumann F, Kügelgen JV, Barahona M, 2020, Non-linear interlinkages and key objectives amongst the Paris Agreement and the Sustainable Development Goals
The United Nations' ambitions to combat climate change and prosper human development are manifested in the Paris Agreement and the Sustainable Development Goals (SDGs), respectively. These are inherently inter-linked as progress towards some of these objectives may accelerate or hinder progress towards others. We investigate how these two agendas influence each other by defining networks of 18 nodes, consisting of the 17 SDGs and climate change, for various groupings of countries. We compute a non-linear measure of conditional dependence, the partial distance correlation, given any subset of the remaining 16 variables. These correlations are treated as weights on edges, and weighted eigenvector centralities are calculated to determine the most important nodes. We find that SDG 6, clean water and sanitation, and SDG 4, quality education, are most central across nearly all groupings of countries. In developing regions, SDG 17, partnerships for the goals, is strongly connected to the progress of other objectives in the two agendas whilst, somewhat surprisingly, SDG 8, decent work and economic growth, is not as important in terms of eigenvector centrality.
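The weighted eigenvector centrality mentioned in the abstract above can be sketched with a plain power iteration on a symmetric weight matrix. This is a minimal illustration under stated assumptions, not the authors' implementation: the four-node weight matrix is made up (it does not contain partial distance correlations from the study), the function name is hypothetical, and the iteration uses the shifted matrix W + I, a common device that leaves the eigenvectors unchanged while avoiding oscillation on bipartite-like graphs.

```python
def eigenvector_centrality(weights, max_iterations=1000, tolerance=1e-12):
    """Leading eigenvector of a non-negative symmetric weight matrix.

    Uses power iteration on W + I and normalises the scores to sum to 1.
    Assumes the graph is connected; raises ValueError if all weights are zero.
    """
    n = len(weights)
    if all(w == 0 for row in weights for w in row):
        raise ValueError("weight matrix has no non-zero entries")
    scores = [1.0 / n] * n
    for _ in range(max_iterations):
        # One step of (W + I) applied to the current scores.
        updated = [scores[i] + sum(weights[i][j] * scores[j] for j in range(n))
                   for i in range(n)]
        total = sum(updated)
        updated = [value / total for value in updated]
        change = max(abs(a - b) for a, b in zip(updated, scores))
        scores = updated
        if change < tolerance:
            break
    return scores


# Hypothetical edge weights between four nodes (symmetric, zero diagonal).
weights = [
    [0.0, 0.8, 0.6, 0.1],
    [0.8, 0.0, 0.5, 0.2],
    [0.6, 0.5, 0.0, 0.1],
    [0.1, 0.2, 0.1, 0.0],
]

centrality = eigenvector_centrality(weights)
# Node indices ordered from most to least central.
ranking = sorted(range(len(centrality)), key=lambda i: centrality[i], reverse=True)
```

A node scores highly when it is strongly tied to other high-scoring nodes, which is why a weakly connected node such as the last one in this toy matrix ends up at the bottom of `ranking`.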