Research & Publications
Most members of this group belong to the Statistics Section and the Biomaths research group of the Department of Mathematics. The research areas listed below are those in which members currently work, or would like to work, by applying the mathematical and statistical methods they have developed.
Statistical genomics and Epidemiology
Precision and Stratified Medicine
Analysis of clinical trials, observational and longitudinal studies
Infectious Disease Epidemiology
Journal article: Beaney T, Clarke J, Woodcock T, et al., 2021,
Patterns of healthcare utilisation in children and young people: a retrospective cohort study using routinely collected healthcare data in Northwest London, BMJ Open, Vol: 11, Pages: 1-14, ISSN: 2044-6055
Objectives: With a growing role for health services in managing population health, there is a need for early identification of populations with high need. Segmentation approaches partition the population based on demographics, long-term conditions (LTCs) or healthcare utilisation but have mostly been applied to adults. Our study uses segmentation methods to distinguish patterns of healthcare utilisation in children and young people (CYP) and to explore predictors of segment membership. Design: Retrospective cohort study. Setting: Routinely collected primary and secondary healthcare data in Northwest London from the Discover database. Participants: 378,309 CYP aged 0-15 years registered to a general practice in Northwest London with one full year of follow-up. Primary and secondary outcome measures: Assignment of each participant to a segment defined by seven healthcare variables representing primary and secondary care attendances, and description of utilisation patterns by segment. Predictors of segment membership described by age, sex, ethnicity, deprivation and LTCs. Results: Participants were grouped into six segments based on healthcare utilisation. Three segments predominantly used primary care; two moderate utilisation segments differed in use of emergency or elective care, and a high utilisation segment, representing 16,632 (4.4%) children, accounted for the highest mean presentations across all service types. The two smallest segments, representing 13.3% of the population, accounted for 62.5% of total costs. Younger age, residence in areas of higher deprivation, and presence of one or more LTCs were associated with membership of higher utilisation segments, but 75.0% of those in the highest utilisation segment had no LTC. Conclusions: This article identifies six segments of healthcare utilisation in CYP and predictors of segment membership. Demographics and LTCs may not explain utilisation patterns as strongly as in adults which may limit the use of routine data in predicting ut
Journal article: Liu Z, Peach R, Lawrance E, et al., 2021,
Listening to mental health crisis needs at scale: using Natural Language Processing to understand and evaluate a mental health crisis text messaging service, Frontiers in Digital Health, Vol: 3, Pages: 1-14, ISSN: 2673-253X
The current mental health crisis is a growing public health issue requiring a large-scale response that cannot be met with traditional services alone. Digital support tools are proliferating, yet most are not systematically evaluated, and we know little about their users and their needs. Shout is a free mental health text messaging service run by the charity Mental Health Innovations, which provides support for individuals in the UK experiencing mental or emotional distress and seeking help. Here we study a large data set of anonymised text message conversations and post-conversation surveys compiled through Shout. This data provides an opportunity to hear at scale from those experiencing distress; to better understand mental health needs for people not using traditional mental health services; and to evaluate the impact of a novel form of crisis support. We use natural language processing (NLP) to assess the adherence of volunteers to conversation techniques and formats, and to gain insight into demographic user groups and their behavioural expressions of distress. Our textual analyses achieve accurate classification of conversation stages (weighted accuracy = 88%), behaviours (1-hamming loss = 95%) and texter demographics (weighted accuracy = 96%), exemplifying how the application of NLP to frontline mental health data sets can aid with post-hoc analysis and evaluation of quality of service provision in digital mental health services.
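The "1-hamming loss" score quoted above for the behaviour labels can be illustrated with a minimal sketch. It assumes each conversation carries a fixed-length vector of binary behaviour labels; the label vectors below are hypothetical, not data from the study, and the function is a plain reimplementation of the standard Hamming loss rather than the authors' code.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("label vectors must have equal length")
        total += len(true_row)
        wrong += sum(t != p for t, p in zip(true_row, pred_row))
    if total == 0:
        raise ValueError("no labels to score")
    return wrong / total


# Hypothetical example: 3 conversations, 4 binary behaviour labels each.
y_true = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]]

# One slot out of twelve is wrong, so the score is 11/12.
score = 1 - hamming_loss(y_true, y_pred)
```

Because every label slot counts equally, this score rewards partially correct multi-label predictions, unlike exact-match accuracy, which would mark the first conversation as entirely wrong.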
Conference paper: Liu Z, Barahona M, 2021,
Similarity measure for sparse time course data based on Gaussian processes, Uncertainty in Artificial Intelligence 2021, Publisher: PMLR, Pages: 1332-1341
We propose a similarity measure for sparsely sampled time course data in the form of a log-likelihood ratio of Gaussian processes (GP). The proposed GP similarity is similar to a Bayes factor and provides enhanced robustness to noise in sparse time series, such as those found in various biological settings, e.g., gene transcriptomics. We show that the GP measure is equivalent to the Euclidean distance when the noise variance in the GP is negligible compared to the noise variance of the signal. Our numerical experiments on both synthetic and real data show improved performance of the GP similarity when used in conjunction with two distance-based clustering methods.
Journal article: Myall AC, Peach RL, Weiße AY, et al., 2021,
Network memory in the movement of hospital patients carrying drug-resistant bacteria, Applied Network Science, Vol: 6, ISSN: 2364-8228
Hospitals constitute highly interconnected systems that bring into contact an abundance of infectious pathogens and susceptible individuals, thus making infection outbreaks both common and challenging. In recent years, there has been a sharp incidence of antimicrobial-resistance amongst healthcare-associated infections, a situation now considered endemic in many countries. Here we present network-based analyses of a data set capturing the movement of patients harbouring drug-resistant bacteria across three large London hospitals. We show that there are substantial memory effects in the movement of hospital patients colonised with drug-resistant bacteria. Such memory effects break first-order Markovian transitive assumptions and substantially alter the conclusions from the analysis, specifically on node rankings and the evolution of diffusive processes. We capture variable length memory effects by constructing a lumped-state memory network, which we then use to identify overlapping communities of wards. We find that these communities of wards display a quasi-hierarchical structure at different levels of granularity which is consistent with different aspects of patient flows related to hospital locations and medical specialties.
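What it means for memory effects to break a first-order Markov assumption can be sketched by comparing transition probabilities estimated with one and with two steps of history. This is only an illustration: the ward names and patient paths are hypothetical, and the code does not reproduce the lumped-state memory network construction described in the abstract.

```python
from collections import Counter, defaultdict


def transition_probs(paths, order):
    """Estimate P(next ward | previous `order` wards) from observed paths."""
    counts = defaultdict(Counter)
    for path in paths:
        for i in range(order, len(path)):
            history = tuple(path[i - order:i])
            counts[history][path[i]] += 1
    return {
        history: {ward: n / sum(nexts.values()) for ward, n in nexts.items()}
        for history, nexts in counts.items()
    }


# Hypothetical paths: patients return to the ward they came from.
paths = [
    ["A", "B", "A"],
    ["A", "B", "A"],
    ["C", "B", "C"],
    ["C", "B", "C"],
]

first_order = transition_probs(paths, order=1)
second_order = transition_probs(paths, order=2)

# With one step of history, a patient in B looks equally likely to go to A or C.
# With two steps, the destination is fully determined by where they came from.
```

In this toy data the first-order model assigns 0.5 to each of `A` and `C` after `B`, while the second-order model assigns probability 1 to returning to the originating ward, which is the kind of discrepancy that alters rankings and diffusion on the network.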
Journal article: Saavedra-Garcia P, Roman-Trufero M, Al-Sadah HA, et al., 2021,
Systems level profiling of chemotherapy-induced stress resolution in cancer cells reveals druggable trade-offs, Proceedings of the National Academy of Sciences of USA, Vol: 118, ISSN: 0027-8424
Cancer cells can survive chemotherapy-induced stress, but how they recover from it is not known. Using a temporal multiomics approach, we delineate the global mechanisms of proteotoxic stress resolution in multiple myeloma cells recovering from proteasome inhibition. Our observations define layered and protracted programmes for stress resolution that encompass extensive changes across the transcriptome, proteome, and metabolome. Cellular recovery from proteasome inhibition involved protracted and dynamic changes of glucose and lipid metabolism and suppression of mitochondrial function. We demonstrate that recovering cells are more vulnerable to specific insults than acutely stressed cells and identify the general control nonderepressible 2 (GCN2)-driven cellular response to amino acid scarcity as a key recovery-associated vulnerability. Using a transcriptome analysis pipeline, we further show that GCN2 is also a stress-independent bona fide target in transcriptional signature-defined subsets of solid cancers that share molecular characteristics. Thus, identifying cellular trade-offs tied to the resolution of chemotherapy-induced stress in tumour cells may reveal new therapeutic targets and routes for cancer therapy optimisation.
Journal article: Clarke JM, Warren LR, Arora S, et al., 2018,
Guiding interoperable electronic health records through patient-sharing networks, NPJ digital medicine, Vol: 1, Pages: 65-65, ISSN: 2398-6352
Effective sharing of clinical information between care providers is a critical component of a safe, efficient health system. National data-sharing systems may be costly, politically contentious and do not reflect local patterns of care delivery. This study examines hospital attendances in England from 2013 to 2015 to identify instances of patient sharing between hospitals. Of 19.6 million patients receiving care from 155 hospital care providers, 130 million presentations were identified. On 14.7 million occasions (12%), patients attended a different hospital to the one they attended on their previous interaction. A network of hospitals was constructed based on the frequency of patient sharing between hospitals which was partitioned using the Louvain algorithm into ten distinct data-sharing communities, improving the continuity of data sharing in such instances from 0 to 65-95%. Locally implemented data-sharing communities of hospitals may achieve effective accessibility of clinical information without a large-scale national interoperable information system.
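The first step described above, counting occasions on which a patient attends a different hospital from their previous one and turning those counts into a weighted hospital network, might be sketched as follows. The patient and hospital identifiers are hypothetical, the choice of denominator is an assumption (the abstract does not say how the 12% was normalised), and the Louvain partitioning step is not shown.

```python
from collections import Counter


def patient_sharing(attendances):
    """Build patient-sharing edge weights from ordered attendance histories.

    attendances maps a patient id to that patient's hospitals in
    chronological order. Returns (edge weights, number of switches,
    number of follow-up attendances).
    """
    edges = Counter()
    switches = 0
    follow_ups = 0
    for visits in attendances.values():
        for previous, current in zip(visits, visits[1:]):
            follow_ups += 1
            if previous != current:
                switches += 1
                # Undirected edge: sharing in either direction adds weight.
                edges[frozenset((previous, current))] += 1
    return edges, switches, follow_ups


# Hypothetical attendance histories.
attendances = {
    "p1": ["H1", "H1", "H2"],
    "p2": ["H2", "H1"],
    "p3": ["H3", "H3", "H3"],
}

edges, switches, follow_ups = patient_sharing(attendances)
share_switched = switches / follow_ups
```

The resulting `edges` mapping is the kind of weighted graph that a community-detection method could then partition into data-sharing communities; hospitals such as `H3` that never share patients simply contribute no edges.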
Journal article: Aryaman J, Johnston IG, Jones NS, 2017,
Mitochondrial DNA Density Homeostasis Accounts for a Threshold Effect in a Cybrid Model of a Human Mitochondrial Disease, Biochemical Journal, Vol: 474, Pages: 4019-4034, ISSN: 1470-8728
Mitochondrial dysfunction is involved in a wide array of devastating diseases, but the heterogeneity and complexity of the symptoms of these diseases challenges theoretical understanding of their causation. With the explosion of omics data, we have the unprecedented opportunity to gain deep understanding of the biochemical mechanisms of mitochondrial dysfunction. This goal raises the outstanding need to make these complex datasets interpretable. Quantitative modelling allows us to translate such datasets into intuition and suggest rational biomedical treatments. Taking an interdisciplinary approach, we use a recently published large-scale dataset and develop a descriptive and predictive mathematical model of progressive increase in mutant load of the MELAS 3243A>G mtDNA mutation. The experimentally observed behaviour is surprisingly rich, but we find that our simple, biophysically motivated model intuitively accounts for this heterogeneity and yields a wealth of biological predictions. Our findings suggest that cells attempt to maintain wild-type mtDNA density through cell volume reduction, and thus power demand reduction, until a minimum cell volume is reached. Thereafter, cells toggle from demand reduction to supply increase, up-regulating energy production pathways. Our analysis provides further evidence for the physiological significance of mtDNA density and emphasizes the need for performing single-cell volume measurements jointly with mtDNA quantification. We propose novel experiments to verify the hypotheses made here to further develop our understanding of the threshold effect and connect with rational choices for mtDNA disease therapies.
Journal article: Fulcher B, Jones NS, 2017,
hctsa: A computational framework for automated time-series phenotyping using massive feature extraction, Cell Systems, Vol: 5, Pages: 527-531.e3, ISSN: 2405-4712
Phenotype measurements frequently take the form of time series, but we currently lack a systematic method for relating these complex data streams to scientifically meaningful outcomes, such as relating the movement dynamics of organisms to their genotype or measurements of brain dynamics of a patient to their disease diagnosis. Previous work addressed this problem by comparing implementations of thousands of diverse scientific time-series analysis methods in an approach termed highly comparative time-series analysis. Here, we introduce hctsa, a software tool for applying this methodological approach to data. hctsa includes an architecture for computing over 7,700 time-series features and a suite of analysis and visualization algorithms to automatically select useful and interpretable time-series features for a given application. Using exemplar applications to high-throughput phenotyping experiments, we show how hctsa allows researchers to leverage decades of time-series research to quantify and understand informative structure in time-series data.
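The idea of feature-based time-series analysis, mapping each series to a vector of named summary features that can then be compared across samples, can be sketched in miniature. The three features and the function names below are illustrative assumptions only; they are not the hctsa interface and stand in for the thousands of features the abstract mentions.

```python
from statistics import fmean, pstdev


def lag1_autocorr(series):
    """Lag-1 autocorrelation; returns 0.0 for a constant series."""
    mean = fmean(series)
    denominator = sum((value - mean) ** 2 for value in series)
    if denominator == 0:
        return 0.0
    numerator = sum(
        (a - mean) * (b - mean) for a, b in zip(series, series[1:])
    )
    return numerator / denominator


# Registry of named features; adding a feature means adding one entry.
FEATURES = {
    "mean": fmean,
    "std": pstdev,
    "lag1_autocorr": lag1_autocorr,
}


def extract_features(series):
    """Map one time series to a dictionary of named feature values."""
    return {name: feature(series) for name, feature in FEATURES.items()}


# Hypothetical measurement series.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
features = extract_features(series)
```

Keeping features in a named registry means every series is reduced to the same set of columns, so the resulting table can be passed to any standard classifier or feature-selection procedure.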
Journal article: Griffie J, Shlomovich L, Williamson D, et al., 2017,
3D Bayesian cluster analysis of super-resolution data reveals LAT recruitment to the T cell synapse, Scientific Reports, Vol: 7, ISSN: 2045-2322
Single-molecule localisation microscopy (SMLM) allows the localisation of fluorophores with a precision of 10–30 nm, revealing the cell’s nanoscale architecture at the molecular level. Recently, SMLM has been extended to 3D, providing a unique insight into cellular machinery. Although cluster analysis techniques have been developed for 2D SMLM data sets, few have been applied to 3D. This lack of quantification tools can be explained by the relative novelty of imaging techniques such as interferometric photo-activated localisation microscopy (iPALM). Also, existing methods that could be extended to 3D SMLM are usually subject to user defined analysis parameters, which remains a major drawback. Here, we present a new open source cluster analysis method for 3D SMLM data, free of user definable parameters, relying on a model-based Bayesian approach which takes full account of the individual localisation precisions in all three dimensions. The accuracy and reliability of the method are validated using simulated data sets. This tool is then deployed on novel experimental data as a proof of concept, illustrating the recruitment of LAT to the T-cell immunological synapse in data acquired by iPALM providing ~10 nm isotropic resolution.
Journal article: Aryaman J, Hoitzing H, Burgstaller J, et al., 2017,
Mitochondrial heterogeneity, metabolic scaling and cell death, Bioessays, Vol: 39, ISSN: 1521-1878
Heterogeneity in mitochondrial content has been previously suggested as a major contributor to cellular noise, with multiple studies indicating its direct involvement in biomedically important cellular phenomena. A recently published dataset explored the connection between mitochondrial functionality and cell physiology, where a non-linearity between mitochondrial functionality and cell size was found. Using mathematical models, we suggest that a combination of metabolic scaling and a simple model of cell death may account for these observations. However, our findings also suggest the existence of alternative competing hypotheses, such as a non-linearity between cell death and cell size. While we find that the proposed non-linear coupling between mitochondrial functionality and cell size provides a compelling alternative to previous attempts to link mitochondrial heterogeneity and cell physiology, we emphasise the need to account for alternative causal variables, including cell cycle, size, mitochondrial density and death, in future studies of mitochondrial physiology.