Most of the members of this group are from the Statistics Section and Biomaths research group of the Department of Mathematics. Below you can find a list of research areas that members of this group are currently working on and/or would like to work on by applying their developed mathematical and statistical methods.

Research areas

Research areas


Search or filter publications

Filter by type:

Filter by publication type

Filter by year:



  • Showing results for:
  • Reset all filters

Search results

  • Journal article
    Myall AC, Peach RL, Weiße AY, Davies F, Mookerjee S, Holmes A, Barahona Met al., 2021,

    Network memory in the movement of hospital patients carrying drug-resistant bacteria

    , Applied Network Science, Vol: 6, ISSN: 2364-8228

    Hospitals constitute highly interconnected systems that bring into contact anabundance of infectious pathogens and susceptible individuals, thus makinginfection outbreaks both common and challenging. In recent years, there hasbeen a sharp incidence of antimicrobial-resistance amongsthealthcare-associated infections, a situation now considered endemic in manycountries. Here we present network-based analyses of a data set capturing themovement of patients harbouring drug-resistant bacteria across three largeLondon hospitals. We show that there are substantial memory effects in themovement of hospital patients colonised with drug-resistant bacteria. Suchmemory effects break first-order Markovian transitive assumptions andsubstantially alter the conclusions from the analysis, specifically on noderankings and the evolution of diffusive processes. We capture variable lengthmemory effects by constructing a lumped-state memory network, which we then useto identify overlapping communities of wards. We find that these communities ofwards display a quasi-hierarchical structure at different levels of granularitywhich is consistent with different aspects of patient flows related to hospitallocations and medical specialties.

  • Journal article
    Saavedra-Garcia P, Roman-Trufero M, Al-Sadah HA, Blighe K, Lopez-Jimenez E, Christoforou M, Penfold L, Capece D, Xiong X, Miao Y, Parzych K, Caputo V, Siskos AP, Encheva V, Liu Z, Thiel D, Kaiser MF, Piazza P, Chaidos A, Karadimitris A, Franzoso G, Snijder AP, Keun HC, Oyarzún DA, Barahona M, Auner Het al., 2021,

    Systems level profiling of chemotherapy-induced stress resolution in cancer cells reveals druggable trade-offs

    , Proceedings of the National Academy of Sciences of USA, Vol: 118, ISSN: 0027-8424

    Cancer cells can survive chemotherapy-induced stress, but how they recover from it is not known.Using a temporal multiomics approach, we delineate the global mechanisms of proteotoxic stressresolution in multiple myeloma cells recovering from proteasome inhibition. Our observations definelayered and protracted programmes for stress resolution that encompass extensive changes acrossthe transcriptome, proteome, and metabolome. Cellular recovery from proteasome inhibitioninvolved protracted and dynamic changes of glucose and lipid metabolism and suppression ofmitochondrial function. We demonstrate that recovering cells are more vulnerable to specific insultsthan acutely stressed cells and identify the general control nonderepressable 2 (GCN2)-driven cellularresponse to amino acid scarcity as a key recovery-associated vulnerability. Using a transcriptomeanalysis pipeline, we further show that GCN2 is also a stress-independent bona fide target intranscriptional signature-defined subsets of solid cancers that share molecular characteristics. Thus,identifying cellular trade-offs tied to the resolution of chemotherapy-induced stress in tumour cellsmay reveal new therapeutic targets and routes for cancer therapy optimisation.

  • Journal article
    Battey HS, 2019,

    On sparsity scales and covariance matrix transformations

    , Biometrika, Vol: 106, Pages: 605-617, ISSN: 0006-3444

    We develop a theory of covariance and concentration matrix estimation on any given or estimated sparsity scale when the matrix dimension is larger than the sample size. Nonstandard sparsity scales are justified when such matrices are nuisance parameters, distinct from interest parameters, which should always have a direct subject-matter interpretation. The matrix logarithmic and inverse scales are studied as special cases, with the corollary that a constrained optimization-based approach is unnecessary for estimating a sparse concentration matrix. It is shown through simulations that for large unstructured covariance matrices, there can be appreciable advantages to estimating a sparse approximation to the log-transformed covariance matrix and converting the conclusions back to the scale of interest.

  • Journal article
    Clarke JM, Warren LR, Arora S, Barahona M, Darzi AW, Clarke JM, Warren LR, Arora S, Barahona M, Darzi AW, Clarke J, Warren L, Barahona M, Darzi Aet al., 2018,

    Guiding interoperable electronic health records through patient-sharing networks.

    , NPJ digital medicine, Vol: 1, Pages: 65-65, ISSN: 2398-6352

    Effective sharing of clinical information between care providers is a critical component of a safe, efficient health system. National data-sharing systems may be costly, politically contentious and do not reflect local patterns of care delivery. This study examines hospital attendances in England from 2013 to 2015 to identify instances of patient sharing between hospitals. Of 19.6 million patients receiving care from 155 hospital care providers, 130 million presentations were identified. On 14.7 million occasions (12%), patients attended a different hospital to the one they attended on their previous interaction. A network of hospitals was constructed based on the frequency of patient sharing between hospitals which was partitioned using the Louvain algorithm into ten distinct data-sharing communities, improving the continuity of data sharing in such instances from 0 to 65-95%. Locally implemented data-sharing communities of hospitals may achieve effective accessibility of clinical information without a large-scale national interoperable information system.

  • Journal article
    Battey HS, Cox DR, 2018,

    Large numbers of explanatory variables: a probabilistic assessment

    , Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, Vol: 474
  • Journal article
    Battey HS, Fan J, Liu H, Lu J, Zhu Zet al., 2018,

    Distributed testing and estimation in sparse high dimensional models

    , Annals of Statistics, Vol: 46, Pages: 1352-1382, ISSN: 0090-5364

    This paper studies hypothesis testing and parameter estimation in the context of the divide-and-conquer algorithm. In a unified likelihood-based framework, we propose new test statistics and point estimators obtained by aggregating various statistics from k subsamples of size n/k, where n is the sample size. In both low dimensional and sparse high dimensional settings, we address the important question of how large k can be, as n grows large, such that the loss of efficiency due to the divide-and-conquer algorithm is negligible. In other words, the resulting estimators have the same inferential efficiencies and estimation rates as an oracle with access to the full sample. Thorough numerical results are provided to back up the theory.

  • Journal article
    Avella M, Battey HS, Fan J, Li Qet al., 2018,

    Robust estimation of high-dimensional covariance and precision matrices

    , Biometrika, Vol: 105, Pages: 271-284, ISSN: 0006-3444

    High-dimensional data are often most plausibly generated from distributions with complex structure and leptokurtosis in some or all components. Covariance and precision matrices provide a useful summary of such structure, yet the performance of popular matrix estimators typically hinges upon a sub-Gaussianity assumption. This paper presents robust matrix estimators whose performance is guaranteed for a much richer class of distributions. The proposed estimators, under a bounded fourth moment assumption, achieve the same minimax convergence rates as do existing methods under a sub-Gaussianity assumption. Consistency of the proposed estimators is also established under the weak assumption of bounded2+ϵmoments forϵ∈(0,2). The associated convergence rates depend onϵ.

  • Journal article
    Aryaman J, Johnston IG, Jones NS, 2017,

    Mitochondrial DNA Density Homeostasis Accounts for a Threshold Effect in a Cybrid Model of a Human Mitochondrial Disease

    , Biochemical Journal, Vol: 474, Pages: 4019-4034, ISSN: 1470-8728

    Mitochondrial dysfunction is involved in a wide array of devastating diseases, but the heterogeneity and complexity of the symptoms of these diseases challenges theoretical understanding of their causation. With the explosion of omics data, we have the unprecedented opportunity to gain deep understanding of the biochemical mechanisms of mitochondrial dysfunction. This goal raises the outstanding need to make these complex datasets interpretable. Quantitative modelling allows us to translate such datasets into intuition and suggest rational biomedical treatments. Taking an interdisciplinary approach, we use a recently published large-scale dataset and develop a descriptive and predictive mathematical model of progressive increase in mutant load of the MELAS 3243A>G mtDNA mutation. The experimentally observed behaviour is surprisingly rich, but we find that our simple, biophysically motivated model intuitively accounts for this heterogeneity and yields a wealth of biological predictions. Our findings suggest that cells attempt to maintain wild-type mtDNA density through cell volume reduction, and thus power demand reduction, until a minimum cell volume is reached. Thereafter, cells toggle from demand reduction to supply increase, up-regulating energy production pathways. Our analysis provides further evidence for the physiological significance of mtDNA density and emphasizes the need for performing single-cell volume measurements jointly with mtDNA quantification. We propose novel experiments to verify the hypotheses made here to further develop our understanding of the threshold effect and connect with rational choices for mtDNA disease therapies.

  • Journal article
    Fulcher B, Jones NS, 2017,

    hctsa: A computational framework for automated timeseriesphenotyping using massive feature extraction

    , Cell Systems, Vol: 5, Pages: 527-531.e3, ISSN: 2405-4712

    Phenotype measurements frequently take the form of time series, but we currently lack a systematic method for relating these complex data streams to scientifically meaningful outcomes, such as relating the movement dynamics of organisms to their genotype or measurements of brain dynamics of a patient to their disease diagnosis. Previous work addressed this problem by comparing implementations of thousands of diverse scientific time-series analysis methods in an approach termed highly comparative time-series analysis. Here, we introduce hctsa, a software tool for applying this methodological approach to data. hctsa includes an architecture for computing over 7,700 time-series features and a suite of analysis and visualization algorithms to automatically select useful and interpretable time-series features for a given application. Using exemplar applications to high-throughput phenotyping experiments, we show how hctsa allows researchers to leverage decades of time-series research to quantify and understand informative structure in time-series data.

  • Journal article
    Cox DR, Battey HS, 2017,

    Large numbers of explanatory variables, a semi-descriptive analysis

    , Proceedings of the National Academy of Sciences of USA, Vol: 114, Pages: 8592-8595, ISSN: 0027-8424

    Data with a relatively small number of study individuals and a very large number of potential explanatory features arise particularly, but by no means only, in genomics. A powerful method of analysis, the lasso [Tibshirani R (1996) J Roy Stat Soc B 58:267–288], takes account of an assumed sparsity of effects, that is, that most of the features are nugatory. Standard criteria for model fitting, such as the method of least squares, are modified by imposing a penalty for each explanatory variable used. There results a single model, leaving open the possibility that other sparse choices of explanatory features fit virtually equally well. The method suggested in this paper aims to specify simple models that are essentially equally effective, leaving detailed interpretation to the specifics of the particular study. The method hinges on the ability to make initially a very large number of separate analyses, allowing each explanatory feature to be assessed in combination with many other such features. Further stages allow the assessment of more complex patterns such as nonlinear and interactive dependences. The method has formal similarities to so-called partially balanced incomplete block designs introduced 80 years ago [Yates F (1936) J Agric Sci 26:424–455] for the study of large-scale plant breeding trials. The emphasis in this paper is strongly on exploratory analysis; the more formal statistical properties obtained under idealized assumptions will be reported separately.

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.

Request URL: Request URI: /respub/WEB-INF/jsp/search-t4-html.jsp Query String: id=959&limit=10&respub-action=search.html Current Millis: 1632857012057 Current Time: Tue Sep 28 20:23:32 BST 2021