NCD Risk Factor Collaboration NCD-RisC, Iurilli N, 2021, Heterogeneous contributions of change in population distribution of body-mass index to change in obesity and underweight, eLife, Vol: 10, ISSN: 2050-084X
From 1985 to 2016, the prevalence of underweight decreased, and that of obesity and severe obesity increased, in most regions, with significant variation in the magnitude of these changes across regions. We investigated how much change in mean body mass index (BMI) explains changes in the prevalence of underweight, obesity, and severe obesity in different regions using data from 2896 population-based studies with 187 million participants. Changes in the prevalence of underweight and total obesity, and to a lesser extent severe obesity, are largely driven by shifts in the distribution of BMI, with smaller contributions from changes in the shape of the distribution. In East and Southeast Asia and sub-Saharan Africa, the underweight tail of the BMI distribution was left behind as the distribution shifted. There is a need for policies that address all forms of malnutrition by making healthy foods accessible and affordable, while restricting unhealthy foods through fiscal and regulatory restrictions.
Unwin H, Mishra S, Bradley V, et al., 2020, State-level tracking of COVID-19 in the United States, Nature Communications, Vol: 11, Pages: 1-9, ISSN: 2041-1723
As of 1st June 2020, the US Centers for Disease Control and Prevention reported 104,232 confirmed or probable COVID-19-related deaths in the US. This was more than twice the number of deaths reported in the next most severely impacted country. We jointly model the US epidemic at the state-level, using publicly available death data within a Bayesian hierarchical semi-mechanistic framework. For each state, we estimate the number of individuals that have been infected, the number of individuals that are currently infectious and the time-varying reproduction number (the average number of secondary infections caused by an infected person). We use changes in mobility to capture the impact that non-pharmaceutical interventions and other behaviour changes have on the rate of transmission of SARS-CoV-2. We estimate that Rt was only below one in 23 states on 1st June. We also estimate that 3.7% [3.4%-4.0%] of the total population of the US had been infected, with wide variation between states, and approximately 0.01% of the population was infectious. We demonstrate good 3 week model forecasts of deaths with low error and good coverage of our credible intervals.
Kolbeinsson A, Filippi S, Panagakis I, et al., 2020, Accelerated MRI-predicted brain ageing and its associations with cardiometabolic and brain disorders, Scientific Reports, Vol: 10, ISSN: 2045-2322
Brain structure in later life reflects both influences of intrinsic aging and those of lifestyle, environment and disease. We developed a deep neural network model trained on brain MRI scans of healthy people to predict “healthy” brain age. Brain regions most informative for the prediction included the cerebellum, hippocampus, amygdala and insular cortex. We then applied this model to data from an independent group of people not stratified for health. A phenome-wide association analysis of over 1,410 traits in the UK Biobank with differences between the predicted and chronological ages for the second group identified significant associations with over 40 traits including diseases (e.g., type I and type II diabetes), disease risk factors (e.g., increased diastolic blood pressure and body mass index), and poorer cognitive function. These observations highlight relationships between brain and systemic health and have implications for understanding contributions of the latter to late life dementia risk.
Roberts G, Fontanella S, Selby A, et al., 2020, Connectivity patterns between multiple allergen specific IgE antibodies and their association with severe asthma, Journal of Allergy and Clinical Immunology, Vol: 146, Pages: 821-830, ISSN: 0091-6749
BACKGROUND: Allergic sensitization is associated with severe asthma, but assessment of sensitization is not recommended by most guidelines. OBJECTIVE: We hypothesized that patterns of IgE responses to multiple allergenic proteins differ between sensitized participants with mild/moderate and severe asthma. METHODS: IgE to 112 allergenic molecules (components, c-sIgE) was measured using multiplex array among 509 adults and 140 school-age and 131 preschool children with asthma/wheeze from the Unbiased BIOmarkers for the PREDiction of respiratory diseases outcomes cohort, of whom 595 had severe disease. We applied clustering methods to identify co-occurrence patterns of components (component clusters) and patterns of sensitization among participants (sensitization clusters). Network analysis techniques explored the connectivity structure of c-sIgE, and differential network analysis looked for differences in c-sIgE interactions between severe and mild/moderate asthma. RESULTS: Four sensitization clusters were identified, but with no difference between disease severity groups. Similarly, component clusters were not associated with asthma severity. None of the c-sIgE were identified as associates of severe asthma. The key difference between school children and adults with mild/moderate compared with those with severe asthma was in the network of connections between c-sIgE. Participants with severe asthma had higher connectivity among components, but these connections were weaker. The mild/moderate network had fewer connections, but the connections were stronger. Connectivity between components with no structural homology tended to co-occur among participants with severe asthma. Results were independent from the different sample sizes of mild/moderate and severe groups. CONCLUSIONS: The patterns of interactions between IgE to multiple allergenic proteins are predictors of asthma severity among school children and adults with allergic asthma.
Monod M, Blenkinsop A, Xi X, et al., 2020, Report 32: Targeting interventions to age groups that sustain COVID-19 transmission in the United States
Following initial declines, in mid 2020, a resurgence in transmission of novel coronavirus disease (COVID-19) has occurred in the United States and parts of Europe. Despite the wide implementation of non-pharmaceutical interventions, it is still not known how they are impacted by changing contact patterns, age and other demographics. As COVID-19 disease control becomes more localised, understanding the age demographics driving transmission and how these impact the loosening of interventions such as school reopening is crucial. Considering dynamics for the United States, we analyse aggregated, age-specific mobility trends from more than 10 million individuals and link these mechanistically to age-specific COVID-19 mortality data. In contrast to previous approaches, we link mobility to mortality via age specific contact patterns and use this rich relationship to reconstruct accurate transmission dynamics. Contrary to anecdotal evidence, we find little support for age-shifts in contact and transmission dynamics over time. We estimate that, until August, 63.4% [60.9%-65.5%] of SARS-CoV-2 infections in the United States originated from adults aged 20-49, while 1.2% [0.8%-1.8%] originated from children aged 0-9. In areas with continued, community-wide transmission, our transmission model predicts that re-opening kindergartens and elementary schools could facilitate spread and lead to considerable excess COVID-19 attributable deaths over a 90-day period. These findings indicate that targeting interventions to adults aged 20-49 are an important consideration in halting resurgent epidemics, and preventing COVID-19-attributable deaths when kindergartens and elementary schools reopen.
Teymur O, Filippi S, 2020, A Bayesian nonparametric test for conditional independence, Foundations of Data Science, Vol: 2, Pages: 155-172, ISSN: 2639-8001
This article introduces a Bayesian nonparametric method for quantifying the relative evidence in a dataset in favour of the dependence or independence of two variables conditional on a third. The approach uses Pólya tree priors on spaces of conditional probability densities, accounting for uncertainty in the form of the underlying distributions in a nonparametric way. The Bayesian perspective provides an inherently symmetric probability measure of conditional dependence or independence, a feature particularly advantageous in causal discovery and not employed in existing procedures of this type.
Lamprinakou S, McCoy E, Barahona M, et al., 2020, BART-based inference for Poisson processes
The effectiveness of Bayesian Additive Regression Trees (BART) has been demonstrated in a variety of contexts including non parametric regression and classification. Here we introduce a BART scheme for estimating the intensity of inhomogeneous Poisson Processes. Poisson intensity estimation is a vital task in various applications including medical imaging, astrophysics and network traffic analysis. Our approach enables full posterior inference of the intensity in a nonparametric regression setting. We demonstrate the performance of our scheme through simulation studies on synthetic and real datasets in one and two dimensions, and compare our approach to alternative approaches.
Yeung E, McFann S, Marsh L, et al., 2020, Inference of multisite phosphorylation rate constants and their modulation by pathogenic mutations, Current Biology, ISSN: 0960-9822
Biomarkers are essential to determine different phenotypes of childhood asthma, and for the prediction of response to treatments. In young preschool children with asthma, aeroallergen sensitization, and blood eosinophil count of 300/µL or greater may identify those who can benefit from the daily use of inhaled corticosteroids (ICS). We propose that every preschool child who is considered for ICS treatment should have these two features measured as a minimum before a decision is made on the commencement of long-term preventive treatment. In practice, IgE-mediated sensitization should be considered as a quantifiable variable, i.e., we should use the titer of sIgE antibodies or the size of skin prick test response. A number of other blood biomarkers may prove useful (e.g., allergen-specific IgG/IgE antibody ratios amongst sensitized individuals, component-resolved diagnostics which measures sIgE response to a large number of allergenic molecules, assessment of immune responses to viruses, level of serum CC16, etc.), but it remains unclear whether these can be translated into clinically useful tests. Going forward, a more integrated approach which takes into account multiple domains of asthma, from the pattern of symptoms and blood biomarkers to genetic risk and lung function measures, is needed if we are to move toward a stratified approach to asthma management.
Jetka T, Nienałtowski K, Filippi S, et al., 2018, An information-theoretic framework for deciphering pleiotropic and noisy biochemical signaling, Nature Communications, Vol: 9, ISSN: 2041-1723
Many components of signaling pathways are functionally pleiotropic, and signaling responses are marked with substantial cell-to-cell heterogeneity. Therefore, biochemical descriptions of signaling require quantitative support to explain how complex stimuli (inputs) are encoded in distinct activities of pathways effectors (outputs). A unique perspective of information theory cannot be fully utilized due to lack of modeling tools that account for the complexity of biochemical signaling, specifically for multiple inputs and outputs. Here, we develop a modeling framework of information theory that allows for efficient analysis of models with multiple inputs and outputs; accounts for temporal dynamics of signaling; enables analysis of how signals flow through shared network components; and is not restricted by limited variability of responses. The framework allows us to explain how identity and quantity of type I and type III interferon variants could be recognized by cells despite activating the same signaling effectors.
Filippi SL, Muraro D, Parker A, et al., 2018, Chronic TNFα-driven injury delays cell migration to villi in the intestinal epithelium, Journal of the Royal Society Interface, Vol: 15, ISSN: 1742-5662
The intestinal epithelium is a single layer of cells which provides the first line of defence of the intestinal mucosa to bacterial infection. Cohesion of this physical barrier is supported by renewal of epithelial stem cells, residing in invaginations called crypts, and by crypt cell migration onto protrusions called villi; dysregulation of such mechanisms may render the gut susceptible to chronic inflammation. The impact that excessive or misplaced epithelial cell death may have on villus cell migration is currently unknown. We integrated cell-tracking methods with computational models to determine how epithelial homeostasis is affected by acute and chronic TNFα-driven epithelial cell death. Parameter inference reveals that acute inflammatory cell death has a transient effect on epithelial cell dynamics, whereas cell death caused by chronic elevated TNFα causes a delay in the accumulation of labelled cells onto the villus compared to the control. Such a delay may be reproduced by using a cell-based model to simulate the dynamics of each cell in a crypt–villus geometry, showing that a prolonged increase in cell death slows the migration of cells from the crypt to the villus. This investigation highlights which injuries (acute or chronic) may be regenerated and which cause disruption of healthy epithelial homeostasis.
Dony L, Mackerodt J, Ward S, et al., 2018, PEITH(Theta): perfecting experiments with information theory in Python with GPU support, Bioinformatics, Vol: 34, Pages: 1249-1250, ISSN: 1367-4803
Motivation: Different experiments provide differing levels of information about a biological system. This makes it difficult, a priori, to select one of them beyond mere speculation and/or belief, especially when resources are limited. With the increasing diversity of experimental approaches and general advances in quantitative systems biology, methods that inform us about the information content that a given experiment carries about the question we want to answer, become crucial. Results: PEITH(Θ) is a general purpose, Python framework for experimental design in systems biology. PEITH(Θ) uses Bayesian inference and information theory in order to derive which experiments are most informative in order to estimate all model parameters and/or perform model predictions. Availability and implementation: https://github.com/MichaelPHStumpf/Peitho
Filippi S, Holmes C, 2017, A Bayesian nonparametric approach to testing for dependence between random variables, Bayesian Analysis, Vol: 12, Pages: 919-938, ISSN: 1931-6690
Nonparametric and nonlinear measures of statistical dependence between pairs of random variables are important tools in modern data analysis. In particular the emergence of large data sets can now support the relaxation of linearity assumptions implicit in traditional association scores such as correlation. Here we describe a Bayesian nonparametric procedure that leads to a tractable, explicit and analytic quantification of the relative evidence for dependence vs independence. Our approach uses Polya tree priors on the space of probability measures which can then be embedded within a decision theoretic test for dependence. Polya tree priors can accommodate known uncertainty in the form of the underlying sampling distribution and provides an explicit posterior probability measure of both dependence and independence. Well known advantages of having an explicit probability measure include: easy comparison of evidence across different studies; encoding prior information; quantifying changes in dependence across different experimental conditions, and; the integration of results within formal decision analysis.
Smith RCG, Stumpf PS, Ridden SJ, et al., 2017, The problem of measurement in cell biology: a tale of two alleles, European Biophysics Journal with Biophysics Letters, Vol: 46, Pages: S371-S371, ISSN: 0175-7571
Smith RCG, Stumpf PS, Ridden SJ, et al., 2017, Nanog fluctuations in embryonic stem cells highlight the problem of measurement in cell biology, Biophysical Journal, Vol: 112, Pages: 2641-2652, ISSN: 1542-0086
A number of important pluripotency regulators, including the transcription factor Nanog, are observed to fluctuate stochastically in individual embryonic stem cells. By transiently priming cells for commitment to different lineages, these fluctuations are thought to be important to the maintenance of, and exit from, pluripotency. However, because temporal changes in intracellular protein abundances cannot be measured directly in live cells, fluctuations are typically assessed using genetically engineered reporter cell lines that produce a fluorescent signal as a proxy for protein expression. Here, using a combination of mathematical modeling and experiment, we show that there are unforeseen ways in which widely used reporter strategies can systematically disturb the dynamics they are intended to monitor, sometimes giving profoundly misleading results. In the case of Nanog, we show how genetic reporters can compromise the behavior of important pluripotency-sustaining positive feedback loops, and induce a bifurcation in the underlying dynamics that gives rise to heterogeneous Nanog expression patterns in reporter cell lines that are not representative of the wild-type. These findings help explain the range of published observations of Nanog variability and highlight the problem of measurement in live cells.
Zhang Q, Filippi SL, Flaxman S, et al., 2017, Feature-to-feature regression for a two-step conditional independence test, Uncertainty in Artificial Intelligence
The algorithms for causal discovery and more broadly for learning the structure of graphical models require well calibrated and consistent conditional independence (CI) tests. We revisit the CI tests which are based on two-step procedures and involve regression with subsequent (unconditional) independence test (RESIT) on regression residuals and investigate the assumptions under which these tests operate. In particular, we demonstrate that when going beyond simple functional relationships with additive noise, such tests can lead to an inflated number of false discoveries. We study the relationship of these tests with those based on dependence measures using reproducing kernel Hilbert spaces (RKHS) and propose an extension of RESIT which uses RKHS-valued regression. The resulting test inherits the simple two-step testing procedure of RESIT, while giving correct Type I control and competitive power. When used as a component of the PC algorithm, the proposed test is more robust to the case where hidden variables induce a switching behaviour in the associations present in the data.
Zhang Q, Filippi S, Gretton A, et al., 2017, Large-Scale Kernel Methods for Independence Testing, Statistics and Computing, Vol: 28, Pages: 113-130, ISSN: 1573-1375
Representations of probability measures in reproducing kernel Hilbert spaces provide a flexible framework for fully nonparametric hypothesis tests of independence, which can capture any type of departure from independence, including nonlinear associations and multivariate interactions. However, these approaches come with an at least quadratic computational cost in the number of observations, which can be prohibitive in many applications. Arguably, it is exactly in such large-scale datasets that capturing any type of dependence is of interest, so striking a favourable tradeoff between computational efficiency and test performance for kernel independence tests would have a direct impact on their applicability in practice. In this contribution, we provide an extensive study of the use of large-scale kernel approximations in the context of independence testing, contrasting block-based, Nystrom and random Fourier feature approaches. Through a variety of synthetic data experiments, it is demonstrated that our novel large scale methods give comparable performance with existing methods whilst using significantly less computation time and memory.
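To illustrate why random Fourier features reduce the quadratic cost mentioned in the abstract above, here is a minimal sketch of a feature-based dependence statistic whose cost is linear in the number of observations. It is an illustration only, not the authors' implementation: the function names (`rff_features`, `rff_hsic`), the scalar inputs, the fixed bandwidth of 1.0 and the choice of 30 features are all assumptions made for this example, and no null distribution or p-value is computed.

```python
import math
import random

def rff_features(xs, n_features, bandwidth, rng):
    """Map scalars to random Fourier features approximating a Gaussian kernel."""
    # Frequencies are drawn from the spectral density of the Gaussian kernel.
    freqs = [rng.gauss(0.0, 1.0 / bandwidth) for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)
    return [[scale * math.cos(w * x + b) for w, b in zip(freqs, phases)]
            for x in xs]

def center(features):
    """Subtract the column means so each feature has zero empirical mean."""
    n = len(features)
    means = [sum(col) / n for col in zip(*features)]
    return [[v - m for v, m in zip(row, means)] for row in features]

def rff_hsic(xs, ys, n_features=30, bandwidth=1.0, seed=0):
    """Squared Frobenius norm of the empirical cross-covariance of the features.

    Cost is O(n * n_features**2), i.e. linear in the sample size n.
    """
    rng = random.Random(seed)
    zx = center(rff_features(xs, n_features, bandwidth, rng))
    zy = center(rff_features(ys, n_features, bandwidth, rng))
    n = len(xs)
    cols_x = list(zip(*zx))
    cols_y = list(zip(*zy))
    stat = 0.0
    for col_x in cols_x:
        for col_y in cols_y:
            c = sum(a * b for a, b in zip(col_x, col_y)) / n
            stat += c * c
    return stat

# Hypothetical usage: a nonlinear dependence versus an independent pair.
data_rng = random.Random(1)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(300)]
ys_dependent = [x * x + 0.1 * data_rng.gauss(0.0, 1.0) for x in xs]
ys_independent = [data_rng.gauss(0.0, 1.0) for _ in xs]
print(rff_hsic(xs, ys_dependent), rff_hsic(xs, ys_independent))
```

In practice the statistic would be calibrated against a permutation or asymptotic null distribution, and the bandwidth would be chosen from the data; both steps are omitted here.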
Wills QF, Mellado-Gomez E, Nolan R, et al., 2017, The nature and nurture of cell heterogeneity: accounting for macrophage gene-environment interactions with single-cell RNA-Seq, BMC Genomics, Vol: 18, ISSN: 1471-2164
BACKGROUND: Single-cell RNA-Seq can be a valuable and unbiased tool to dissect cellular heterogeneity, despite the transcriptome's limitations in describing higher functional phenotypes and protein events. Perhaps the most important shortfall with transcriptomic 'snapshots' of cell populations is that they risk being descriptive, only cataloging heterogeneity at one point in time, and without microenvironmental context. Studying the genetic ('nature') and environmental ('nurture') modifiers of heterogeneity, and how cell population dynamics unfold over time in response to these modifiers is key when studying highly plastic cells such as macrophages. RESULTS: We introduce the programmable Polaris™ microfluidic lab-on-chip for single-cell sequencing, which performs live-cell imaging while controlling for the culture microenvironment of each cell. Using gene-edited macrophages we demonstrate how previously unappreciated knockout effects of SAMHD1, such as an altered oxidative stress response, have a large paracrine signaling component. Furthermore, we demonstrate single-cell pathway enrichments for cell cycle arrest and APOBEC3G degradation, both associated with the oxidative stress response and altered proteostasis. Interestingly, SAMHD1 and APOBEC3G are both HIV-1 inhibitors ('restriction factors'), with no known co-regulation. CONCLUSION: As single-cell methods continue to mature, so will the ability to move beyond simple 'snapshots' of cell populations towards studying the determinants of population dynamics. By combining single-cell culture, live-cell imaging, and single-cell sequencing, we have demonstrated the ability to study cell phenotypes and microenvironmental influences. It's these microenvironmental components - ignored by standard single-cell workflows - that likely determine how macrophages, for example, react to inflammation and form treatment resistant HIV reservoirs.
Filippi S, Holmes CC, Nieto-Barajas LE, 2016, Scalable Bayesian nonparametric measures for exploring pairwise dependence via Dirichlet Process Mixtures, Electronic Journal of Statistics, Vol: 10, Pages: 3338-3354, ISSN: 1935-7524
In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criterion is that the procedures should scale to large data sets. In this regard we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not feasible computationally. To address this we present Bayesian diagnostic measures for characterising evidence against a “null model” of pairwise independence. In simulation studies, as well as for a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.
Flaxman S, Sejdinovic D, Cunningham JP, et al., 2016, Bayesian Learning of Kernel Embeddings, UAI'16
Kernel methods are one of the mainstays of machine learning, but the problem of kernel learning remains challenging, with only a few heuristics and very little theory. This is of particular importance in methods based on estimation of kernel mean embeddings of probability measures. For characteristic kernels, which include most commonly used ones, the kernel mean embedding uniquely determines its probability measure, so it can be used to design a powerful statistical testing framework, which includes nonparametric two-sample and independence tests. In practice, however, the performance of these tests can be very sensitive to the choice of kernel and its lengthscale parameters. To address this central issue, we propose a new probabilistic model for kernel mean embeddings, the Bayesian Kernel Embedding model, combining a Gaussian process prior over the Reproducing Kernel Hilbert Space containing the mean embedding with a conjugate likelihood function, thus yielding a closed form posterior over the mean embedding. The posterior mean of our model is closely related to recently proposed shrinkage estimators for kernel mean embeddings, while the posterior uncertainty is a new, interesting feature with various possible applications. Critically for the purposes of kernel learning, our model gives a simple, closed form marginal pseudolikelihood of the observed data given the kernel hyperparameters. This marginal pseudolikelihood can either be optimized to inform the hyperparameter choice or fully Bayesian inference can be used.
Filippi S, Barnes CP, Kirk PDW, et al., 2016, Robustness of MEK-ERK Dynamics and Origins of Cell-to-Cell Variability in MAPK Signaling, Cell Reports
Mahon SSM, Lenive O, Filippi S, et al., 2015, Information processing by simple molecular motifs and susceptibility to noise, Journal of The Royal Society Interface
Bhatnagar N, Perkins K, Filippi S, et al., 2014, Clinical and Hematologic Impact of Fetal and Perinatal Variables on Mutant GATA1 Clone Size in Neonates with Down Syndrome, Blood, Vol: 124, ISSN: 0006-4971
Mc Mahon SS, Sim A, Filippi S, et al., 2014, Information theory and signal transduction systems: From molecular information processing to network inference, Seminars in Cell & Developmental Biology, Vol: 35, Pages: 98-108, ISSN: 1084-9521
MacLean AL, Filippi S, Stumpf MPH, 2014, The ecology in the hematopoietic stem cell niche determines the clinical outcome in chronic myeloid leukemia, Proceedings of the National Academy of Sciences of the United States of America, Vol: 111, Pages: 3883-3888, ISSN: 0027-8424
Liepe J, Kirk P, Filippi S, et al., 2014, A framework for parameter estimation and model selection from experimental data in systems biology using approximate Bayesian computation, Nature Protocols, Vol: 9, Pages: 439-456, ISSN: 1754-2189
Silk D, Filippi S, Stumpf MPH, 2013, Optimizing threshold-schedules for sequential approximate Bayesian computation: applications to molecular systems, Statistical Applications in Genetics and Molecular Biology, Vol: 12, Pages: 603-618, ISSN: 2194-6302
The likelihood–free sequential Approximate Bayesian Computation (ABC) algorithms are increasingly popular inference tools for complex biological models. Such algorithms proceed by constructing a succession of probability distributions over the parameter space conditional upon the simulated data lying in an ε–ball around the observed data, for decreasing values of the threshold ε. While in theory, the distributions (starting from a suitably defined prior) will converge towards the unknown posterior as ε tends to zero, the exact sequence of thresholds can impact upon the computational efficiency and success of a particular application. In particular, we show here that the current preferred method of choosing thresholds as a pre-determined quantile of the distances between simulated and observed data from the previous population, can lead to the inferred posterior distribution being very different to the true posterior. Threshold selection thus remains an important challenge. Here we propose that the threshold–acceptance rate curve may be used to determine threshold schedules that avoid local optima, while balancing the need to minimise the threshold with computational efficiency. Furthermore, we provide an algorithm based upon the unscented transform, that enables the threshold–acceptance rate curve to be efficiently predicted in the case of deterministic and stochastic state space models.
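As a rough illustration of the quantile-based threshold schedule that the abstract above describes (and cautions against), the following sketch runs a small sequential ABC sampler on a toy problem. It is not the method proposed in the paper, which relies on the threshold-acceptance rate curve and the unscented transform. Everything here is an assumption made for the example: the toy Normal simulator, the uniform prior on [-5, 5], the Gaussian perturbation kernel with twice the weighted population variance, and the names `abc_smc`, `simulate` and `quantile`.

```python
import math
import random

def simulate(mu, n, rng):
    """Toy simulator: sample mean of n draws from Normal(mu, 1)."""
    return sum(rng.gauss(mu, 1.0) for _ in range(n)) / n

def quantile(values, q):
    """Empirical q-quantile using the nearest rank on the sorted values."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, int(q * len(ordered))))
    return ordered[idx]

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def abc_smc(observed, n_obs=20, n_particles=200, n_populations=4,
            q=0.5, prior=(-5.0, 5.0), seed=0):
    """Sequential ABC where each threshold is the q-quantile of the
    previous population's distances (the schedule under discussion)."""
    rng = random.Random(seed)
    lo, hi = prior
    # Population 0: draws from the prior, all accepted (threshold = infinity).
    particles = [rng.uniform(lo, hi) for _ in range(n_particles)]
    distances = [abs(simulate(p, n_obs, rng) - observed) for p in particles]
    weights = [1.0 / n_particles] * n_particles
    schedule = []
    for _ in range(n_populations):
        eps = quantile(distances, q)
        schedule.append(eps)
        # Perturbation kernel width: twice the weighted variance (an assumption).
        mean = sum(w * p for w, p in zip(weights, particles))
        var = sum(w * (p - mean) ** 2 for w, p in zip(weights, particles))
        sd = math.sqrt(2.0 * var) or 1e-3
        new_particles, new_distances, new_weights = [], [], []
        while len(new_particles) < n_particles:
            base = rng.choices(particles, weights=weights)[0]
            cand = rng.gauss(base, sd)
            if not lo <= cand <= hi:
                continue  # zero prior density outside the support
            d = abs(simulate(cand, n_obs, rng) - observed)
            if d <= eps:
                # Importance weight: prior / mixture proposal; the uniform
                # prior density is constant and cancels on normalisation.
                denom = sum(w * normal_pdf(cand, p, sd)
                            for w, p in zip(weights, particles))
                new_particles.append(cand)
                new_distances.append(d)
                new_weights.append(1.0 / denom)
        total = sum(new_weights)
        particles, distances = new_particles, new_distances
        weights = [w / total for w in new_weights]
    return particles, weights, schedule

particles, weights, schedule = abc_smc(observed=1.5)
print("thresholds:", schedule)
print("posterior mean:", sum(w * p for w, p in zip(weights, particles)))
```

Because every accepted distance lies below the current threshold, the quantile rule always produces a non-increasing sequence of thresholds; the abstract's point is that this convenience does not guarantee the final population is close to the true posterior.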
Filippi S, Barnes CP, Cornebise J, et al., 2013, On optimality of kernels for approximate Bayesian computation using sequential Monte Carlo, Statistical Applications in Genetics and Molecular Biology, Vol: 12, ISSN: 2194-6302
Liepe J, Filippi S, Komorowski ML, et al., 2013, Maximizing the Information Content of Experiments in Systems Biology, PLoS computational biology