Stadler T, Pybus OG, Stumpf MPH, 2021, Phylodynamics for cell biologists., Science, Vol: 371
Multicellular organisms are composed of cells connected by ancestry and descent from progenitor cells. The dynamics of cell birth, death, and inheritance within an organism give rise to the fundamental processes of development, differentiation, and cancer. Technical advances in molecular biology now allow us to study cellular composition, ancestry, and evolution at the resolution of individual cells within an organism or tissue. Here, we take a phylogenetic and phylodynamic approach to single-cell biology. We explain how "tree thinking" is important to the interpretation of the growing body of cell-level data and how ecological null models can benefit statistical hypothesis testing. Experimental progress in cell biology should be accompanied by theoretical developments if we are to exploit fully the dynamical information in single-cell data.
Guillemin A, Roesch E, Stumpf MPH, 2021, Uncertainty in cell fate decision making: Lessons from potential landscapes of bifurcation systems
Cell fate decision making is known to be a complex process and is still far from being understood. The intrinsic complexity, but also features such as molecular noise, represent challenges for modelling these systems. Waddington’s epigenetic landscape has become the overriding metaphor for developmental processes: it serves as a pictorial representation and can also be related to mathematical models. In this work we investigate how the landscape is affected by noise in the underlying system. Specifically, we focus on those systems where minor changes in the parameters cause major changes in the stability properties of the system, especially bifurcations. We analyse and quantify the changes in the landscape’s shape as the effects of noise increase. We find ample evidence for an intricate interplay between noise and dynamics, which can lead to qualitative change in a system’s dynamics and hence the corresponding landscape. In particular, we find that the effects can be most pronounced in the vicinity of the bifurcation point of the underlying deterministic dynamical system, which would correspond to the cell fate decision event in cellular differentiation processes.
Guillemin A, Stumpf MPH, 2021, Noise and the molecular processes underlying cell fate decision-making, PHYSICAL BIOLOGY, Vol: 18, ISSN: 1478-3967
Coomer MA, Ham L, Stumpf MPH, 2020, Shaping the Epigenetic Landscape: Complexities and Consequences
The metaphor of the Waddington epigenetic landscape has become an iconic representation of the cellular differentiation process. Recent accessibility of single-cell transcriptomic data has provided new opportunities for quantifying this originally conceptual tool that could offer insight into the gene regulatory networks underlying cellular development. While a number of methods for constructing the landscape have been proposed, by far the most commonly employed approach is based on computing the landscape as the negative logarithm of the steady-state probability distribution. Here, we use a simple model to highlight the complexities and limitations that arise when reconstructing the potential landscape in the presence of stochastic fluctuations. We consider how the landscape changes in accordance with different stochastic systems, and show that it is the subtle interplay between the deterministic and stochastic components of the system that ultimately shapes the landscape. We further discuss how the presence of noise has important implications for the identifiability of the regulatory dynamics from experimental data.
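The "negative logarithm of the steady-state probability distribution" construction mentioned in this abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' code: the function name `landscape_from_samples`, the histogram estimator, and the bimodal toy sample are all assumptions made for the example.

```python
import numpy as np

def landscape_from_samples(samples, bins=50):
    """Estimate U(x) = -ln P_s(x) from 1D samples of a stationary state."""
    density, edges = np.histogram(samples, bins=bins, density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    occupied = density > 0                      # -log is undefined for empty bins
    potential = np.full(density.shape, np.nan)  # empty bins stay NaN
    potential[occupied] = -np.log(density[occupied])
    return centres, potential

rng = np.random.default_rng(0)
# Hypothetical bimodal stationary sample: two "cell states" near -1 and +1.
samples = np.concatenate([rng.normal(-1.0, 0.5, 5000),
                          rng.normal(1.0, 0.5, 5000)])
centres, potential = landscape_from_samples(samples)

well = centres[np.nanargmin(potential)]              # location of the deepest well
barrier = potential[np.argmin(np.abs(centres))] - np.nanmin(potential)
print(f"deepest well near x = {well:.2f}, barrier height about {barrier:.2f}")
```

Because the estimate depends only on the stationary distribution, any change in the noise that reshapes that distribution also reshapes the inferred landscape, which is the limitation the abstract highlights.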
Stumpf MPH, 2020, Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds, JOURNAL OF THE ROYAL SOCIETY INTERFACE, Vol: 17, ISSN: 1742-5689
Stumpf MPH, 2020, Multi-model and network inference based on ensemble estimates: Avoiding the madness of crowds, Journal of the Royal Society Interface, Vol: 17, ISSN: 1742-5689
Recent progress in theoretical systems biology, applied mathematics and computational statistics allows us to compare the performance of different candidate models at describing a particular biological system quantitatively. Model selection has been applied with great success to problems where a small number - typically less than 10 - of models are compared, but recent studies have started to consider thousands and even millions of candidate models. Often, however, we are left with sets of models that are compatible with the data, and then we can use ensembles of models to make predictions. These ensembles can have very desirable characteristics, but as I show here are not guaranteed to improve on individual estimators or predictors. I will show in the cases of model selection and network inference when we can trust ensembles, and when we should be cautious. The analyses suggest that the careful construction of an ensemble - choosing good predictors - is of paramount importance, more than had perhaps been realized before: merely adding different methods does not suffice. The success of ensemble network inference methods is also shown to rest on their ability to suppress false-positive results. A Jupyter notebook which allows carrying out an assessment of ensemble estimators is provided.
Ham L, Jackson M, Stumpf MPH, 2020, Pathway dynamics can delineate the sources of transcriptional noise in gene expression
Single-cell expression profiling has opened up new vistas on cellular processes. Among other important results, one stand-out observation has been the confirmation of extensive cell-to-cell variability at the transcriptomic and proteomic level. Because most experimental analyses are destructive we only have access to snapshot data of cellular states. This loss of temporal information presents significant challenges in inferring dynamics, as well as causes of cell-to-cell variability. In particular, we are typically unable to separate dynamic variability within individual systems (“intrinsic noise”) from variability across the population (“extrinsic noise”). Here we mathematically formalise this non-identifiability; but we also use this to identify how new experimental set-ups coupled to statistical noise decomposition can resolve this non-identifiability. For single-cell transcriptomic data we find that systems subject to population variation invariably inflate the apparent degree of burstiness of the underlying process. Such identifiability problems can, in principle, be remedied by dual-reporter assays, which separate total gene expression noise into intrinsic and extrinsic contributions; unfortunately, however, this requires pairs of strictly independent and identical gene reporters to be integrated into the same cell, which is difficult to implement experimentally in most systems. Here we demonstrate mathematically that, in some cases, decomposition of transcriptional noise is possible with non-identical and not-necessarily independent reporters. We use our result to show that generic reporters lying in the same biochemical pathways (e.g. mRNA and protein) can replace dual reporters, enabling the noise decomposition to be obtained from only a single gene. Stochastic simulations are used to support our theory, and show that our “pathway-reporter” method compares favourably to the dual-reporter method.
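For orientation, the classic dual-reporter decomposition that this abstract uses as its baseline can be sketched as follows. This illustrates the standard estimator for two identical, independent reporters, not the paper's pathway-reporter method; the function name and the toy model (a shared gamma-distributed rate driving two Poisson reporters) are assumptions chosen so that the expected answer is known.

```python
import numpy as np

def dual_reporter_noise(x, y):
    """Split squared noise into intrinsic and extrinsic parts.

    x, y: expression levels of two identical reporters measured in the same cells.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    norm = x.mean() * y.mean()
    intrinsic = np.mean((x - y) ** 2) / (2.0 * norm)  # uncorrelated fluctuations
    extrinsic = (np.mean(x * y) - norm) / norm        # shared (correlated) fluctuations
    return intrinsic, extrinsic

rng = np.random.default_rng(0)
n_cells = 200_000
# Hypothetical extrinsic variability: each cell has its own expression rate.
rate = rng.gamma(shape=4.0, scale=25.0, size=n_cells)  # mean 100, CV^2 = 0.25
# Intrinsic variability: both reporters are Poisson given the shared rate.
x = rng.poisson(rate)
y = rng.poisson(rate)

intrinsic, extrinsic = dual_reporter_noise(x, y)
print(f"intrinsic = {intrinsic:.4f}, extrinsic = {extrinsic:.4f}")
```

For this toy model the intrinsic term should be close to 1/mean = 0.01 and the extrinsic term close to the squared coefficient of variation of the rate, 0.25.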
Diaz LPM, Stumpf MPH, 2020, Gaining confidence in inferred networks
Network inference is a notoriously challenging problem. Inferred networks are associated with high uncertainty and likely riddled with false positive and false negative interactions. Especially for biological networks we do not have good ways of judging the performance of inference methods against real networks, and instead we often rely solely on the performance against simulated data. Gaining confidence in networks inferred from real data nevertheless requires establishing reliable validation methods. Here, we argue that the expectation of mixing patterns in biological networks such as gene regulatory networks offers a reasonable starting point: interactions are more likely to occur between nodes with similar biological functions. We can quantify this behaviour using the assortativity coefficient, and here we show that the resulting heuristic, functional assortativity, offers a reliable and informative route for comparing different inference algorithms.
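The assortativity coefficient referred to here can be computed from an edge list and a set of node labels. The sketch below uses Newman's coefficient for categorical attributes on an undirected network; the gene names and functional labels are hypothetical, and how the paper assigns functional categories is not specified in the abstract.

```python
import numpy as np

def categorical_assortativity(edges, labels):
    """Newman's assortativity coefficient for an undirected, labelled network."""
    categories = sorted(set(labels.values()))
    index = {c: k for k, c in enumerate(categories)}
    e = np.zeros((len(categories), len(categories)))
    for u, v in edges:
        i, j = index[labels[u]], index[labels[v]]
        e[i, j] += 1.0   # count each undirected edge in both directions
        e[j, i] += 1.0
    e /= e.sum()                   # e[i, j]: fraction of edge ends joining i to j
    a = e.sum(axis=1)              # fraction of edge ends attached to each category
    expected = float(np.sum(a * a))  # same-category fraction under random mixing
    # Note: undefined (division by zero) if every node has the same label.
    return (np.trace(e) - expected) / (1.0 - expected)

# Hypothetical functional annotation of four genes.
labels = {"g1": "metabolism", "g2": "metabolism",
          "g3": "signalling", "g4": "signalling"}

within = [("g1", "g2"), ("g3", "g4")]   # edges only inside functional classes
across = [("g1", "g3"), ("g2", "g4")]   # edges only between functional classes

r_within = categorical_assortativity(within, labels)
r_across = categorical_assortativity(across, labels)
print(r_within, r_across)
```

Under the heuristic described in the abstract, an inferred network whose coefficient is closer to the first case than the second would be judged more plausible.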
Croydon Veleslavov IA, Stumpf MPH, 2020, Repeated Decision Stumping Distils Simple Rules from Single Cell Data
Here we introduce repeated decision stumping to distil simple models from single cell data. We develop decision trees of depth one (hence ‘stumps’) to identify, in an inductive manner, gene products involved in driving cell fate transitions. In applications to published data we are able to discover the key players involved in these processes in an unbiased manner, without prior knowledge. The approach is computationally efficient, has remarkable predictive power, and yields robust and statistically stable predictors: the same set of candidates is generated by applying the algorithm to different subsamples of the data.
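A decision stump, the building block named in this abstract, is a tree with a single split on a single feature. The sketch below fits one stump by exhaustive search. It is an illustration only: the abstract does not describe the repetition scheme, so none is attempted here, and the function name and synthetic expression matrix are assumptions.

```python
import numpy as np

def best_stump(X, y):
    """Return (feature, threshold, accuracy) of the best depth-one tree."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=bool)
    best = (None, None, 0.0)
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        thresholds = 0.5 * (values[:-1] + values[1:])  # midpoints between observed values
        for t in thresholds:
            acc = np.mean((X[:, j] > t) == y)
            acc = max(acc, 1.0 - acc)   # allow either orientation of the split
            if acc > best[2]:
                best = (j, t, acc)
    return best

rng = np.random.default_rng(0)
# Hypothetical data: 200 cells, 5 genes; fate is decided by gene 2 alone.
X = rng.normal(size=(200, 5))
y = X[:, 2] > 0.0

feature, threshold, accuracy = best_stump(X, y)
print(f"gene {feature} split at {threshold:.3f} with accuracy {accuracy:.2f}")
```

Refitting the stump on random subsamples of the cells and checking that the same gene is selected is one simple way to probe the kind of stability the abstract reports.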
He F, Stumpf M, Kleijn I, et al., 2020, GpABC: a Julia package for approximate Bayesian computation with Gaussian process emulation, Bioinformatics, Vol: 36, Pages: 3286-3287, ISSN: 1367-4803
Motivation: Approximate Bayesian computation (ABC) is an important framework within which to infer the structure and parameters of a systems biology model. It is especially suitable for biological systems with stochastic and nonlinear dynamics, for which the likelihood functions are intractable. However, the associated computational cost often limits ABC to models that are relatively quick to simulate in practice. Results: We here present a Julia package, GpABC, that implements parameter inference and model selection for deterministic or stochastic models using i) standard rejection ABC or ABC-SMC, or ii) ABC with Gaussian process emulation. The latter significantly reduces the computational cost. Availability and Implementation: https://github.com/tanhevg/GpABC.jl Supplementary information: Supplementary data are available at Bioinformatics online.
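The standard rejection ABC scheme mentioned above can be sketched generically. The code below is not the GpABC API (GpABC is a Julia package); it is a plain Python illustration in which the function `rejection_abc`, the Gaussian toy model, the uniform prior, and the threshold `eps` are all assumptions.

```python
import numpy as np

def rejection_abc(observed, simulate, prior_sample, distance, eps, n_draws, rng):
    """Keep prior draws whose simulated data fall within eps of the observations."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)              # 1. draw a parameter from the prior
        simulated = simulate(theta, rng)       # 2. simulate data under that parameter
        if distance(simulated, observed) <= eps:
            accepted.append(theta)             # 3. accept if close to the observations
    return np.array(accepted)

rng = np.random.default_rng(1)
# Hypothetical observations: 100 draws from a normal with unknown mean 2.0.
observed = rng.normal(2.0, 1.0, size=100)

posterior = rejection_abc(
    observed,
    simulate=lambda mu, rng: rng.normal(mu, 1.0, size=100),
    prior_sample=lambda rng: rng.uniform(-5.0, 5.0),
    distance=lambda sim, obs: abs(sim.mean() - obs.mean()),  # compare sample means
    eps=0.1,
    n_draws=20_000,
    rng=rng,
)
print(f"{len(posterior)} accepted, posterior mean about {posterior.mean():.2f}")
```

Every draw costs one full simulation, which is why the abstract notes that plain ABC is limited to models that are quick to simulate; emulation replaces most of those simulator calls with a cheaper surrogate.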
Ham L, Schnoerr D, Brackston RD, et al., 2020, Exactly solvable models of stochastic gene expression, JOURNAL OF CHEMICAL PHYSICS, Vol: 152, ISSN: 0021-9606
Ham L, Brackston R, Stumpf MPH, 2020, Extrinsic Noise and Heavy-Tailed Laws in Gene Expression, PHYSICAL REVIEW LETTERS, Vol: 124, ISSN: 0031-9007
Ham L, Schnoerr D, Brackston RD, et al., 2020, Exactly solvable models of stochastic gene expression
Stochastic models are key to understanding the intricate dynamics of gene expression. But the simplest models which only account for e.g. active and inactive states of a gene fail to capture common observations in both prokaryotic and eukaryotic organisms. Here we consider multistate models of gene expression which generalise the canonical Telegraph process, and are capable of capturing the joint effects of e.g. transcription factors, heterochromatin state and DNA accessibility (or, in prokaryotes, Sigma-factor activity) on transcript abundance. We propose two approaches for solving classes of these generalised systems. The first approach offers a fresh perspective on a general class of multistate models, and allows us to “decompose” more complicated systems into simpler processes, each of which can be solved analytically. This enables us to obtain a solution of any model from this class. We further show that these models cannot have a heavy-tailed distribution in the absence of extrinsic noise. Next, we develop an approximation method based on a power series expansion of the stationary distribution for an even broader class of multistate models of gene transcription. The combination of analytical and computational solutions for these realistic gene expression models also holds the potential to design synthetic systems, and control the behaviour of naturally evolved gene expression systems, e.g. in guiding cell-fate decisions.
Recent progress in theoretical systems biology, applied mathematics and computational statistics allows us to compare quantitatively the performance of different candidate models at describing a particular biological system. Model selection has been applied with great success to problems where a small number (typically less than 10) of models are compared, but recent studies have started to consider thousands and even millions of candidate models. Often, however, we are left with sets of models that are compatible with the data, and then we can use ensembles of models to make predictions. These ensembles can have very desirable characteristics, but as I show here are not guaranteed to improve on individual estimators or predictors. I will show in the cases of model selection and network inference when we can trust ensembles, and when we should be cautious. The analyses suggest that the careful construction of an ensemble, choosing good predictors, is of paramount importance, more than had perhaps been realised before: merely adding different methods does not suffice. The success of ensemble network inference methods is also shown to rest on their ability to suppress false-positive results. A Jupyter notebook which allows carrying out an assessment of ensemble estimators is provided.
Scholes NS, Schnoerr D, Isalan M, et al., 2019, A Comprehensive Network Atlas Reveals That Turing Patterns Are Common but Not Robust, CELL SYSTEMS, Vol: 9, Pages: 515-517, ISSN: 2405-4712
Roesch E, Stumpf MPH, 2019, Parameter inference in dynamical systems with co-dimension 1 bifurcations, ROYAL SOCIETY OPEN SCIENCE, Vol: 6, ISSN: 2054-5703
Scholes N, Schnoerr D, Isalan M, et al., 2019, A comprehensive network atlas reveals that Turing patterns are common but not robust, Cell Systems, Vol: 9, Pages: 243-257.e4, ISSN: 2405-4712
Turing patterns (TPs) underlie many fundamental developmental processes, but they operate over narrow parameter ranges, raising the conundrum of how evolution can ever discover them. Here we explore TP design space to address this question and to distill design rules. We exhaustively analyze 2- and 3-node biological candidate Turing systems, amounting to 7,625 networks and more than 3 × 10^11 analyzed scenarios. We find that network structure alone neither implies nor guarantees emergent TPs. A large fraction (>61%) of network design space can produce TPs, but these are sensitive to even subtle changes in parameters, network structure, and regulatory mechanisms. This implies that TP networks are more common than previously thought, and evolution might regularly encounter prototypic solutions. We deduce compositional rules for TP systems that are almost necessary and sufficient (96% of TP networks contain them, and 92% of networks implementing them produce TPs). This comprehensive network atlas provides the blueprints for identifying natural TPs and for engineering synthetic systems.
Stumpf M, 2019, LislPisl/Bifurcations: First release of Bifurcations
He F, Stumpf MPH, 2019, Quantifying Dynamic Regulation in Metabolic Pathways with Nonparametric Flux Inference, BIOPHYSICAL JOURNAL, Vol: 116, Pages: 2035-2046, ISSN: 0006-3495
Chan TE, Stumpf MPH, Babtie AC, 2019, Gene Regulatory Networks from Single Cell Data for Exploring Cell Fate Decisions., Methods Mol Biol, Vol: 1975, Pages: 211-238
Single cell experimental techniques now allow us to quantify gene expression in up to thousands of individual cells. These data reveal the changes in transcriptional state that occur as cells progress through development and adopt specialized cell fates. In this chapter we describe in detail how to use our network inference algorithm (PIDC), and the associated software package NetworkInference.jl, to infer functional interactions between genes from the observed gene expression patterns. We exploit the large sample sizes and inherent variability of single cell data to detect statistical dependencies between genes that indicate putative (co-)regulatory relationships, using multivariate information measures that can capture complex statistical relationships. We provide guidelines on how best to combine this analysis with other complementary methods designed to explore single cell data, and how to interpret the resulting gene regulatory network models to gain insight into the processes regulating cell differentiation.
Roesch E, Stumpf MPH, 2019, Parameter inference in dynamical systems with co-dimension 1 bifurcations
Dynamical systems with intricate behaviour are all-pervasive in biology. Many of the most interesting biological processes indicate the presence of bifurcations, i.e. phenomena where a small change in a system parameter causes qualitatively different behaviour. Bifurcation theory has become a rich field of research in its own right and evaluating the bifurcation behaviour of a given dynamical system can be challenging. An even greater challenge, however, is to learn the bifurcation structure of dynamical systems from data, where the precise model structure is not known. Here we study one aspect of this problem: the practical implications that the presence of bifurcations has on our ability to infer model parameters and initial conditions from empirical data; we focus on the canonical co-dimension 1 bifurcations and provide a comprehensive analysis of how dynamics and our ability to infer kinetic parameters are linked. The picture thus emerging is surprisingly nuanced and suggests that identification of the qualitative dynamics (the bifurcation diagram) should precede any attempt at inferring kinetic parameters.
Ham L, Brackston RD, Stumpf MPH, 2019, Extrinsic noise and heavy-tailed laws in gene expression
Noise in gene expression is one of the hallmarks of life at the molecular scale. Here we derive analytical solutions to a set of models describing the molecular mechanisms underlying transcription of DNA into RNA. Our Ansatz allows us to incorporate the effects of extrinsic noise (encompassing factors external to the transcription of the individual gene) and discuss the ramifications for heterogeneity in gene product abundance that has been widely observed in single cell data. Crucially, we are able to show that heavy-tailed distributions of RNA copy numbers cannot result from the intrinsic stochasticity in gene expression alone, but must instead reflect extrinsic sources of variability.
Dony L, He F, Stumpf MPH, 2019, Parametric and non-parametric gradient matching for network inference: a comparison, BMC BIOINFORMATICS, Vol: 20, ISSN: 1471-2105
Jetka T, Nienałtowski K, Filippi S, et al., 2018, An information-theoretic framework for deciphering pleiotropic and noisy biochemical signaling, Nature Communications, Vol: 9, ISSN: 2041-1723
Many components of signaling pathways are functionally pleiotropic, and signaling responses are marked with substantial cell-to-cell heterogeneity. Therefore, biochemical descriptions of signaling require quantitative support to explain how complex stimuli (inputs) are encoded in distinct activities of pathways effectors (outputs). A unique perspective of information theory cannot be fully utilized due to lack of modeling tools that account for the complexity of biochemical signaling, specifically for multiple inputs and outputs. Here, we develop a modeling framework of information theory that allows for efficient analysis of models with multiple inputs and outputs; accounts for temporal dynamics of signaling; enables analysis of how signals flow through shared network components; and is not restricted by limited variability of responses. The framework allows us to explain how identity and quantity of type I and type III interferon variants could be recognized by cells despite activating the same signaling effectors.
Brackston R, Lakatos E, Stumpf MPH, 2018, Transition state characteristics during cell differentiation, PLoS Computational Biology, Vol: 14, Pages: 1-24, ISSN: 1553-734X
Models describing the process of stem-cell differentiation are plentiful, and may offer insights into the underlying mechanisms and experimentally observed behaviour. Waddington’s epigenetic landscape has been providing a conceptual framework for differentiation processes since its inception. It also allows, however, for detailed mathematical and quantitative analyses, as the landscape can, at least in principle, be related to mathematical models of dynamical systems. Here we focus on a set of dynamical systems features that are intimately linked to cell differentiation, by considering exemplar dynamical models that capture important aspects of stem cell differentiation dynamics. These models allow us to map the paths that cells take through gene expression space as they move from one fate to another, e.g. from a stem-cell to a more specialized cell type. Our analysis highlights the role of the transition state (TS) that separates distinct cell fates, and how the nature of the TS changes as the underlying landscape changes—change that can be induced by e.g. cellular signaling. We demonstrate that models for stem cell differentiation may be interpreted in terms of either a static or transitory landscape. For the static case the TS represents a particular transcriptional profile that all cells approach during differentiation. Alternatively, the TS may refer to the commonly observed period of heterogeneity as cells undergo stochastic transitions.
Brackston R, Wynn A, Stumpf MPH, 2018, Construction of quasi-potentials for stochastic dynamical systems: An optimization approach, Physical Review E, Vol: 98, ISSN: 1539-3755
The construction of effective and informative landscapes for stochastic dynamical systems has proven a long-standing and complex problem. In many situations, the dynamics may be described by a Langevin equation while constructing a landscape comes down to obtaining the quasipotential, a scalar function that quantifies the likelihood of reaching each point in the state space. In this work we provide a novel method for constructing such landscapes by extending a tool from control theory: the sum-of-squares method for generating Lyapunov functions. Applicable to any system described by polynomials, this method provides an analytical polynomial expression for the potential landscape, in which the coefficients of the polynomial are obtained via a convex optimization problem. The resulting landscapes are based on a decomposition of the deterministic dynamics of the original system, formed in terms of the gradient of the potential and a remaining “curl” component. By satisfying the condition that the inner product of the gradient of the potential and the remaining dynamics is everywhere negative, our derived landscapes provide both upper and lower bounds on the true quasipotential; these bounds becoming tight if the decomposition is orthogonal. The method is demonstrated to correctly compute the quasipotential for high-dimensional linear systems and also for a number of nonlinear examples.
Stumpf MPH, 2018, Biology challenging statistics, STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, Vol: 17, ISSN: 2194-6302
McEwen KR, Linnett S, Leitch HG, et al., 2018, Signalling pathways drive heterogeneity of ground state pluripotency
Pluripotent stem cells (PSCs) can self-renew indefinitely while maintaining the ability to generate all cell types of the body. This plasticity is proposed to require heterogeneity in gene expression, driving a metastable state which may allow flexible cell fate choices. Contrary to this, naive PSCs grown in fully defined ‘2i’ environmental conditions, containing small molecule inhibitors of MEK and GSK3 kinases, show homogeneous pluripotency and lineage marker expression. However, here we show that 2i induces greater genome-wide heterogeneity than traditional serum-containing growth environments at the population level across both male and female PSCs. This heterogeneity is dynamic and reversible over time, consistent with a dynamic metastable equilibrium of the pluripotent state. We further show that the 2i environment causes increased heterogeneity in the calcium signalling pathway at both the population and single-cell level. Mechanistically, we identify loss of robustness regulators in the form of negative feedback to the upstream EGF receptor. Our findings advance the current understanding of the plastic nature of the pluripotent state and highlight the role of signalling pathways in the control of transcriptional heterogeneity. Furthermore, our results have critical implications for the current use of kinase inhibitors in the clinic, where inducing heterogeneity may increase the risk of cancer metastasis and drug resistance.
Dony L, Mackerodt J, Ward S, et al., 2018, PEITH(Theta): perfecting experiments with information theory in Python with GPU support, Bioinformatics, Vol: 34, Pages: 1249-1250, ISSN: 1367-4803
Motivation: Different experiments provide differing levels of information about a biological system. This makes it difficult, a priori, to select one of them beyond mere speculation and/or belief, especially when resources are limited. With the increasing diversity of experimental approaches and general advances in quantitative systems biology, methods that inform us about the information content that a given experiment carries about the question we want to answer become crucial. Results: PEITH(Θ) is a general-purpose Python framework for experimental design in systems biology. PEITH(Θ) uses Bayesian inference and information theory to determine which experiments are most informative for estimating all model parameters and/or performing model predictions. Availability and implementation: https://github.com/MichaelPHStumpf/Peitho