Imperial College London

Dr Barbara Bravi

Faculty of Natural Sciences, Department of Mathematics

Lecturer in Biomathematics

Contact

 

b.bravi21

Location

 

Huxley Building, South Kensington Campus


Publications


16 results found

Bravi B, 2024, Development and use of machine learning algorithms in vaccine target selection, NPJ Vaccines, Vol: 9

Computer-aided discovery of vaccine targets has become a cornerstone of rational vaccine design. In this article, I discuss how Machine Learning (ML) can inform and guide key computational steps in rational vaccine design concerned with the identification of B and T cell epitopes and correlates of protection. I provide examples of ML models, as well as types of data and predictions for which they are built. I argue that interpretable ML has the potential to improve the identification of immunogens also as a tool for scientific discovery, by helping elucidate the molecular processes underlying vaccine-induced immune responses. I outline the limitations and challenges in terms of data availability and method development that need to be addressed to bridge the gap between advances in ML predictions and their translational application to vaccine design.

Journal article

Bravi B, Di Gioacchino A, Fernandez-de-Cossio-Diaz J, Walczak AM, Mora T, Cocco S, Monasson R et al., 2023, A transfer-learning approach to predict antigen immunogenicity and T-cell receptor specificity, Elife, Vol: 12

Antigen immunogenicity and the specificity of binding of T-cell receptors to antigens are key properties underlying effective immune responses. Here we propose diffRBM, an approach based on transfer learning and Restricted Boltzmann Machines, to build sequence-based predictive models of these properties. DiffRBM is designed to learn the distinctive patterns in amino-acid composition that, on the one hand, underlie the antigen's probability of triggering a response, and on the other hand the T-cell receptor's ability to bind to a given antigen. We show that the patterns learnt by diffRBM allow us to predict putative contact sites of the antigen-receptor complex. We also discriminate immunogenic and non-immunogenic antigens, antigen-specific and generic receptors, reaching performances that compare favorably to existing sequence-based predictors of antigen immunogenicity and T-cell receptor specificity.

Journal article

Meysman P, Barton J, Bravi B, Cohen-Lavi L, Karnaukhov V, Lilleskov E, Montemurro A, Nielsen M, Mora T, Pereira P, Postovskaya A, Martínez MR, Fernandez-de-Cossio-Diaz J, Vujkovic A, Walczak AM, Weber A, Yin R, Eugster A, Sharma V et al., 2023, Benchmarking solutions to the T-cell receptor epitope prediction problem: IMMREP22 workshop report, ImmunoInformatics, Vol: 9, Pages: 1-8, ISSN: 2667-1190

Many different solutions to predicting the cognate epitope target of a T-cell receptor (TCR) have been proposed. However, several questions on the advantages and disadvantages of these different approaches remain unresolved, as most methods have only been evaluated within the context of their initial publications and data sets. Here, we report the findings of the first public TCR-epitope prediction benchmark performed on 23 prediction models in the context of the ImmRep 2022 TCR-epitope specificity workshop. This benchmark revealed that the use of paired-chain alpha-beta, as well as CDR1/2 or V/J information, when available, improves classification obtained with CDR3 data, independent of the underlying approach. In addition, we found that straightforward distance-based approaches can achieve a respectable performance when compared to more complex machine-learning models. Finally, we highlight the need for a truly independent follow-up benchmark and provide recommendations for the design of such a next benchmark.

Journal article

Łuksza M, Sethna ZM, Rojas LA, Lihm J, Bravi B, Elhanati Y, Soares K, Amisaki M, Dobrin A, Hoyos D, Guasp P, Zebboudj A, Yu R, Chandra AK, Waters T, Odgerel Z, Leung J, Kappagantula R, Makohon-Moore A, Johns A, Gill A, Gigoux M, Wolchok J, Merghoub T, Sadelain M, Patterson E, Monasson R, Mora T, Walczak AM, Cocco S, Iacobuzio-Donahue C, Greenbaum BD, Balachandran VP et al., 2022, Neoantigen quality predicts immunoediting in survivors of pancreatic cancer, Nature, Vol: 606, ISSN: 0028-0836

Cancer immunoediting [1] is a hallmark of cancer [2] that predicts that lymphocytes kill more immunogenic cancer cells to cause less immunogenic clones to dominate a population. Although proven in mice [1,3], whether immunoediting occurs naturally in human cancers remains unclear. Here, to address this, we investigate how 70 human pancreatic cancers evolved over 10 years. We find that, despite having more time to accumulate mutations, rare long-term survivors of pancreatic cancer who have stronger T cell activity in primary tumours develop genetically less heterogeneous recurrent tumours with fewer immunogenic mutations (neoantigens). To quantify whether immunoediting underlies these observations, we infer that a neoantigen is immunogenic (high-quality) by two features: 'non-selfness', based on neoantigen similarity to known antigens [4,5], and 'selfness', based on the antigenic distance required for a neoantigen to differentially bind to the MHC or activate a T cell compared with its wild-type peptide. Using these features, we estimate cancer clone fitness as the aggregate cost of T cells recognizing high-quality neoantigens offset by gains from oncogenic mutations. With this model, we predict the clonal evolution of tumours to reveal that long-term survivors of pancreatic cancer develop recurrent tumours with fewer high-quality neoantigens. Thus, we submit evidence that the human immune system naturally edits neoantigens. Furthermore, we present a model to predict how immune pressure induces cancer cell populations to evolve over time. More broadly, our results argue that the immune system fundamentally surveils host genetic changes to suppress cancer.

Journal article

Bravi B, 2021, Il sistema immunitario attraverso la lente dell'inferenza statistica [The immune system through the lens of statistical inference], Ithaca: Viaggio nella Scienza, Vol: 18, ISSN: 2282-8079

The immune system is capable of mounting extremely specific responses that, at the molecular level, rely on the recognition of external pathogens. Following the recent boom in sequencing techniques, it has become possible to catalogue in detail the sets of proteins involved in this recognition, producing unprecedented resources for quantitatively characterizing their properties and functioning. The aim of this article is to give an overview of some approaches to modelling the immune system that are based on sequencing data and that combine the exploratory and predictive power of statistical learning with the interpretability of statistical mechanics models. While the primary goal of these approaches is to establish a framework for the theoretical understanding of immune response mechanisms at the microscopic level, their predictions also have important practical implications for the development of vaccines and immunotherapy.

Journal article

Bravi B, Balachandran VP, Greenbaum BD, Walczak AM, Mora T, Monasson R, Cocco S et al., 2021, Probing T-cell response by sequence-based probabilistic modeling, PLoS Computational Biology, Vol: 17, Pages: 1-27, ISSN: 1553-734X

With the increasing ability to use high-throughput next-generation sequencing to quantify the diversity of the human T cell receptor (TCR) repertoire, the ability to use TCR sequences to infer antigen specificity could greatly aid potential diagnostics and therapeutics. Here, we use a machine-learning approach known as the Restricted Boltzmann Machine to develop a sequence-based inference approach to identify antigen-specific TCRs. Our approach combines probabilistic models of TCR sequences with clone abundance information to extract TCR sequence motifs central to an antigen-specific response. We use this model to identify patient-personalized TCR motifs that respond to individual tumor and infectious disease antigens, and to accurately discriminate specific from non-specific responses. Furthermore, the hidden structure of the model results in an interpretable representation space where TCRs responding to the same antigen cluster, correctly discriminating the response of TCRs to different viral epitopes. The model can be used to identify condition-specific responding TCRs. We focus on the examples of TCRs reactive to candidate neoantigens and selected epitopes in experiments of stimulated TCR clone expansion.

Journal article

Bravi B, Tubiana J, Cocco S, Monasson R, Mora T, Walczak AM et al., 2021, RBM-MHC: a semi-supervised machine-learning method for sample-specific prediction of antigen presentation by HLA-I alleles, Cell Systems, Vol: 12, Pages: 195-202.e9, ISSN: 2405-4712

The recent increase of immunopeptidomics data, obtained by mass spectrometry or binding assays, opens up possibilities for investigating endogenous antigen presentation by the highly polymorphic human leukocyte antigen class I (HLA-I) protein. State-of-the-art methods predict with high accuracy presentation by HLA alleles that are well represented in databases at the time of release but have a poorer performance for rarer and less characterized alleles. Here, we introduce a method based on Restricted Boltzmann Machines (RBMs) for prediction of antigens presented on the Major Histocompatibility Complex (MHC) encoded by HLA genes—RBM-MHC. RBM-MHC can be trained on custom and newly available samples with no or a small amount of HLA annotations. RBM-MHC ensures improved predictions for rare alleles and matches state-of-the-art performance for well-characterized alleles while being less data demanding. RBM-MHC is shown to be a flexible and easily interpretable method that can be used as a predictor of cancer neoantigens and viral epitopes, as a tool for feature discovery, and to reconstruct peptide motifs presented on specific HLA molecules.

Journal article

Bravi B, Rubin KJ, Sollich P, 2020, Systematic model reduction captures the dynamics of extrinsic noise in biochemical subnetworks, Journal of Chemical Physics, Vol: 153, Pages: 1-20, ISSN: 0021-9606

We consider the general problem of describing the dynamics of subnetworks of larger biochemical reaction networks, e.g., protein interaction networks involving complex formation and dissociation reactions. We propose the use of model reduction strategies to understand the "extrinsic" sources of stochasticity arising from the rest of the network. Our approaches are based on subnetwork dynamical equations derived by projection methods and path integrals. The results provide a principled derivation of different components of the extrinsic noise that is observed experimentally in cellular biochemical reactions, over and above the intrinsic noise from the stochasticity of biochemical events in the subnetwork. We explore several intermediate approximations to assess systematically the relative importance of different extrinsic noise components, including initial transients, long-time plateaus, temporal correlations, multiplicative noise terms, and nonlinear noise propagation. The best approximations achieve excellent accuracy in quantitative tests on a simple protein network and on the epidermal growth factor receptor signaling network.

Journal article

Bravi B, Ravasio R, Brito C, Wyart M et al., 2020, Direct coupling analysis of epistasis in allosteric materials, PLoS Computational Biology, Vol: 16, Pages: 1-19, ISSN: 1553-734X

In allosteric proteins, the binding of a ligand modifies function at a distant active site. Such allosteric pathways can be used as targets for drug design, generating considerable interest in inferring them from sequence alignment data. Currently, different methods lead to conflicting results, in particular on the existence of long-range evolutionary couplings between distant amino acids mediating allostery. Here we propose a resolution of this conundrum, by studying epistasis and its inference in models where an allosteric material is evolved in silico to perform a mechanical task. We find in our model the four types of epistasis (Synergistic, Sign, Antagonistic, Saturation), which can be either short- or long-range and have a simple mechanical interpretation. We perform a Direct Coupling Analysis (DCA) and find that DCA predicts well the cost of point mutations but is a rather poor generative model. Strikingly, it can predict short-range epistasis but fails to capture long-range epistasis, consistent with empirical findings. We propose that such failure is generic when function requires subparts to work in concert. We illustrate this idea with a simple model, which suggests that other methods may be better suited to capture long-range effects.

Journal article

Grigolon S, Bravi B, Martin OC, 2018, Responses to auxin signals: an operating principle for dynamical sensitivity yet high resilience, Royal Society Open Science, Vol: 5, Pages: 1-15, ISSN: 2054-5703

Plants depend on the signalling of the phytohormone auxin for their development and for responding to environmental perturbations. The associated biomolecular signalling network involves a negative feedback on Aux/IAA proteins which mediate the influence of auxin (the signal) on the auxin response factor (ARF) transcription factors (the drivers of the response). To probe the role of this feedback, we consider alternative in silico signalling networks implementing different operating principles. By a comparative analysis, we find that the presence of a negative feedback allows the system to have a far larger sensitivity in its dynamical response to auxin and that this sensitivity does not prevent the system from being highly resilient. Given this insight, we build a new biomolecular signalling model for quantitatively describing such Aux/IAA and ARF responses.

Journal article

Bravi B, Sollich P, 2017, Statistical physics approaches to subnetwork dynamics in biochemical systems, Physical Biology, Vol: 14, Pages: 1-27, ISSN: 1478-3967

We apply a Gaussian variational approximation to model reduction in large biochemical networks of unary and binary reactions. We focus on a small subset of variables (subnetwork) of interest, e.g. because they are accessible experimentally, embedded in a larger network (bulk). The key goal is to write dynamical equations reduced to the subnetwork but still retaining the effects of the bulk. As a result, the subnetwork-reduced dynamics contains a memory term and an extrinsic noise term with non-trivial temporal correlations. We first derive expressions for this memory and noise in the linearized (Gaussian) dynamics and then use a perturbative power expansion to obtain first order nonlinear corrections. For the case of vanishing intrinsic noise, our description is explicitly shown to be equivalent to projection methods up to quadratic terms, but it is applicable also in the presence of stochastic fluctuations in the original dynamics. An example from the epidermal growth factor receptor signalling pathway is provided to probe the increased prediction accuracy and computational efficiency of our method.

Journal article

Bravi B, Sollich P, 2017, Critical scaling in hidden state inference for linear Langevin dynamics, Journal of Statistical Mechanics: Theory and Experiment, Vol: 2017, Pages: 1-31, ISSN: 1742-5468

We consider the problem of inferring the dynamics of unknown (i.e. hidden) nodes from a set of observed trajectories and study analytically the average prediction error and the typical relaxation time of correlations between errors. We focus on a stochastic linear dynamics of continuous degrees of freedom interacting via random Gaussian couplings in the infinite network size limit. The expected error on the hidden time courses can be found as the equal-time hidden-to-hidden covariance of the probability distribution conditioned on observations. In the stationary regime, we analyze the phase diagram in the space of relevant parameters, namely the ratio between the numbers of observed and hidden nodes, the degree of symmetry of the interactions and the amplitudes of the hidden-to-hidden and hidden-to-observed couplings relative to the decay constant of the internal hidden dynamics. In particular, we identify critical regions in parameter space where the relaxation time and the inference error diverge, and determine the corresponding scaling behaviour.

Journal article

Bravi B, Sollich P, 2017, Inference for dynamics of continuous variables: the extended Plefka expansion with hidden nodes, Journal of Statistical Mechanics: Theory and Experiment, Vol: 2017, Pages: 1-28, ISSN: 1742-5468

We consider the problem of a subnetwork of observed nodes embedded into a larger bulk of unknown (i.e. hidden) nodes, where the aim is to infer these hidden states given information about the subnetwork dynamics. The biochemical networks underlying many cellular and metabolic processes are important realizations of such a scenario as typically one is interested in reconstructing the time evolution of unobserved chemical concentrations starting from the experimentally more accessible ones. We present an application to this problem of a novel dynamical mean field approximation, the extended Plefka expansion, which is based on a path integral description of the stochastic dynamics. As a paradigmatic model we study the stochastic linear dynamics of continuous degrees of freedom interacting via random Gaussian couplings. The resulting joint distribution is known to be Gaussian and this allows us to fully characterize the posterior statistics of the hidden nodes. In particular the equal-time hidden-to-hidden variance—conditioned on observations—gives the expected error at each node when the hidden time courses are predicted based on the observations. We assess the accuracy of the extended Plefka expansion in predicting these single node variances as well as error correlations over time, focussing on the role of the system size and the number of observed nodes.

Journal article

Bravi B, Opper M, Sollich P, 2017, Inferring hidden states in Langevin dynamics on large networks: Average case performance, Physical Review E, Vol: 95, ISSN: 2470-0045

We present average performance results for dynamical inference problems in large networks, where a set of nodes is hidden while the time trajectories of the others are observed. Examples of this scenario can occur in signal transduction and gene regulation networks. We focus on the linear stochastic dynamics of continuous variables interacting via random Gaussian couplings of generic symmetry. We analyze the inference error, given by the variance of the posterior distribution over hidden paths, in the thermodynamic limit and as a function of the system parameters and the ratio α between the number of hidden and observed nodes. By applying Kalman filter recursions we find that the posterior dynamics is governed by an “effective” drift that incorporates the effect of the observations. We present two approaches for characterizing the posterior variance that allow us to tackle, respectively, equilibrium and nonequilibrium dynamics. The first appeals to Random Matrix Theory and reveals average spectral properties of the inference error and typical posterior relaxation times; the second is based on dynamical functionals and yields the inference error as the solution of an algebraic equation.

Journal article

Bravi B, Sollich P, Opper M, 2016, Extended Plefka expansion for stochastic dynamics, Journal of Physics A: Mathematical and Theoretical, Vol: 49, Pages: 1-39, ISSN: 1751-8113

We propose an extension of the Plefka expansion, which is well known for the dynamics of discrete spins, to stochastic differential equations with continuous degrees of freedom and exhibiting generic nonlinearities. The scenario is sufficiently general to allow application to e.g. biochemical networks involved in metabolism and regulation. The main feature of our approach is to constrain in the Plefka expansion not just first moments akin to magnetizations, but also second moments, specifically two-time correlations and responses for each degree of freedom. The end result is an effective equation of motion for each single degree of freedom, where couplings to other variables appear as a self-coupling to the past (i.e. memory term) and a coloured noise. This constitutes a new mean field approximation that should become exact in the thermodynamic limit of a large network, for suitably long-ranged couplings. For the analytically tractable case of linear dynamics we establish this exactness explicitly by appeal to spectral methods of random matrix theory, for Gaussian couplings with arbitrary degree of symmetry.

Journal article

Bravi B, Longo G, 2015, The Unconventionality of Nature: Biology, from Noise to Functional Randomness, Unconventional Computation & Natural Computation Conference (UCNC)

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
