Imperial College London

Dr Kirill Veselkov

Faculty of Medicine, Department of Surgery & Cancer

Lecturer
 
 
 

Contact

 

+44 (0)20 7594 3899

kirill.veselkov04

 
 

Location

 

Sir Alexander Fleming Building, South Kensington Campus



Publications


78 results found

Kamal F, Kumar S, Edwards MR, Veselkov K, Belluomo I, Kebadze T, Romano A, Trujillo-Torralbo M-B, Shahridan Faiez T, Walton R, Ritchie AI, Wiseman DJ, Laponogov I, Donaldson G, Wedzicha JA, Johnston SL, Singanayagam A, Hanna GB et al., 2021, Virus-induced volatile organic compounds are detectable in exhaled breath during pulmonary infection, American Journal of Respiratory and Critical Care Medicine, ISSN: 1073-449X

BACKGROUND: Chronic obstructive pulmonary disease (COPD) is a condition punctuated by acute exacerbations commonly triggered by viral and/or bacterial infection. Early identification of the exacerbation trigger is important to guide appropriate therapy but currently available tests are slow and imprecise. Volatile organic compounds (VOCs) can be detected in exhaled breath and have the potential to be rapid tissue-specific biomarkers of infection aetiology. METHODS: We used serial sampling within in vitro and in vivo studies to elucidate the dynamic changes that occur in VOC production during acute respiratory viral infection. Highly sensitive gas-chromatography mass spectrometry (GC-MS) techniques were used to measure VOC production from infected airway epithelial cell cultures and in exhaled breath samples of healthy subjects experimentally challenged with rhinovirus A16 and COPD subjects with naturally-occurring exacerbations. RESULTS: We identified a novel VOC signature comprising decane and other related long chain alkane compounds that is induced during rhinovirus infection of cultured airway epithelial cells and is also increased in the exhaled breath of healthy subjects experimentally challenged with rhinovirus and of COPD patients during naturally-occurring viral exacerbations. These compounds correlated with magnitude of anti-viral immune responses, virus burden and exacerbation severity but were not induced by bacterial infection, suggesting they represent a specific virus-inducible signature. CONCLUSION: Our study highlights the potential for measurement of exhaled breath VOCs as rapid, non-invasive biomarkers of viral infection. Further studies are needed to determine whether measurement of these signatures could be used to guide more targeted therapy with antibiotic/antiviral agents for COPD exacerbations.

Journal article

Borgas P, Gonzalez G, Veselkov K, Mirnezami R et al., 2021, Phytochemically rich dietary components and the risk of colorectal cancer: A systematic review and meta-analysis of observational studies, World Journal of Clinical Oncology, Vol: 12, Pages: 482-499, ISSN: 2218-4333

BACKGROUND: Personalized nutrition and protective diets and lifestyles represent a key cancer research priority. The association between consumption of specific dietary components and colorectal cancer (CRC) incidence has been evaluated by a number of population-based studies, which have identified certain food items as having protective potential, though the findings have been inconsistent. Herein we present a systematic review and meta-analysis on the potential protective role of five common phytochemically rich dietary components (nuts, cruciferous vegetables, citrus fruits, garlic and tomatoes) in reducing CRC risk. AIM: To investigate the independent impact of increased intake of specific dietary constituents on CRC risk in the general population. METHODS: Medline and Embase were systematically searched, from time of database inception to January 31, 2020, for observational studies reporting CRC incidence relative to intake of one or more of nuts, cruciferous vegetables, citrus fruits, garlic and/or tomatoes in the general population. Data were extracted by two independent reviewers and analyzed in accordance with the Meta-analysis of Observational Studies in Epidemiology (MOOSE) and Preferred Reporting Items for Systematic Review and Meta-analysis (PRISMA) reporting guidelines and according to predefined inclusion/exclusion criteria. Effect sizes of studies were pooled using a random-effects model. RESULTS: Forty-six studies were identified. CRC risk was significantly reduced in patients with higher vs lower consumption of cruciferous vegetables [odds ratio (OR) = 0.90; 95% confidence interval (CI): 0.85-0.95; P < 0.005], citrus fruits (OR = 0.90; 95%CI: 0.84-0.96; P < 0.005), garlic (OR = 0.83; 95%CI: 0.76-0.91; P < 0.005) and tomatoes (OR = 0.89; 95%CI: 0.84-0.95; P < 0.005). Subgroup analysis showed that this association sustained when looking at case-control studies alone, for all of these four food items, but no significant difference was found in analys

Journal article

Gonzalez G, Gong S, Laponogov I, Bronstein M, Veselkov K et al., 2021, Predicting anticancer hyperfoods with graph convolutional networks, Human Genomics, Vol: 15, ISSN: 1479-7364

Background: Recent efforts in the field of nutritional science have allowed the discovery of disease-beating molecules within foods based on the commonality of bioactive food molecules to FDA-approved drugs. The pioneering work in this field used an unsupervised network propagation algorithm to learn the systemic-wide effect on the human interactome of 1962 FDA-approved drugs and a supervised algorithm to predict anticancer therapeutics using the learned representations. Then, a set of bioactive molecules within foods was fed into the model, which predicted molecules with cancer-beating potential. The employed methodology consisted of disjoint unsupervised feature generation and classification tasks, which can result in sub-optimal learned drug representations with respect to the classification task. Additionally, due to the disjoint nature of the tasks, the employed approach proved cumbersome to optimize, requiring testing of thousands of hyperparameter combinations and significant computational resources. To overcome the technical limitations highlighted above, we represent each drug as a graph (human interactome) with its targets as binary node features on the graph and formulate the problem as a graph classification task. To solve this task, inspired by the success of graph neural networks in graph classification problems, we use an end-to-end graph neural network model operating directly on the graphs, which learns drug representations to optimize model performance in the prediction of anticancer therapeutics. Results: The proposed model outperforms the baseline approach in the anticancer therapeutic prediction task, achieving an F1 score of 67.99%±2.52% and an AUPR of 73.91%±3.49%. It is also shown that the model is able to capture knowledge of biological pathways to predict anticancer molecules based on the molecules’ effects on cancer-related pathways. Conclusions: We introduce an end-to-end graph convolutional model to predict cancer-beating mo

Journal article

Vasiliou V, Veselkov K, Bruford E, Reichardt JKV et al., 2021, Standardized nomenclature and open science in Human Genomics, Human Genomics, Vol: 15, Pages: 13-13, ISSN: 1479-7364

Journal article

Laponogov I, Gonzalez G, Shepherd M, Qureshi A, Veselkov D, Charkoftaki G, Vasiliou V, Youssef J, Mirnezami R, Bronstein M, Veselkov K et al., 2021, Network machine learning maps phytochemically rich "Hyperfoods" to fight COVID-19, Human Genomics, Vol: 15, Pages: 1-1, ISSN: 1479-7364

In this paper, we introduce a network machine learning method to identify potential bioactive anti-COVID-19 molecules in foods based on their capacity to target the SARS-CoV-2-host gene-gene (protein-protein) interactome. Our analyses were performed using a supercomputing DreamLab App platform, harnessing the idle computational power of thousands of smartphones. Machine learning models were initially calibrated by demonstrating that the proposed method can predict anti-COVID-19 candidates among experimental and clinically approved drugs (5658 in total) targeting COVID-19 interactomics with the balanced classification accuracy of 80-85% in 5-fold cross-validated settings. This identified the most promising drug candidates that can be potentially "repurposed" against COVID-19 including common drugs used to combat cardiovascular and metabolic disorders, such as simvastatin, atorvastatin and metformin. A database of 7694 bioactive food-based molecules was run through the calibrated machine learning algorithm, which identified 52 biologically active molecules, from varied chemical classes, including flavonoids, terpenoids, coumarins and indoles predicted to target SARS-CoV-2-host interactome networks. This in turn was used to construct a "food map" with the theoretical anti-COVID-19 potential of each ingredient estimated based on the diversity and relative levels of candidate compounds with antiviral properties. We expect this in silico predicted food map to play an important role in future clinical studies of precision nutrition interventions against COVID-19 and other viral diseases.

Journal article

Aksenov AA, Laponogov I, Zhang Z, Doran SLF, Belluomo I, Veselkov D, Bittremieux W, Nothias LF, Nothias-Esposito M, Maloney KN, Misra BB, Melnik AV, Smirnov A, Du X, Jones KL, Dorrestein K, Panitchpakdi M, Ernst M, van der Hooft JJJ, Gonzalez M, Carazzone C, Amézquita A, Callewaert C, Morton JT, Quinn RA, Bouslimani A, Orio AA, Petras D, Smania AM, Couvillion SP, Burnet MC, Nicora CD, Zink E, Metz TO, Artaev V, Humston-Fulmer E, Gregor R, Meijler MM, Mizrahi I, Eyal S, Anderson B, Dutton R, Lugan R, Boulch PL, Guitton Y, Prevost S, Poirier A, Dervilly G, Le Bizec B, Fait A, Persi NS, Song C, Gashu K, Coras R, Guma M, Manasson J, Scher JU, Barupal DK, Alseekh S, Fernie AR, Mirnezami R, Vasiliou V, Schmid R, Borisov RS, Kulikova LN, Knight R, Wang M, Hanna GB, Dorrestein PC, Veselkov K et al., 2020, Auto-deconvolution and molecular networking of gas chromatography-mass spectrometry data, Nature Biotechnology, Vol: 39, Pages: 169-173, ISSN: 1087-0156

We engineered a machine learning approach, MSHub, to enable auto-deconvolution of gas chromatography-mass spectrometry (GC-MS) data. We then designed workflows to enable the community to store, process, share, annotate, compare and perform molecular networking of GC-MS data within the Global Natural Product Social (GNPS) Molecular Networking analysis platform. MSHub/GNPS performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization and quantifies the reproducibility of fragmentation patterns across samples.

Journal article

Abbassi-Ghadi N, Antonowicz S, McKenzie J, Kumar S, Huang J, Jones E, Strittmatter N, Petts G, Kudo H, court S, Hoare J, Veselkov K, Goldin R, Takats Z, Hanna G et al., 2020, De novo lipogenesis alters the phospholipidome of esophageal adenocarcinoma, Cancer Research, Vol: 80, Pages: 2764-2774, ISSN: 0008-5472

The incidence of esophageal adenocarcinoma is rising, survival remains poor, and new tools to improve early diagnosis and precise treatment are needed. Cancer phospholipidomes quantified with mass spectrometry imaging can support objective diagnosis in minutes using a routine frozen tissue section. However, whether mass spectrometry imaging can objectively identify primary esophageal adenocarcinoma is currently unknown and represents a significant challenge, as this microenvironment is complex with phenotypically similar tissue-types. Here we used desorption electrospray ionisation mass spectrometry imaging (DESI-MSI) and bespoke chemometrics to assess the phospholipidomes of esophageal adenocarcinoma and relevant control tissues. Multivariable models derived from phospholipid profiles of 117 patients were highly discriminant for esophageal adenocarcinoma both in discovery (area-under-curve = 0.97) and validation cohorts (AUC = 1). Among many other changes, esophageal adenocarcinoma samples were markedly enriched for polyunsaturated phosphatidylglycerols with longer acyl chains, with stepwise enrichment in pre-malignant tissues. Expression of fatty acid and glycerophospholipid synthesis genes was significantly upregulated, and characteristics of fatty acid acyls matched glycerophospholipid acyls. Mechanistically, silencing the carbon switch ACLY in esophageal adenocarcinoma cells shortened GPL chains, linking de novo lipogenesis to the phospholipidome. Thus, DESI-MSI can objectively identify invasive esophageal adenocarcinoma from a number of pre-malignant tissues and unveils mechanisms of phospholipidomic reprogramming. These results call for accelerated diagnosis studies using DESI-MSI in the upper gastrointestinal endoscopy suite as well as functional studies to determine how polyunsaturated phosphatidylglycerols contribute to esophageal carcinogenesis.

Journal article

Varshavi D, Varshavi D, McCarthy N, Veselkov K, Keun HC, Everett JR et al., 2020, Metabolic characterization of colorectal cancer cells harbouring different KRAS mutations in codon 12, 13, 61 and 146 using human SW48 isogenic cell lines, Metabolomics, Vol: 16, Pages: 1-13, ISSN: 1573-3882

Introduction: Kirsten Rat Sarcoma Viral Oncogene Homolog (KRAS) mutations occur in approximately one-third of colorectal (CRC) tumours and have been associated with poor prognosis and resistance to some therapeutics. In addition to the well-documented pro-tumorigenic role of mutant Ras alleles, there is some evidence suggesting that not all KRAS mutations are equal and the position and type of amino acid substitutions regulate biochemical activity and transforming capacity of KRAS mutations. Objectives: To investigate the metabolic signatures associated with different KRAS mutations in codons 12, 13, 61 and 146 and to determine what metabolic pathways are affected by different KRAS mutations. Methods: We applied an NMR-based metabonomics approach to compare the metabolic profiles of the intracellular extracts and the extracellular media from isogenic human SW48 CRC cell lines with different KRAS mutations in codons 12 (G12D, G12A, G12C, G12S, G12R, G12V), 13 (G13D), 61 (Q61H) and 146 (A146T) with their wild-type counterpart. We used false discovery rate (FDR)-corrected analysis of variance (ANOVA) to determine metabolites that were statistically significantly different in concentration between the different mutants. Results: CRC cells carrying distinct KRAS mutations exhibited differential metabolic remodelling, including differences in glycolysis, glutamine utilization and in amino acid, nucleotide and hexosamine metabolism. Conclusions: Metabolic differences among different KRAS mutations might play a role in their different responses to anticancer treatments and hence could be exploited as novel metabolic vulnerabilities to develop more effective therapies against oncogenic KRAS.

Journal article

Giallourou N, Fardus-Reid F, Panic G, Veselkov K, McCormick BJJ, Olortegui MP, Ahmed T, Mduma E, Yori PP, Mahfuz M, Svensen E, Ahmed MMM, Colston JM, Kosek MN, Swann JR et al., 2020, Metabolic maturation in the first 2 years of life in resource-constrained settings and its association with postnatal growth, Science Advances, Vol: 6, Pages: 1-10, ISSN: 2375-2548

Malnutrition continues to affect the growth and development of millions of children worldwide, and chronic undernutrition has proven to be largely refractory to interventions. Improved understanding of metabolic development in infancy and how it differs in growth-constrained children may provide insights to inform more timely, targeted, and effective interventions. Here, the metabolome of healthy infants was compared to that of growth-constrained infants from three continents over the first 2 years of life to identify metabolic signatures of aging. Predictive models demonstrated that growth-constrained children lag in their metabolic maturity relative to their healthier peers and that metabolic maturity can predict growth 6 months into the future. Our results provide a metabolic framework from which future nutritional programs may be more precisely constructed and evaluated.

Journal article

Gonzalez G, Gong S, Laponogov I, Veselkov K, Bronstein M et al., 2020, Graph attentional autoencoder for anticancer hyperfood prediction, Publisher: arXiv

Recent research efforts have shown the possibility to discover anticancer drug-like molecules in food from their effect on protein-protein interaction networks, opening a potential pathway to disease-beating diet design. We formulate this task as a graph classification problem on which graph neural networks (GNNs) have achieved state-of-the-art results. However, GNNs are difficult to train on sparse low-dimensional features according to our empirical evidence. Here, we present graph augmented features, integrating graph structural information and raw node attributes with varying ratios, to ease the training of networks. We further introduce a novel neural network architecture on graphs, the Graph Attentional Autoencoder (GAA) to predict food compounds with anticancer properties based on perturbed protein networks. We demonstrate that the method outperforms the baseline approach and state-of-the-art graph classification models in this task.

Working paper

Lowe ME, Andersen DK, Caprioli RM, Choudhary J, Cruz-Monserrate Z, Dasyam AK, Forsmark CE, Gorelick FS, Gray JW, Haupt M, Kelly KA, Olive KP, Plevritis SK, Rappaport N, Roth HR, Steen H, Swamidass SJ, Tirkes T, Uc A, Veselkov K, Whitcomb DC, Habtezion A et al., 2019, Precision medicine in pancreatic disease-knowledge gaps and research opportunities: Summary of a National Institute of Diabetes and Digestive and Kidney Diseases workshop, Pancreas, Vol: 48, Pages: 1250-1258, ISSN: 0885-3177

A workshop on research gaps and opportunities for Precision Medicine in Pancreatic Disease was sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases on July 24, 2019, in Pittsburgh. The workshop included an overview lecture on precision medicine in cancer and 4 sessions: (1) general considerations for the application of bioinformatics and artificial intelligence; (2) omics, the combination of risk factors and biomarkers; (3) precision imaging; and (4) gaps, barriers, and needs to move from precision to personalized medicine for pancreatic disease. Current precision medicine approaches and tools were reviewed, and participants identified knowledge gaps and research needs that hinder bringing precision medicine to pancreatic diseases. Most critical were (a) multicenter efforts to collect large-scale patient data sets from multiple data streams in the context of environmental and social factors; (b) new information systems that can collect, annotate, and quantify data to inform disease mechanisms; (c) novel prospective clinical trial designs to test and improve therapies; and (d) a framework for measuring and assessing the value of proposed approaches to the health care system. With these advances, precision medicine can identify patients early in the course of their pancreatic disease and prevent progression to chronic or fatal illness.

Journal article

Frasca F, Galeano D, Gonzalez G, Laponogov I, Veselkov K, Paccanaro A, Bronstein MM et al., 2019, Learning interpretable disease self-representations for drug repositioning, Publisher: arXiv

Drug repositioning is an attractive cost-efficient strategy for the development of treatments for human diseases. Here, we propose an interpretable model that learns disease self-representations for drug repositioning. Our self-representation model represents each disease as a linear combination of a few other diseases. We enforce proximity in the learnt representations in a way to preserve the geometric structure of the human phenome network - a domain-specific knowledge that naturally adds relational inductive bias to the disease self-representations. We prove that our method is globally optimal and show results outperforming state-of-the-art drug repositioning approaches. We further show that the disease self-representations are biologically interpretable.

Working paper

Everett JR, Holmes E, Veselkov KA, Lindon JC, Nicholson JK et al., 2019, A unified conceptual framework for metabolic phenotyping in diagnosis and prognosis, Trends in Pharmacological Sciences, Vol: 40, Pages: 763-773, ISSN: 0165-6147

Understanding metabotype (multicomponent metabolic characteristics) variation can help to generate new diagnostic and prognostic biomarkers, as well as models, with potential to impact on patient management. We present a suite of conceptual approaches for the generation, analysis, and understanding of metabotypes from body fluids and tissues. We describe and exemplify four fundamental approaches to the generation and utilization of metabotype data via multiparametric measurement of (i) metabolite levels, (ii) metabolic trajectories, (iii) metabolic entropies, and (iv) metabolic networks and correlations in space and time. This conceptual framework can underpin metabotyping in the scenario of personalized medicine, with the aim of improving clinical outcomes for patients, but the framework will have value and utility in areas of metabolic profiling well beyond this exemplar.

Journal article

Veselkov K, Gonzalez Pigorini G, Aljifri S, Galea D, Mirnezami R, Youssef J, Bronstein M, Laponogov I et al., 2019, HyperFoods: Machine intelligent mapping of cancer-beating molecules in foods, Scientific Reports, Vol: 9, ISSN: 2045-2322

Recent data indicate that up to 30–40% of cancers can be prevented by dietary and lifestyle measures alone. Herein, we introduce a unique network-based machine learning platform to identify putative food-based cancer-beating molecules. These have been identified through their molecular biological network commonality with clinically approved anti-cancer therapies. A machine-learning algorithm of random walks on graphs (operating within the supercomputing DreamLab platform) was used to simulate drug actions on human interactome networks to obtain genome-wide activity profiles of 1962 approved drugs (199 of which were classified as “anti-cancer” with their primary indications). A supervised approach was employed to predict cancer-beating molecules using these ‘learned’ interactome activity profiles. The validated model performance predicted anti-cancer therapeutics with classification accuracy of 84–90%. A comprehensive database of 7962 bioactive molecules within foods was fed into the model, which predicted 110 cancer-beating molecules (defined by anti-cancer drug likeness threshold of >70%) with expected capacity comparable to clinically approved anti-cancer drugs from a variety of chemical classes including flavonoids, terpenoids, and polyphenols. This in turn was used to construct a ‘food map’ with anti-cancer potential of each ingredient defined by the number of cancer-beating molecules found therein. Our analysis underpins the design of next-generation cancer preventative and therapeutic nutrition strategies.

Journal article

Poynter L, Mirnezami R, Galea D, Veselkov K, Nicholson J, Takats Z, Darzi A, Kinross J, Mirnezami A et al., 2019, Network mapping of molecular biomarkers influencing radiation response in rectal cancer, Clinical Colorectal Cancer, Vol: 18, Pages: e210-e222, ISSN: 1533-0028

Introduction: Pre-operative radiotherapy (RT) has an important role in the management of locally advanced rectal cancer (RC). Tumour regression following RT shows marked variability and robust molecular methods are needed with which to predict likely response. The aim of this study was to review the current published literature and employ Gene Ontology (GO) analysis to define key molecular biomarkers governing radiation response in RC. Methods: A systematic review of electronic bibliographic databases (MEDLINE, Embase) was performed for original articles published between 2000 and 2015. Biomarkers were then classified according to biological function and incorporated into a hierarchical GO tree. Both significant and non-significant results were included in the analysis. Significance was binarized based on uni- and multivariate statistics. Significance scores were calculated for each biological domain (or node), and a directed acyclic graph was generated for intuitive mapping of biological pathways and markers involved in rectal cancer radiation response. Results: 72 individual biomarkers, across 74 studies, were identified through review. On highest order classification, molecular biomarkers falling within the domains of response to stress, cellular metabolism and pathways inhibiting apoptosis were found to be the most influential in predicting radiosensitivity. Conclusions: Homogenising biomarker data from original articles using controlled GO terminology demonstrates that cellular mechanisms of response to radiotherapy in RC - in particular the metabolic response to radiotherapy - may hold promise in developing radiotherapeutic biomarkers with which to predict, and in the future modulate, radiation response.

Journal article

Koumpa FS, Xylas D, Konopka M, Galea D, Veselkov K, Antoniou A, Mehta A, Mirnezami R et al., 2019, Colorectal peritoneal metastases: a systematic review of current and emerging trends in clinical and translational research, Gastroenterology Research and Practice, Vol: 2019, Pages: 1-30, ISSN: 1687-6121

Colorectal peritoneal metastases (CPM) are associated with abbreviated survival and significantly impaired quality of life. In patients with CPM, radical multimodality treatment consisting of cytoreductive surgery (CRS) combined with hyperthermic intraperitoneal chemotherapy (HIPEC) has demonstrated oncological superiority over systemic chemotherapy alone. In highly selected patients undergoing CRS + HIPEC, overall survival of over 60% has been reported in some series. These are patients in whom the disease burden is limited and where the diagnosis is made at an early stage in the disease course. Early diagnosis and a deeper understanding of the biological mechanisms that regulate CPM are critical to refining patient selection for radical treatment, personalising therapeutic approaches, enhancing prognostication, and ultimately improving long-term survivorship. In the present study, we outline three broad themes which represent critical future research targets in CPM: (1) enhanced radiological strategies for early detection and staging; (2) identification and validation of translational biomarkers for diagnostic, prognostic, and therapeutic deployment; and (3) development of optimized approaches for surgical cytoreduction as well as more precise strategies for intraperitoneal drug selection and delivery. Herein, we provide a contemporary narrative review of the state of the art in these three areas. A systematic review in accordance with PRISMA guidelines was undertaken on all English language studies published between 2007 and 2017. In vitro and animal model studies were deemed eligible for inclusion in the sections pertaining to biomarkers and therapeutic optimisation, as these areas of research currently remain in the early stages of development. Acquired data were then divided into hierarchical thematic categories (imaging modalities, translational biomarkers (diagnostic/prognostic/therapeutic), and delivery techniques) and subcategories. An interactive sunburst

Journal article

Veselkov K, Schuller B, 2018, The age of data analytics: converting biomedical data into actionable insights, Methods, Vol: 151, Pages: 1-2

Journal article

Gu Q, Veselkov K, 2018, Bi-clustering of metabolic data using matrix factorization tools, Methods, Vol: 151, Pages: 12-20, ISSN: 1046-2023

Metabolic phenotyping technologies based on Nuclear Magnetic Resonance (NMR) spectroscopy and Mass Spectrometry (MS) generate vast amounts of unrefined data from biological samples. Clustering strategies are frequently employed to provide insight into patterns of relationships between samples and metabolites. Here, we propose the use of a non-negative matrix factorization driven bi-clustering strategy for metabolic phenotyping data in order to discover subsets of interrelated metabolites that exhibit similar behaviour across samples. The proposed strategy incorporates bi-cross validation and statistical segmentation techniques to automatically determine the number and structure of bi-clusters. This alternative approach is in contrast to the widely used conventional clustering approaches that incorporate all molecular peaks for clustering in metabolic studies and require a priori specification of the number of clusters. We perform the comparative analysis of the proposed strategy with other bi-clustering approaches, which were developed in the context of genomics and transcriptomics research. We demonstrate the superior performance of the proposed bi-clustering strategy on both simulated (NMR) and real (MS) bacterial metabolic data.

Journal article

Galea D, Laponogov I, Veselkov K, 2018, Exploiting and assessing multi-source data for supervised biomedical named entity recognition, Bioinformatics, Vol: 34, Pages: 2472-2482, ISSN: 1367-4803

Motivation: Recognition of biomedical entities from scientific text is a critical component of natural language processing and automated information extraction platforms. Modern named entity recognition approaches rely heavily on supervised machine learning techniques, which are critically dependent on annotated training corpora. These approaches have been shown to perform well when trained and tested on the same source. However, in such scenario, the performance and evaluation of these models may be optimistic, as such models may not necessarily generalize to independent corpora, resulting in potential non-optimal entity recognition for large-scale tagging of widely diverse articles in databases such as PubMed. Results: Here we aggregated published corpora for the recognition of biomolecular entities (such as genes, RNA, proteins, variants, drugs, and metabolites), identified entity class overlap and performed leave-corpus-out cross validation strategy to test the efficiency of existing models. We demonstrate that accuracies of models trained on individual corpora decrease substantially for recognition of the same biomolecular entity classes in independent corpora. This behavior is possibly due to limited generalizability of entity-class-related features captured by individual corpora (model “overtraining”) which we investigated further at the orthographic level, as well as potential annotation standard differences. We show that the combined use of multi-source training corpora results in overall more generalizable models for named entity recognition, while achieving comparable individual performance. By performing learning-curve-based power analysis we further identified that performance is often not limited by the quantity of the annotated data.

Journal article

Galea D, Laponogov I, Veselkov KA, 2018, Sub-word information in pre-trained biomedical word representations: evaluation and hyper-parameter optimization, Annual Computational Linguistics

Conference paper

Laponogov I, Sadawi N, Galea D, Mirnezami R, Veselkov K et al., 2018, ChemDistiller: an engine for metabolite annotation in mass spectrometry, Bioinformatics, Vol: 34, Pages: 2096-2102, ISSN: 1367-4803

Motivation: High-resolution mass spectrometry permits simultaneous detection of thousands of different metabolites in biological samples; however, their automated annotation still presents a challenge due to the limited number of tailored computational solutions freely available to the scientific community. Results: Here, we introduce ChemDistiller, a customizable engine that combines automated large-scale annotation of metabolites using tandem MS data with a compiled database containing tens of millions of compounds with pre-calculated ‘fingerprints’ and fragmentation patterns. Our tests using publicly and commercially available tandem MS spectra for reference compounds show retrieval rates comparable to or exceeding the ones obtainable by the current state-of-the-art solutions in the field while offering higher throughput, scalability and processing speed.

Journal article

Varshavi D, Scott FH, Varshavi D, Veeravalli S, Phillips IR, Veselkov K, Strittmatter N, Takats Z, Shephard EA, Everett JR et al., 2018, Metabolic biomarkers of ageing in C57BL/6J wild-type and flavin-containing monooxygenase 5 (FMO5)-knockout mice, Frontiers in Molecular Biosciences, Vol: 5, ISSN: 2296-889X

It was recently demonstrated in mice that knockout of the flavin-containing monooxygenase 5 gene, Fmo5, slows metabolic ageing via pleiotropic effects. We have now used an NMR-based metabonomics approach to study the effects of ageing directly on the metabolic profiles of urine and plasma from male, wild-type C57BL/6J and Fmo5-/- (FMO5 KO) mice back-crossed onto the C57BL/6J background. The aim of this study was to identify metabolic signatures that are associated with ageing in both these mouse lines and to characterize the age-related differences in the metabolite profiles between the FMO5 KO mice and their wild-type counterparts at equivalent time points. We identified a range of age-related biomarkers in both urine and plasma. Some metabolites, including urinary 6-hydroxy-6-methylheptan-3-one (6H6MH3O), a mouse sex pheromone, showed similar patterns of changes with age, regardless of genetic background. Others, however, were altered only in the FMO5 KO, or only in the wild-type mice, indicating the impact of genetic modifications on mouse ageing. Elevated concentrations of urinary taurine represent a distinctive, ageing-related change observed only in wild-type mice.

Journal article

Veselkov KA, Sleeman J, Claude E, Vissers J, Galea D, Mroz A, Laponogov I, Towers M, Tonge R, Mirnezami R, Takats Z, Nicholson J, Langridge J et al., 2018, BASIS: High-performance bioinformatics platform for processing of large-scale mass spectrometry imaging data in chemically augmented histology, Scientific Reports, Vol: 8, ISSN: 2045-2322

Mass Spectrometry Imaging (MSI) holds significant promise in augmenting digital histopathologic analysis by generating highly robust big data about the metabolic, lipidomic and proteomic molecular content of the samples. In the process, a vast quantity of unrefined data, that can amount to several hundred gigabytes per tissue section, is produced. Managing, analysing and interpreting this data is a significant challenge and represents a major barrier to the translational application of MSI. Existing data analysis solutions for MSI rely on a set of heterogeneous bioinformatics packages that are not scalable for the reproducible processing of large-scale (hundreds to thousands) biological sample sets. Here, we present a computational platform (pyBASIS) capable of optimized and scalable processing of MSI data for improved information recovery and comparative analysis across tissue specimens using machine learning and related pattern recognition approaches. The proposed solution also provides a means of seamlessly integrating experimental laboratory data with downstream bioinformatics interpretation/analyses, resulting in a truly integrated system for translational MSI.

Journal article

Charkoftaki G, Rattray NJW, Andrén PE, Caprioli RM, Castellino S, Duncan MW, Goodwin RJA, Schey KL, Shahidi-Latham SK, Veselkov KA, Johnson CH, Vasiliou V et al., 2018, Yale School of Public Health Symposium on tissue imaging mass spectrometry: illuminating phenotypic heterogeneity and drug disposition at the molecular level, Human Genomics, Vol: 12, Pages: 10-10, ISSN: 1479-7364

Journal article

Bhome R, Goh RW, Bullock MD, Pillar N, Thirdborough SM, Mellone M, Mirnezami R, Galea D, Veselkov K, Gu Q, Underwood TJ, Primrose JN, De Wever O, Shomron N, Sayan AE, Mirnezami AH et al., 2017, Exosomal microRNAs derived from colorectal cancer-associated fibroblasts: role in driving cancer progression, Aging-US, Vol: 9, Pages: 2666-2694, ISSN: 1945-4589

Colorectal cancer is a global disease with increasing incidence. Mortality is largely attributed to metastatic spread and therefore, a mechanistic dissection of the signals which influence tumor progression is needed. Cancer stroma plays a critical role in tumor proliferation, invasion and chemoresistance. Here, we sought to identify and characterize exosomal microRNAs as mediators of stromal-tumor signaling. In vitro, we demonstrated that fibroblast exosomes are transferred to colorectal cancer cells, with a resultant increase in cellular microRNA levels, impacting proliferation and chemoresistance. To probe this further, exosomal microRNAs were profiled from paired patient-derived normal and cancer-associated fibroblasts, from an ongoing prospective biomarker study. An exosomal cancer-associated fibroblast signature consisting of microRNAs 329, 181a, 199b, 382, 215 and 21 was identified. Of these, miR-21 had highest abundance and was enriched in exosomes. Orthotopic xenografts established with miR-21-overexpressing fibroblasts and CRC cells led to increased liver metastases compared to those established with control fibroblasts. Our data provide a novel stromal exosome signature in colorectal cancer, which has potential for biomarker validation. Furthermore, we confirmed the importance of stromal miR-21 in colorectal cancer progression using an orthotopic model, and propose that exosomes are a vehicle for miR-21 transfer between stromal fibroblasts and cancer cells.

Journal article

Galea D, Inglese P, Cammack L, Strittmatter N, Rebec M, Mirnezami R, Laponogov I, Kinross J, Nicholson J, Takats Z, Veselkov KA et al., 2017, Translational utility of a hierarchical classification strategy in biomolecular data analytics, Scientific Reports, Vol: 7, ISSN: 2045-2322

Hierarchical classification (HC) stratifies and classifies data from broad classes into more specific classes. Unlike commonly used data classification strategies, this enables the probabilistic prediction of unknown classes at different levels, minimizing the burden of incomplete databases. Despite these advantages, its translational application in biomedical sciences has been limited. We describe and demonstrate the implementation of an HC approach for "omics-driven" classification of 15 bacterial species at various taxonomic levels achieving 90-100% accuracy, and 9 cancer types into morphological types and 35 subtypes with 99% and 76% accuracy, respectively. Unknown bacterial species were probabilistically assigned with 100% accuracy to their respective genus or family using mass spectra (n = 284). Cancer types were predicted by mRNA data (n = 1960) for most subtypes with 95-100% accuracy. This has high relevance in clinical practice where complete datasets are difficult to compile with the continuous evolution of diseases and emergence of new strains, yet prediction of unknown classes, such as bacterial species, at upper hierarchy levels may be sufficient to initiate antimicrobial therapy. The algorithms presented here can be directly translated into clinical use with any quantitative data, and have broad application potential, from unlabeled sample identification, to hierarchical feature selection, and discovery of new taxonomic variants.

Journal article

Kinross J, Mirnezami R, Alexander J, Brown R, Scott A, Galea D, Veselkov K, Goldin R, Darzi A, Nicholson J, Marchesi JR et al., 2017, A prospective analysis of mucosal microbiome-metabonome interactions in colorectal cancer using a combined MAS 1HNMR and metataxonomic strategy, Scientific Reports, Vol: 7, ISSN: 2045-2322

Colon cancer induces a state of mucosal dysbiosis with associated niche specific changes in the gut microbiota. However, the key metabolic functions of these bacteria remain unclear. We performed a prospective observational study in patients undergoing elective surgery for colon cancer without mechanical bowel preparation (n = 18). Using 16S rRNA gene sequencing we demonstrated that microbiota ecology appears to be cancer stage-specific and strongly associated with histological features of poor prognosis. Fusobacteria (p < 0.007) and ε-Proteobacteria (p < 0.01) were enriched on tumour when compared to adjacent normal mucosal tissue, and fusobacteria and β-Proteobacteria levels increased with advancing cancer stage (p = 0.014 and 0.002 respectively). Metabonomic analysis using 1H Magic Angle Spinning Nuclear Magnetic Resonance (MAS-NMR) spectroscopy demonstrated increased abundance of taurine, isoglutamine, choline, lactate, phenylalanine and tyrosine and decreased levels of lipids and triglycerides in tumour relative to adjacent healthy tissue. Network analysis revealed that bacteria associated with poor prognostic features were not responsible for the modification of the cancer mucosal metabonome. Thus the colon cancer mucosal microbiome evolves with cancer stage to meet the demands of cancer metabolism. Passenger microbiota may play a role in the maintenance of cancer mucosal metabolic homeostasis but these metabolic functions may not be stage specific.

Journal article

Tillner J, Wu V, Jones EA, Pringle SD, Karancsi T, Dannhorn A, Veselkov K, McKenzie JS, Takats Z et al., 2017, Faster, more reproducible DESI-MS for biological tissue imaging, Journal of The American Society for Mass Spectrometry, Vol: 28, Pages: 2090-2098, ISSN: 1044-0305

A new, more robust sprayer for desorption electrospray ionization (DESI) mass spectrometry imaging is presented. The main source of variability in DESI is thought to be the uncontrolled variability of various geometric parameters of the sprayer, primarily the position of the solvent capillary, or more specifically, its positioning within the gas capillary or nozzle. If the solvent capillary is off-center, the sprayer becomes asymmetrical, making the geometry difficult to control and compromising reproducibility. If the stiffness, tip quality, and positioning of the capillary are improved, sprayer reproducibility can be improved by an order of magnitude. The quality of the improved sprayer and its potential for high spatial resolution imaging are demonstrated on human colorectal tissue samples by acquisition of images at pixel sizes of 100, 50, and 20 μm, which corresponds to a lateral resolution of 40-60 μm, similar to the best values published in the literature. The high sensitivity of the sprayer also allows combination with a fast scanning quadrupole time-of-flight mass spectrometer. This provides up to 30 times faster DESI acquisition, reducing the overall acquisition time for a 10 mm × 10 mm rat brain sample to approximately 1 h. Although some spectral information is lost with increasing analysis speed, the resulting data can still be used to classify tissue types on the basis of a previously constructed model. This is particularly interesting for clinical applications, where fast, reliable diagnosis is required.

Journal article

Poynter LR, Veselkov K, Galea D, Kinross J, Mirnezami A, Nicholson J, Takats Z, Mirnezami R, Darzi A et al., 2017, Network-driven analytics of published tissue-based biomarkers to predict response to neoadjuvant therapy in rectal cancer, Annual Meeting of the American-Association-for-Cancer-Research (AACR), Publisher: AMER ASSOC CANCER RESEARCH, ISSN: 0008-5472

Conference paper

Antcliffe D, Jimenez B, Veselkov K, Holmes E, Gordon AC et al., 2017, Metabolic profiling in patients with pneumonia on intensive care, EBioMedicine, Vol: 18, Pages: 244-253, ISSN: 2352-3964

Clinical features and investigations lack predictive value when diagnosing pneumonia, especially when patients are ventilated and when patients develop ventilator associated pneumonia (VAP). New tools to aid diagnosis are important to improve outcomes. This pilot study examines the potential for metabolic profiling to aid the diagnosis in critical care. In this prospective observational study ventilated patients with brain injuries or pneumonia were recruited in the intensive care unit and serum samples were collected soon after the start of ventilation. Metabolic profiles were produced using 1D 1H NMR spectra. Metabolic data were compared using multivariate statistical techniques including Principal Component Analysis (PCA) and Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA). We recruited 15 patients with pneumonia and 26 with brain injuries, seven of whom went on to develop VAP. Comparison of metabolic profiles using OPLS-DA differentiated those with pneumonia from those with brain injuries (R2Y = 0.91, Q2Y = 0.28, p = 0.02) and those with VAP from those without (R2Y = 0.94, Q2Y = 0.27, p = 0.05). Metabolites that differentiated patients with pneumonia included lipid species, amino acids and glycoproteins. Metabolic profiling shows promise to aid in the diagnosis of pneumonia in ventilated patients and may allow a more timely diagnosis and better use of antibiotics.

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
