Imperial College London

Dr Marina Evangelou

Faculty of Natural Sciences, Department of Mathematics

Senior Lecturer in Statistics
 
 
 

Contact

 

+44 (0)20 7594 7184 m.evangelou

 
 

Location

 

546, Huxley Building, South Kensington Campus


Summary

 

Publications

Citation

BibTeX format

@article{Rodosthenous:2020:bioinformatics/btaa530,
author = {Rodosthenous, T and Shahrezaei, V and Evangelou, M},
doi = {10.1093/bioinformatics/btaa530},
journal = {Bioinformatics},
pages = {4616--4625},
title = {Integrating multi-OMICS data through sparse Canonical Correlation Analysis for the prediction of complex traits: A comparison study},
url = {http://dx.doi.org/10.1093/bioinformatics/btaa530},
volume = {36},
year = {2020}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB - Motivation: Recent developments in technology have enabled researchers to collect multiple OMICS datasets for the same individuals. The conventional approach for understanding the relationships between the collected datasets and the complex trait of interest would be through the analysis of each OMIC dataset separately from the rest, or to test for associations between the OMICS datasets. In this work we show that integrating multiple OMICS datasets together, instead of analysing them separately, improves our understanding of their in-between relationships as well as the predictive accuracy for the tested trait. Several approaches have been proposed for the integration of heterogeneous and high-dimensional (p ≫ n) data, such as OMICS. The sparse variant of Canonical Correlation Analysis (CCA) approach is a promising one that seeks to penalise the canonical variables for producing sparse latent variables while achieving maximal correlation between the datasets. Over the last years, a number of approaches for implementing sparse CCA (sCCA) have been proposed, where they differ on their objective functions, iterative algorithm for obtaining the sparse latent variables and make different assumptions about the original datasets. Results: Through a comparative study we have explored the performance of the conventional CCA proposed by Parkhomenko et al. (2009), penalised matrix decomposition CCA proposed by Witten and Tibshirani (2009) and its extension proposed by Suo et al. (2017). The aforementioned methods were modified to allow for different penalty functions. Although sCCA is an unsupervised learning approach for understanding of the in-between relationships, we have twisted the problem as a supervised learning one and investigated how the computed latent variables can be used for predicting complex traits. The approaches were extended to allow for multiple (more than two) datasets where the trait was included as one of the input datasets. 
Both ways have shown improvement
AU - Rodosthenous,T
AU - Shahrezaei,V
AU - Evangelou,M
DO - 10.1093/bioinformatics/btaa530
EP - 4625
PY - 2020///
SN - 1367-4803
SP - 4616
TI - Integrating multi-OMICS data through sparse Canonical Correlation Analysis for the prediction of complex traits: A comparison study
T2 - Bioinformatics
UR - http://dx.doi.org/10.1093/bioinformatics/btaa530
UR - https://academic.oup.com/bioinformatics/article/doi/10.1093/bioinformatics/btaa530/5841662
UR - http://hdl.handle.net/10044/1/80242
VL - 36
ER -