Imperial College London

Dr Marina Evangelou

Faculty of Natural Sciences, Department of Mathematics

Senior Lecturer in Statistics
 
 
 

Contact

 

+44 (0)20 7594 7184

m.evangelou

 
 

Location

 

546 Huxley Building, South Kensington Campus



 

Publications


43 results found

Komodromos M, Aboagye EO, Evangelou M, Filippi S, Ray K et al., 2022, Variational Bayes for high-dimensional proportional hazards models with applications within gene expression, Bioinformatics, ISSN: 1367-4803

Motivation: Few Bayesian methods for analyzing high-dimensional sparse survival data provide scalable variable selection, effect estimation and uncertainty quantification. Such methods often either sacrifice uncertainty quantification by computing maximum a posteriori estimates, or quantify the uncertainty at high (unscalable) computational expense. Results: We bridge this gap and develop an interpretable and scalable Bayesian proportional hazards model for prediction and variable selection, referred to as SVB. Our method, based on a mean-field variational approximation, overcomes the high computational cost of MCMC whilst retaining useful features, providing a posterior distribution for the parameters and offering a natural mechanism for variable selection via posterior inclusion probabilities. The performance of our proposed method is assessed via extensive simulations and compared against other state-of-the-art Bayesian variable selection methods, demonstrating comparable or better performance. Finally, we demonstrate how the proposed method can be used for variable selection on two transcriptomic datasets with censored survival outcomes, and how the uncertainty quantification offered by our method can be used to provide an interpretable assessment of patient risk. Availability and implementation: Our method has been implemented as a freely available R package survival.svb (https://github.com/mkomod/survival.svb).

Journal article

Komodromos M, Aboagye EO, Evangelou M, Filippi S, Ray K et al., 2022, Variational Bayes for high-dimensional proportional hazards models with applications within gene expression, Bioinformatics, Vol: 38, Pages: 3918-3926, ISSN: 1367-4803

Journal article

Evangelou M, Rodosthenous T, Shahrezaei V, 2021, Semi-Supervised Classification and Visualization of Multi-View Data, JSM 2021 - Section on Statistical Learning and Data Science

Journal article

Rodosthenous T, Shahrezaei V, Evangelou M, 2021, S-multi-SNE: Semi-supervised classification and visualisation of multi-view data, Publisher: arXiv

An increasing number of multi-view data are being published by studies in several fields. This type of data corresponds to multiple data-views, each representing a different aspect of the same set of samples. We have recently proposed multi-SNE, an extension of t-SNE, that produces a single visualisation of multi-view data. The multi-SNE approach provides low-dimensional embeddings of the samples, produced by being updated iteratively through the different data-views. Here, we further extend multi-SNE to a semi-supervised approach, that classifies unlabelled samples by regarding the labelling information as an extra data-view. We look deeper into the performance, limitations and strengths of multi-SNE and its extension, S-multi-SNE, by applying the two methods on various multi-view datasets with different challenges. We show that by including the labelling information, the projection of the samples improves drastically and it is accompanied by a strong classification performance.

Working paper

Rodosthenous T, Shahrezaei V, Evangelou M, 2021, S-multi-SNE: Semi-supervised classification and visualisation of multi-view data

An increasing number of multi-view data are being published by studies in several fields. This type of data corresponds to multiple data-views, each representing a different aspect of the same set of samples. We have recently proposed multi-SNE, an extension of t-SNE, that produces a single visualisation of multi-view data. The multi-SNE approach provides low-dimensional embeddings of the samples, produced by being updated iteratively through the different data-views. Here, we further extend multi-SNE to a semi-supervised approach, that classifies unlabelled samples by regarding the labelling information as an extra data-view. We look deeper into the performance, limitations and strengths of multi-SNE and its extension, S-multi-SNE, by applying the two methods on various multi-view datasets with different challenges. We show that by including the labelling information, the projection of the samples improves drastically and it is accompanied by a strong classification performance.

Working paper

van Vliet NA, Bos MM, Thesing CS, Chaker L, Pietzner M, Houtman E, Neville MJ, Li-Gao R, Trompet S, Mustafa R, Ahmadizar F, Beekman M, Bot M, Budde K, Christodoulides C, Dehghan A, Delles C, Elliott P, Evangelou M, Gao H, Ghanbari M, van Herwaarden AE, Ikram MA, Jaeger M, Jukema JW, Karaman I, Karpe F, Kloppenburg M, Meessen JMTA, Meulenbelt I, Milaneschi Y, Mooijaart SP, Mook-Kanamori DO, Netea MG, Netea-Maier RT, Peeters RP, Penninx BWJH, Sattar N, Slagboom PE, Suchiman HED, Volzke H, Willems van Dijk K, Noordam R, van Heemst D et al., 2021, Higher thyrotropin leads to unfavorable lipid profile and somewhat higher cardiovascular disease risk: evidence from multi-cohort Mendelian randomization and metabolomic profiling, BMC Medicine, Vol: 19, Pages: 1-13, ISSN: 1741-7015

Background: Observational studies suggest interconnections between thyroid status, metabolism, and risk of coronary artery disease (CAD), but causality remains to be proven. The present study aimed to investigate the potential causal relationship between thyroid status and cardiovascular disease and to characterize the metabolomic profile associated with thyroid status. Methods: Multi-cohort two-sample Mendelian randomization (MR) was performed utilizing genome-wide significant variants as instruments for standardized thyrotropin (TSH) and free thyroxine (fT4) within the reference range. Associations between TSH and fT4 and metabolic profile were investigated in a two-stage manner: associations between TSH and fT4 and the full panel of 161 metabolomic markers were first assessed hypothesis-free, then directional consistency was assessed through Mendelian randomization, another metabolic profile platform, and in individuals with biochemically defined thyroid dysfunction. Results: Circulating TSH was associated with 52/161 metabolomic markers, and fT4 levels were associated with 21/161 metabolomic markers among 9432 euthyroid individuals (median age varied from 23.0 to 75.4 years, 54.5% women). Positive associations between circulating TSH levels and concentrations of very low-density lipoprotein subclasses and components, triglycerides, and triglyceride content of lipoproteins were directionally consistent across the multivariable regression, MR, metabolomic platforms, and for individuals with hypo- and hyperthyroidism. Associations with fT4 levels inversely reflected those observed with TSH. Among 91,810 CAD cases and 656,091 controls of European ancestry, per 1-SD increase of genetically determined TSH concentration risk of CAD increased slightly, but not significantly, with an OR of 1.03 (95% CI 0.99–1.07; p value 0.16), whereas higher genetically determined fT4 levels were not associated with CAD risk (OR 1.00 per SD increase of fT4; 95% CI 0.96–1.04;

Journal article

Mustafa R, Mens MMJ, Huang J, Roshchupkin G, Uitterlinden AG, Ikram MA, Evangelou M, Ghanbari M, Dehghan A et al., 2021, Associations of Circulatory MicroRNAs and Clinical Traits: A Phenome-wide Mendelian Randomization Analysis, Publisher: WILEY, Pages: 777-778, ISSN: 0741-0395

Conference paper

Rodriguez A, 2021, The link between Attention Deficit Hyperactivity Disorder (ADHD) symptoms and obesity-related traits: Genetic and prenatal explanations, Translational Psychiatry, Vol: 11, Pages: 1-8, ISSN: 2158-3188

Attention-deficit/hyperactivity disorder (ADHD) often co-occurs with obesity, however the potential causality between the traits remains unclear. We examined both genetic and prenatal evidence for causality using Mendelian Randomisation (MR) and polygenic risk scores (PRS). We conducted bi-directional MR on ADHD liability and six obesity-related traits using summary statistics from the largest available meta-analyses of genome-wide association studies. We also examined the shared genetic aetiology between ADHD symptoms (inattention and hyperactivity) and body mass index (BMI) by PRS association analysis using longitudinal data from Northern Finland Birth Cohort 1986 (NFBC1986, n = 2984). Lastly, we examined the impact of prenatal environment by association analysis of maternal pre-pregnancy BMI and offspring ADHD symptoms, adjusted for PRS of both traits, in NFBC1986 dataset. Through MR analyses, we found evidence for bidirectional causality between ADHD liability and obesity-related traits. PRS association analyses showed evidence for genetic overlap between ADHD symptoms and BMI. We found no evidence for a difference between inattention and hyperactivity symptoms, suggesting that neither symptom subtype is driving the association. We found evidence for association between maternal pre-pregnancy BMI and offspring ADHD symptoms after adjusting for both BMI and ADHD PRS (association p-value = 0.027 for inattention, p = 0.008 for hyperactivity). These results are consistent with the hypothesis that the co-occurrence between ADHD and obesity has both genetic and prenatal environmental origins.

Journal article

Adams N, Riddle-Workman E, Evangelou M, 2021, Multi-Type relational clustering for enterprise cyber-security networks, Pattern Recognition Letters, Vol: 149, Pages: 172-178, ISSN: 0167-8655

Several cyber-security data sources are collected in enterprise networks providing relational information between different types of nodes in the network, namely computers, users and ports. This relational data can be expressed as adjacency matrices detailing inter-type relationships corresponding to relations between nodes of different types and intra-type relationships showing relationships between nodes of the same type. In this paper, we propose an extension of Non-Negative Matrix Tri-Factorisation (NMTF) to simultaneously cluster nodes based on their intra and inter-type relationships. Existing NMTF based clustering methods suffer from long computational times due to large matrix multiplications. In our approach, we enforce stricter cluster indicator constraints on the factor matrices to circumvent these issues. Additionally, to make our proposed approach less susceptible to variation in results due to random initialisation, we propose a novel initialisation procedure based on Non-Negative Double Singular Value Decomposition for multi-type relational clustering. Finally, a new performance measure suitable for assessing clustering performance on unlabelled multi-type relational data sets is presented. Our algorithm is assessed on both a simulated and real computer network against standard approaches showing its strong performance.

Journal article

Rodosthenous T, Shahrezaei V, Evangelou M, 2021, Semi-supervised classification and visualisation of multi-view data, Joint Statistics Meeting (JSM) 2021, Publisher: American Statistical Association

An increasing number of multi-view data are being published by studies in several fields. This type of data corresponds to multiple data-views, each representing a different aspect of the same set of samples. We have recently proposed multi-SNE, an extension of t-SNE, that produces a single visualisation of multi-view data. The multi-SNE approach provides low-dimensional embeddings of the samples, produced by being updated iteratively through the different data-views. Here, we further extend multi-SNE to a semi-supervised approach, that classifies unlabelled samples by regarding the labelling information as an extra data-view. We look deeper into the performance, limitations and strengths of multi-SNE and its extension, S-multi-SNE, by applying the two methods on various multi-view datasets with different challenges. We show that by including the labelling information, the projection of the samples improves drastically and it is accompanied by a strong classification performance.

Conference paper

Van Vliet NA, Bos MM, Thesing CS, Chaker L, Pietzner M, Houtman E, Neville MJ, Li-Gao R, Trompet S, Mustafa R, Ahmadizar F, Beekman M, Bot M, Budde K, Christodoulides C, Dehghan A, Delles C, Elliott P, Evangelou M, Gao H, Ghanbari M, Van Herwaarden AE, Ikram MA, Jaeger M, Jukema JW, Karaman I, Karpe F, Kloppenburg M, Meessen JMTA, Meulenbelt I, Milaneschi Y, Mooijaart SP, Mook-Kanamori DO, Netea MG, Netea-Maier RT, Peeters RP, Penninx BWJH, Sattar N, Slagboom PE, Suchiman HED, Volzke H, Van Dijk KW, Noordam R et al., 2021, HIGHER THYROID STIMULATING HORMONE LEADS TO CARDIOVASCULAR DISEASE AND AN UNFAVORABLE LIPID PROFILE: EVIDENCE FROM MULTI-COHORT MENDELIAN RANDOMIZATION AND METABOLOMIC PROFILING, Publisher: ELSEVIER IRELAND LTD, Pages: E40-E40, ISSN: 0021-9150

Conference paper

Frainay C, Pitarch Y, Filippi S, Evangelou M, Custovic A et al., 2021, Atopic dermatitis or eczema? Consequences of ambiguity in disease name for biomedical literature mining, Clinical and Experimental Allergy, Vol: 51, Pages: 1185-1194, ISSN: 0954-7894

Background: Biomedical research increasingly relies on computational approaches to extract relevant information from large corpora of publications. Objective: To investigate the consequence of the ambiguity between the use of terms “Eczema” and “Atopic Dermatitis” (AD) from the Information Retrieval perspective, and its impact on meta-analyses, systematic reviews and text mining. Methods: Articles were retrieved by querying PubMed using the terms “eczema” (D003876) and “dermatitis, atopic” (D004485). We used machine learning to investigate the differences between the contexts in which each term is used. We used a decision tree approach and trained a model to predict if an article would be indexed with eczema or AD tags. We used text-mining tools to extract biological entities associated with eczema and AD, and investigated the discrepancy regarding the retrieval of key findings according to the terminology used. Results: The atopic dermatitis query yielded more articles related to veterinary science, biochemistry, cellular and molecular biology; the eczema query linked to public health, infectious disease and respiratory system. Medical Subject Headings terms associated with “AD” or “Eczema” differed, with an agreement between the top 40 lists of 52%. The presence of terms related to cellular mechanisms, especially allergies and inflammation, characterized AD literature. The metabolites mentioned more frequently than expected in articles with AD tag differed from those indexed with eczema. Fewer enriched genes were retrieved when using the eczema query compared to the AD query. Conclusions and Clinical Relevance: There is a considerable discrepancy when using text mining to extract bio-entities related to eczema or AD. Our results suggest that any systematic approach (particularly when looking for metabolites or genes related to the condition) should be performed using both terms jointly. We propose to use decision tree learning

Journal article

Rodosthenous T, Shahrezaei V, Evangelou M, 2021, Multi-view Data Visualisation via Manifold Learning, Publisher: arXiv

Working paper

Mustafa R, Mens M, Pinto R, Karaman I, Roshchupkin G, Huang J, Elliott P, Evangelou M, Dehghan A, Ghanbari M et al., 2020, Identifying metabolomic fingerprints of microRNAs in cardiovascular disorders, Publisher: SPRINGERNATURE, Pages: 277-277, ISSN: 1018-4813

Conference paper

Evangelou M, Adams N, 2020, An anomaly detection framework for cyber-security data, Computers and Security, Vol: 97, Pages: 1-10, ISSN: 0167-4048

Data-driven anomaly detection systems offer unrivalled potential as complementary defence systems to existing signature-based tools as the number of cyber attacks increases. In this manuscript an anomaly detection system is presented that detects any abnormal deviations from the normal behaviour of an individual device. Device behaviour is defined as the number of network traffic events involving the device of interest observed within a pre-specified time period. The behaviour of each device at normal state is modelled to depend on its observed historic behaviour. A number of statistical and machine learning approaches are explored for modelling this relationship and, through a comparative study, the Quantile Regression Forests approach is found to have the best predictive power. Based on the prediction intervals of the Quantile Regression Forests, an anomaly detection system is proposed that characterises as abnormal any observed behaviour outside of these intervals. A series of experiments for contaminating normal device behaviour are presented for examining the performance of the anomaly detection system. Through the conducted analysis the proposed anomaly detection system is found to outperform two other detection systems. The presented work has been conducted on two enterprise networks.

Journal article
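
The detection rule described in the Evangelou and Adams (2020) abstract above, which characterises as abnormal any observed behaviour outside a prediction interval, can be illustrated with a minimal Python sketch. This is not the authors' implementation: empirical quantiles of a device's historic event counts stand in for the Quantile Regression Forest prediction intervals, and the function names, the 2.5% and 97.5% levels and the example counts are all assumptions made for illustration.

```python
def empirical_quantile(sorted_values, q):
    """Linearly interpolated empirical quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def flag_anomalies(history, observed, lower_q=0.025, upper_q=0.975):
    """Flag observed counts that fall outside an interval built from history.

    history  : past event counts for one device (hypothetical data)
    observed : new event counts to check
    Returns a list of (count, is_anomalous) pairs.
    """
    ordered = sorted(history)
    low = empirical_quantile(ordered, lower_q)
    high = empirical_quantile(ordered, upper_q)
    return [(x, not (low <= x <= high)) for x in observed]


# Hypothetical device history: between 100 and 200 events per time period.
history = list(range(100, 201))
flags = flag_anomalies(history, [150, 500, 3])
```

In this sketch the interval is the same for every time period; a model such as the one in the abstract would instead condition the interval on the device's recent behaviour.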

Mustafa R, Mens M, Pinto RJ, Karaman I, Roshchupkin G, Huang J, Elliot P, Evangelou M, Dehghan A, Ghanbari M et al., 2020, Metabolomic signatures of microRNAs in cardiovascular traits: A Mendelian randomization analysis, Annual Meeting of the International-Genetic-Epidemiology-Society, Publisher: WILEY, Pages: 506-506, ISSN: 0741-0395

Conference paper

Lucotte EA, Sugier P-E, Deleuze J-F, Ostroumova E, Boutron M-C, de Vathaire F, Guenel P, Liquet B, Evangelou M, Truong T et al., 2020, Analysis of the pleiotropy between breast cancer and thyroid cancer, Annual Meeting of the International-Genetic-Epidemiology-Society, Publisher: WILEY, Pages: 504-504, ISSN: 0741-0395

Conference paper

Rodosthenous T, Shahrezaei V, Evangelou M, 2020, Integrating multi-OMICS data through sparse Canonical Correlation Analysis for the prediction of complex traits: A comparison study, Bioinformatics, Vol: 36, Pages: 4616-4625, ISSN: 1367-4803

Motivation: Recent developments in technology have enabled researchers to collect multiple OMICS datasets for the same individuals. The conventional approach for understanding the relationships between the collected datasets and the complex trait of interest would be through the analysis of each OMIC dataset separately from the rest, or to test for associations between the OMICS datasets. In this work we show that integrating multiple OMICS datasets together, instead of analysing them separately, improves our understanding of their in-between relationships as well as the predictive accuracy for the tested trait. Several approaches have been proposed for the integration of heterogeneous and high-dimensional (p ≫ n) data, such as OMICS. The sparse variant of Canonical Correlation Analysis (CCA) approach is a promising one that seeks to penalise the canonical variables for producing sparse latent variables while achieving maximal correlation between the datasets. Over the last years, a number of approaches for implementing sparse CCA (sCCA) have been proposed, where they differ on their objective functions, iterative algorithm for obtaining the sparse latent variables and make different assumptions about the original datasets. Results: Through a comparative study we have explored the performance of the conventional CCA proposed by Parkhomenko et al. (2009), penalised matrix decomposition CCA proposed by Witten and Tibshirani (2009) and its extension proposed by Suo et al. (2017). The aforementioned methods were modified to allow for different penalty functions. Although sCCA is an unsupervised learning approach for understanding of the in-between relationships, we have twisted the problem as a supervised learning one and investigated how the computed latent variables can be used for predicting complex traits. The approaches were extended to allow for multiple (more than two) datasets where the trait was included as one of the input datasets. Both ways have shown improvement

Journal article

Karhunen V, Jarvelin M-R, Evangelou M, Rodriguez A et al., 2019, A MENDELIAN RANDOMISATION STUDY ON CAUSALITY BETWEEN ATTENTION-DEFICIT/HYPERACTIVITY DISORDER AND MULTIPLE OBESITY-RELATED TRAITS, 27th World Congress of Psychiatric Genetics (WCPG), Publisher: ELSEVIER, Pages: S114-S115, ISSN: 0924-977X

Conference paper

Riddle-Workman E, Evangelou M, Adams N, 2018, Adaptive anomaly detection on network data streams, IEEE Conference on Intelligence and Security Informatics (ISI) 2018, Publisher: IEEE

As the number of cyber-attacks increases, there has been increasing emphasis on developing complementary methods of detection to the existing signature-based approaches. This work builds upon a previously discovered persistent structure within the Los Alamos National Laboratory network data sources, to develop a regression based streaming anomaly detection mechanism that can adapt to the network behaviour over time. The methodology has also been applied to a new data set of the same network to assess the extent of its pertinence in time.

Conference paper

Evangelou E, Warren HR, Mosen-Ansorena D, Mifsud B, Pazoki R, Gao H, Ntritsos G, Dimou N, Cabrera CP, Karaman I, Fu LN, Evangelou M, Witkowska K, Tzanis E, Hellwege JN, Giri A, Edwards DRV, Sun YV, Cho K, Gaziano JM, Wilson PWF, Tsao PS, Kovesdy CP, Esko T, Magi R, Milani L, Almgren P, Boutin T, Debette S, Ding J, Giulianini F, Holliday EG, Jackson AU, Li-Gao R, Lin W-Y, Luan J, Mangino M, Oldmeadow C, Prins BP, Qian Y, Sargurupremraj M, Shah N, Surendran P, Theriault S, Verweij N, Willems SM, Zhao J-H, Amouyel P, Connell J, de Mutsert R, Doney ASF, Farrall M, Menni C, Morris AD, Noordam R, Pare G, Poulter NR, Shields DC, Stanton A, Thom S, Abecasis G, Amin N, Arking DE, Ayers KL, Barbieri CM, Batini C, Bis JC, Blake T, Bochud M, Boehnke M, Boerwinkle E, Boomsma DI, Bottinger EP, Braund PS, Brumat M, Campbell A, Campbell H, Chakravarti A, Chambers JC, Chauhan G, Ciullo M, Cocca M, Collins F, Cordell HJ, Davies G, de Borst MH, de Geus EJ, Deary IJ, Deelen J, Del Greco FM, Demirkale CY, Dorr M, Ehret GB, Elosua R, Enroth S, Erzurumluoglu AM, Ferreira T, Franberg M, Franco OH, Gandin I, Gasparini P, Giedraitis V, Gieger C, Girotto G, Goel A, Gow AJ, Gudnason V, Guo X, Gyllensten U, Hamsten A, Harris TB, Harris SE, Hartman CA, Havulinna AS, Hicks AA, Hofer E, Hofman A, Hottenga J-J, Huffman JE, Hwang S-J, Ingelsson E, James A, Jansen R, Jarvelin M-R, Joehanes R, Johansson A, Johnson AD, Joshi PK, Jousilahti P, Jukema JW, Jula A, Kahonen M, Kathiresan S, Keavney BD, Khaw K-T, Knekt P, Knight J, Kolcic I, Kooner JS, Koskinen S, Kristiansson K, Kutalik Z, Laan M, Larson M, Launer LJ, Lehne B, Lehtimaki T, Liewald DCM, Lin L, Lind L, Lindgren CM, Liu Y, Loos RJF, Lopez LM, Lu Y, Lyytikainen L-P, Mahajan A, Mamasoula C, Marrugat J, Marten J, Milaneschi Y, Morgan A, Morris AP, Morrison AC, Munson PJ, Nalls MA, Nandakumar P, Nelson CP, Niiranen T, Nolte IM, Nutile T, Oldehinkel AJ, Oostra BA, O'Reilly PF, Org E, Padmanabhan S, Palmas W, Palotie A, Pattie A, Penninx BWJH, 
Perol et al., 2018, Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits (vol 50, pg 1412, 2018), Nature Genetics, Vol: 50, Pages: 1755-1755, ISSN: 1061-4036

Journal article

Mustafa R, Ghanbari M, Evangelou M, Dehghan A et al., 2018, An enrichment analysis for cardiometabolic traits suggests non-random assignment of genes to microRNAs, International Journal of Molecular Sciences, Vol: 19, ISSN: 1422-0067

MicroRNAs (miRNAs) regulate the expression of the majority of genes. However, it is not known whether they regulate genes at random or are organized according to their function. To this end, we chose cardiometabolic disorders as an example and investigated whether genes associated with cardiometabolic disorders are regulated by a random set of miRNAs or a limited number of them. Single-nucleotide polymorphisms (SNPs) reaching genome-wide level significance were retrieved from the most recent genome-wide association studies on cardiometabolic traits, which were cross-referenced with Ensembl to identify related genes and combined with miRNA target prediction databases (TargetScan, miRTarBase, or miRecords) to identify miRNAs that regulate them. We retrieved 520 SNPs, of which 355 were intragenic, corresponding to 304 genes. While we found a higher proportion of genes reported from all GWAS that were predicted targets for miRNAs in comparison to all protein coding genes (75.1%), the proportion was even higher for cardiometabolic genes (80.6%). Enrichment analysis was performed within each database. We found that cardiometabolic genes were over-represented in target genes for 29 miRNAs (based on TargetScan) and 3 miRNAs (miR-181a, miR-302d, and miR-372) (based on miRecords) after Benjamini-Hochberg correction for multiple testing. Our work provides evidence for non-random assignment of genes to miRNAs and supports the idea that miRNAs regulate sets of genes that are functionally related.

Journal article
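
The Benjamini-Hochberg correction mentioned in the abstract above controls the false discovery rate when many enrichment tests are run at once. The following is a minimal Python sketch of the standard step-up procedure, not the analysis code used in the paper; the function name, the default level of 0.05 and the example p-values are assumptions chosen for illustration.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, in the input order, that is True where the
    corresponding hypothesis is rejected at false discovery rate alpha.
    """
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from eight enrichment tests.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
decisions = benjamini_hochberg(example)
```

Because the procedure is step-up, a p-value that misses its own threshold is still rejected when a larger p-value further down the ranking meets its threshold.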

Evangelou E, Warren HR, Mosen-Ansorena D, Mifsud B, Pazoki R, Gao H, Ntritsos G, Dimou N, Cabrera CP, Karaman I, Fu LN, Evangelou M, Witkowska K, Tzanis E, Hellwege JN, Giri A, Edwards DRV, Sun YV, Cho K, Gaziano JM, Wilson PWF, Tsao PS, Kovesdy CP, Esko T, Magi R, Milani L, Almgren P, Boutin T, Debette S, Ding J, Giulianini F, Holliday EG, Jackson AU, Li-Gao R, Lin W-Y, Luan J, Mangino M, Oldmeadow C, Prins BP, Qian Y, Sargurupremraj M, Shah N, Surendran P, Theriault S, Verweij N, Willems SM, Zhao J-H, Amouyel P, Connell J, de Mutsert R, Doney ASF, Farrall M, Menni C, Morris AD, Noordam R, Pare G, Poulter NR, Shields DC, Stanton A, Thom S, Abecasis G, Amin N, Arking DE, Ayers KL, Barbieri CM, Batini C, Bis JC, Blake T, Bochud M, Boehnke M, Boerwinkle E, Boomsma DI, Bottinger EP, Braund PS, Brumat M, Campbell A, Campbell H, Chakravarti A, Chambers JC, Chauhan G, Ciullo M, Cocca M, Collins F, Cordell HJ, Davies G, de Borst MH, de Geus EJ, Deary IJ, Deelen J, Del Greco FM, Demirkale CY, Dorr M, Ehret GB, Elosua R, Enroth S, Erzurumluoglu AM, Ferreira T, Franberg M, Franco OH, Gandin I, Gasparini P, Giedraitis V, Gieger C, Girotto G, Goel A, Gow AJ, Gudnason V, Guo X, Gyllensten U, Hamsten A, Harris TB, Harris SE, Hartman CA, Havulinna AS, Hicks AA, Hofer E, Hofman A, Hottenga J-J, Huffman JE, Hwang S-J, Ingelsson E, James A, Jansen R, Jarvelin M-R, Joehanes R, Johansson A, Johnson AD, Joshi PK, Jousilahti P, Jukema JW, Jula A, Kahonen M, Kathiresan S, Keavney BD, Khaw K-T, Knekt P, Knight J, Kolcic I, Kooner JS, Koskinen S, Kristiansson K, Kutalik Z, Laan M, Larson M, Launer LJ, Lehne B, Lehtimaki T, Liewald DCM, Lin L, Lind L, Lindgren CM, Liu Y, Loos RJF, Lopez LM, Lu Y, Lyytikainen L-P, Mahajan A, Mamasoula C, Marrugat J, Marten J, Milaneschi Y, Morgan A, Morris AP, Morrison AC, Munson PJ, Nalls MA, Nandakumar P, Nelson CP, Niiranen T, Nolte IM, Nutile T, Oldehinkel AJ, Oostra BA, O'Reilly PF, Org E, Padmanabhan S, Palmas W, Palotie A, Pattie A, Penninx BWJH, 
Perol et al., 2018, Publisher correction: Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits, Nature Genetics, Vol: 50, Pages: 1755-1755, ISSN: 1061-4036

Correction to: Nature Genetics https://doi.org/10.1038/s41588-018-0205-x, published online 17 September 2018.

Journal article

Evangelou E, Warren HR, Mosen-Ansorena D, Mifsud B, Pazoki R, Gao H, Ntritsos G, Dimou N, Cabrera CP, Karaman I, Fu LN, Evangelou M, Witkowska K, Tzanis E, Hellwege JN, Giri A, Edwards DRV, Sun YV, Cho K, Gaziano JM, Wilson PWF, Tsao PS, Kovesdy CP, Esko T, Magi R, Milani L, Almgren P, Boutin T, Debette S, Ding J, Giulianini F, Holliday EG, Jackson AU, Li-Gao R, Lin W-Y, Luan J, Mangino M, Oldmeadow C, Prins BP, Qian Y, Sargurupremraj M, Shah N, Surendran P, Theriault S, Verweij N, Willems SM, Zhao J-H, Amouyel P, Connell J, de Mutsert R, Doney ASF, Farrall M, Menni C, Morris AD, Noordam R, Pare G, Poulter NR, Shields DC, Stanton A, Thom S, Abecasis G, Amin N, Arking DE, Ayers KL, Barbieri CM, Batini C, Bis JC, Blake T, Bochud M, Boehnke M, Boerwinkle E, Boomsma DI, Bottinger EP, Braund PS, Brumat M, Campbell A, Campbell H, Chakravarti A, Chambers JC, Chauhan G, Ciullo M, Cocca M, Collins F, Cordell HJ, Davies G, de Borst MH, de Geus EJ, Deary IJ, Deelen J, Del Greco FM, Demirkale CY, Dorr M, Ehret GB, Elosua R, Enroth S, Erzurumluoglu AM, Ferreira T, Franberg M, Franco OH, Gandin I, Gasparini P, Giedraitis V, Gieger C, Girotto G, Goel A, Gow AJ, Gudnason V, Guo X, Gyllensten U, Hamsten A, Harris TB, Harris SE, Hartman CA, Havulinna AS, Hicks AA, Hofer E, Hofman A, Hottenga J-J, Huffman JE, Hwang S-J, Ingelsson E, James A, Jansen R, Jarvelin M-R, Joehanes R, Johansson A, Johnson AD, Joshi PK, Jousilahti P, Jukema JW, Jula A, Kahonen M, Kathiresan S, Keavney BD, Khaw K-T, Knekt P, Knight J, Kolcic I, Kooner JS, Koskinen S, Kristiansson K, Kutalik Z, Laan M, Larson M, Launer LJ, Lehne B, Lehtimaki T, Liewald DCM, Lin L, Lind L, Lindgren CM, Liu Y, Loos RJF, Lopez LM, Lu Y, Lyytikainen L-P, Mahajan A, Mamasoula C, Marrugat J, Marten J, Milaneschi Y, Morgan A, Morris AP, Morrison AC, Munson PJ, Nalls MA, Nandakumar P, Nelson CP, Niiranen T, Nolte IM, Nutile T, Oldehinkel AJ, Oostra BA, O'Reilly PF, Org E, Padmanabhan S, Palmas W, Palotie A, Pattie A, Penninx BWJH, 
Perol et al., 2018, Genetic analysis of over one million people identifies 535 new loci associated with blood pressure traits, Nature Genetics, Vol: 50, Pages: 1412-1425, ISSN: 1061-4036

High blood pressure is a highly heritable and modifiable risk factor for cardiovascular disease. We report the largest genetic association study of blood pressure traits (systolic, diastolic and pulse pressure) to date in over 1 million people of European ancestry. We identify 535 novel blood pressure loci that not only offer new biological insights into blood pressure regulation but also highlight shared genetic architecture between blood pressure and lifestyle exposures. Our findings identify new biological pathways for blood pressure regulation with potential for improved cardiovascular disease prevention in the future.

Journal article

Warren HR, Evangelou E, Mosen D, Mifsud B, Pazoki R, Gao H, Ntritsos G, Dimou N, Evangelou M, Hellwege J, Giri A, Esko T, Metspalu A, Tzoulaki I, Barnes MR, Wain LV, Elliott P, Caulfield MJ et al., 2018, GENETIC ANALYSIS OF OVER ONE MILLION PEOPLE IDENTIFIES 535 NOVEL LOCI ASSOCIATED WITH BLOOD PRESSURE AND RISK OF CARDIOVASCULAR DISEASE, 28th European Meeting of Hypertension and Cardiovascular Protection of the European-Society-of-Hypertension (ESH), Publisher: LIPPINCOTT WILLIAMS & WILKINS, Pages: E229-E229, ISSN: 0263-6352

Conference paper

Broc C, Evangelou M, Truong T, Liquet B et al., 2018, Investigating gene- and pathway-environment Interaction analysis approaches, Journal of the French Statistical Society, ISSN: 1962-5197

Pathway analysis can increase power to detect associations with a gene or a pathway by combining several signals at the single nucleotide polymorphism (SNP)-level into a single test. In this work, we propose to extend two well-known self-contained methods, Fisher’s method (FM) and the Adaptive Rank Truncated Product (ARTP) method, to the analysis of gene-environment (GxE) interaction at the gene and pathway-level. It has been previously suggested that the permutation procedures that are usually used to derive the significance of these tests are not appropriate for the analysis of GxE interaction and should be replaced by a bootstrap approach. We analyse and compare the performance of the extension of FM and ARTP using the permutation and the parametric bootstrap procedure in simulation studies. We illustrate its application by analysing the interaction between night work and circadian gene polymorphisms in the risk of breast cancer in a case-control study. The ARTP method, adapted for both gene- and pathway-environment interactions, gives promising results and has been wrapped into the R package PIGE, available on CRAN.

Journal article
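As a rough illustration of the Fisher's method (FM) named in the abstract above, the sketch below combines a set of SNP-level p-values into a single test statistic, T = -2 Σ ln p_i. The closed-form p-value it returns assumes the p-values are independent, in which case T follows a chi-squared distribution with 2k degrees of freedom. SNPs within a gene are typically correlated, which is why the abstract relies on permutation or bootstrap procedures instead; this is not the PIGE implementation, and the function names are hypothetical.

```python
import math


def fisher_statistic(p_values):
    """Fisher's combined statistic T = -2 * sum(ln p_i)."""
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_sf_even_df(x, k):
    """Survival function P(X > x) for a chi-squared variable with 2k
    degrees of freedom, using the exact closed form for even df:
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(p_values):
    """Combined p-value under the (assumed) independence of the inputs."""
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    return chi2_sf_even_df(fisher_statistic(p_values), len(p_values))


if __name__ == "__main__":
    # Hypothetical SNP-level p-values for one gene.
    print(fisher_combined_p([0.04, 0.20, 0.01]))
```

In a permutation or bootstrap setting, only `fisher_statistic` would be reused: the observed statistic is compared with its values on resampled data rather than with the chi-squared reference distribution.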

Schon C, Adams NM, Evangelou M, 2017, Clustering and monitoring edge behaviour in enterprise network traffic, IEEE International Conference on Intelligence and Security Informatics, Publisher: IEEE, Pages: 31-36

This paper takes an unsupervised learning approach for monitoring edge activity within an enterprise computer network. Using NetFlow records, features are gathered across the active connections (edges) in 15-minute time windows. Then, edges are grouped into clusters using the k-means algorithm. This process is repeated over contiguous windows. A series of informative indicators are derived by examining the relationship of edges with the observed cluster structure. This leads to an intuitive method for monitoring network behaviour and a temporal description of edge behaviour at global and local levels.

Conference paper
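The windowed clustering described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the flow record layout `(timestamp, src, dst, n_bytes)`, the two features (flow count and total bytes) and the deterministic centroid initialisation are all assumptions made for the example, and no feature scaling is applied.

```python
from collections import defaultdict

WINDOW_SECONDS = 15 * 60  # 15-minute windows, as stated in the abstract


def window_features(flows):
    """Aggregate flows into per-window, per-edge feature vectors.

    flows: iterable of (timestamp_seconds, src, dst, n_bytes) tuples,
    a hypothetical simplification of a NetFlow record.
    Returns {window_index: {(src, dst): [flow_count, total_bytes]}}.
    """
    windows = defaultdict(lambda: defaultdict(lambda: [0.0, 0.0]))
    for ts, src, dst, n_bytes in flows:
        feats = windows[int(ts // WINDOW_SECONDS)][(src, dst)]
        feats[0] += 1
        feats[1] += n_bytes
    return windows


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm; centroids start at the first k points
    (an assumption chosen here to keep the example deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def cluster_windows(flows, k):
    """Cluster edges separately in each window.

    Returns {window_index: {edge: label}}; windows with fewer than k
    active edges are skipped.
    """
    result = {}
    for w, edges in sorted(window_features(flows).items()):
        names = list(edges)
        points = [edges[e] for e in names]
        if len(points) >= k:
            labels, _ = kmeans(points, k)
            result[w] = dict(zip(names, labels))
    return result
```

Because cluster labels are assigned independently in each window, comparing an edge's behaviour over time would need an extra step, such as matching clusters across windows by centroid distance.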

Gibberd AJ, Evangelou M, Nelson JDB, 2017, The time-varying dependency patterns of NetFlow statistics, IEEE International Conference on Data Mining Workshop Proceedings, Publisher: IEEE

We investigate where and how the key dependency structure between measures of network activity changes throughout the course of daily activity. Our approach to data mining is probabilistic in nature: we formulate the identification of dependency patterns as a regularised statistical estimation problem. The resulting model can be interpreted as a set of time-varying graphs and provides a useful visual interpretation of network activity. We believe this is the first application of dynamic graphical modelling to network traffic of this kind. Investigations are performed on 9 days of real-world network traffic across a subset of IPs. We demonstrate that dependency between features may change across time and discuss how these dependencies change at the intra- and inter-day level. Such variation in feature dependency may have important consequences for the design and implementation of probabilistic intrusion detection systems.

Conference paper

Evangelou M, Adams N, 2016, Predictability of NetFlow data, IEEE International Conference on Intelligence and Security Informatics, Publisher: IEEE

The behaviour of individual devices connected to an enterprise network can vary dramatically, as a device’s activity depends on the user operating the device as well as on all behind-the-scenes operations between the device and the network. Being able to understand and predict a device’s behaviour in a network can work as the foundation of an anomaly detection framework, as devices may show abnormal activity as part of a cyber attack. The aim of this work is the construction of a predictive regression model for a device’s behaviour at normal state. The behaviour of a device is represented by a quantitative response and modelled to depend on historic data recorded by NetFlow.

Conference paper

Whitehouse M, Evangelou M, Adams N, 2016, Activity-based temporal anomaly detection in enterprise-cyber security, IEEE International Big Data Analytics for Cybersecurity computing (BDAC'16) Workshop, IEEE International Conference on Intelligence and Security Informatics, Publisher: IEEE

Statistical anomaly detection is emerging as an important complement to signature-based methods for enterprise network defence. In this paper, we isolate a persistent structure in two different enterprise network data sources. This structure provides the basis of a regression-based anomaly detection method. The procedure is demonstrated on a large public domain data set.

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
