Imperial College London

Claire L. Shovlin PhD FRCP

Faculty of Medicine, National Heart & Lung Institute

Professor of Practice (Clinical and Molecular Medicine)

Location

 

534, Block L, Hammersmith Hospital, Hammersmith Campus


Citation

BibTeX format

@unpublished{Xiao:2020:10.1101/2020.03.30.20047209,
author = {Xiao, S and Kai, Z and Brown, D and Shovlin, C and {Genomics England Research Consortium}},
doi = {10.1101/2020.03.30.20047209},
publisher = {MedRxiv},
title = {Harnessing the 100,000 Genomes Project whole genome sequencing data - an unbiased systematic tool to filter by biologically validated regions of functionality},
url = {http://dx.doi.org/10.1101/2020.03.30.20047209},
year = {2020}
}

RIS format (EndNote, RefMan)

TY  - UNPB
AB  - Whole genome sequencing (WGS) is championed by the UK National Health Service (NHS) to identify genetic variants that cause particular diseases. The full potential of WGS has yet to be realised as early data analytic steps prioritise protein-coding genes, and effectively ignore the less well annotated non-coding genome which is rich in transcribed and critical regulatory regions. To address, we developed a filter, which we call GROFFFY, and validated in WGS data from hereditary haemorrhagic telangiectasia patients within the 100,000 Genomes Project. Before filter application, the mean number of DNA variants compared to human reference sequence GRCh38 was 4,867,167 (range 4,786,039-5,070,340), and one-third lay within intergenic areas. GROFFFY removed a mean of 2,812,015 variants per DNA. In combination with allele frequency and other filters, GROFFFY enabled a 99.56% reduction in variant number. The proportion of intergenic variants was maintained, and no pathogenic variants in disease genes were lost. We conclude that the filter applied to NHS diagnostic samples in the 100,000 Genomes pipeline offers an efficient method to prioritise intergenic, intronic and coding gDNA variants. Reducing the overwhelming number of variants while retaining functional genome variation of importance to patients, enhances the near-term value of WGS in clinical diagnostics.
AU  - Xiao,S
AU  - Kai,Z
AU  - Brown,D
AU  - Shovlin,C
AU  - Genomics England Research Consortium
DO  - 10.1101/2020.03.30.20047209
PB  - MedRxiv
PY  - 2020///
TI  - Harnessing the 100,000 Genomes Project whole genome sequencing data - an unbiased systematic tool to filter by biologically validated regions of functionality
UR  - http://dx.doi.org/10.1101/2020.03.30.20047209
UR  - http://hdl.handle.net/10044/1/82024
ER  - 