Imperial College London


Faculty of Medicine, Department of Infectious Disease

Honorary Senior Lecturer



+44 (0)20 7594 1930
l.coin




172, Medical School, St Mary's Campus





I am working on various statistical and mathematical problems in genomics. I am particularly interested in building mathematical models to identify genetic variants in high-throughput genomics data - including genotyping microarrays and next-generation sequencing data - with the ultimate aim of understanding the functional impact and evolutionary history of these variants.

Software can be found on my GitHub page:

More information:

ECCB 2012 report


Yhap software for identifying haplogroups from low-coverage sequence data: Yhap_0.51

ExomeCNVTest software: ExomeCNVTest_0.51

SOAP-popIndel software for genotyping indels in exome data


cnvHiTSeq - software for detecting and genotyping CNVs in WGS data: cnvHitSeq

Toy example for cnvHiTSeq

cnvPipe - software to enable CNV meta-analysis: cnvPipe_v0.82

Software for converting IMPUTE format to the format used by the MultiPhen software: convertImpute

MultiPhen software: MultiPhen

vntrTest is a program for assessing association of VNTR fragment length genotypes with either continuous or case-control outcomes. 


cnvHap is a program for joint copy number genotyping, which uses a haplotype model of copy number variation and integrates data from multiple platforms. It also carries out CN association.


polyHap is a program for phasing polyploids and copy number regions.

See for more details.

The first version was designed just for phasing polyploid regions (with the restriction that the ploidy is fixed across the entire region of analysis).


We have extended polyHap to remove this restriction, so that it can phase CNV regions (from pre-calculated CNV/SNP genotypes):


AncesHC is a program for determining the haplotype structure of a population sample from genotype data, and then testing for association of these haplotypes with either a binary or continuous outcome.

See for more details.


metaMapper is a program for flexible, scalable GWAS meta-analysis and visualisation.


Software for simulating sequence-level data with inversions. See for more details. Developed in conjunction with Clive Hoggart and Paul O'Reilly.



Pseudogene inference from loss of constraint (PSILC)
Software for identifying pseudogenes via loss of evolutionary constraint:

PSILC version 1.21

Supplementary data
Supplementary information from paper "Pathway Analysis of GWAS Provides New Insights into Genetic Susceptibility to 3 Inflammatory Diseases"



Gliddon H, Kaforou M, Alikian M, et al., 2021, Identification of reduced host transcriptomic signatures for tuberculosis disease and digital PCR-based validation and quantification, Frontiers in Immunology, Vol:12, ISSN:1664-3224

Nguyen SH, Cao MD, Coin LJM, et al., 2021, Real-time resolution of short-read assembly graph using ONT long reads, PLOS Computational Biology, Vol:17, ISSN:1553-734X

Murigneux V, Rai SK, Furtado A, et al., 2020, Comparison of long-read methods for sequencing and assembly of a plant genome, GigaScience, Vol:9, ISSN:2047-217X

Borghesi A, Trück J, Asgari S, et al., 2020, Whole-exome sequencing for the identification of rare variants in primary immunodeficiency genes in children with sepsis - a prospective population-based cohort study, Clinical Infectious Diseases, Vol:71, ISSN:1058-4838, Pages:e614-e623

Zhou C, Olukolu B, Gemenet DC, et al., 2020, Assembly of whole-chromosome pseudomolecules for polyploid plant genomes using outbred mapping populations, Nature Genetics, Vol:52, ISSN:1061-4036, Pages:1256-+
