Imperial College London


Faculty of Natural Sciences, Department of Mathematics

Visiting Professor



+44 (0)20 7594 2647
c.colijn




626, Huxley Building, South Kensington Campus






BibTeX format

author = {Hatherell, H-A and Colijn, C and Stagg, HR and Jackson, C and Winter, JR and Abubakar, I},
doi = {10.1186/s12916-016-0566-x},
journal = {BMC Medicine},
title = {Interpreting whole genome sequencing for investigating tuberculosis transmission: a systematic review},
volume = {14},
year = {2016}

RIS format (EndNote, RefMan)

AB - Background: Whole genome sequencing (WGS) is becoming an important part of epidemiological investigations of infectious diseases due to greater resolution and cost reductions compared to traditional typing approaches. Many public health and clinical teams will increasingly use WGS to investigate clusters of potential pathogen transmission, making it crucial to understand the benefits and assumptions of the analytical methods for investigating the data. We aimed to understand how different approaches affect inferences of transmission dynamics and outline limitations of the methods. Methods: We comprehensively searched electronic databases for studies that presented methods used to interpret WGS data for investigating tuberculosis (TB) transmission. Two authors independently selected studies for inclusion and extracted data. Due to considerable methodological heterogeneity between studies, we present summary data with accompanying narrative synthesis rather than pooled analyses. Results: Twenty-five studies met our inclusion criteria. Despite the range of interpretation tools, the usefulness of WGS data in understanding TB transmission often depends on the amount of genetic diversity in the setting. Where diversity is small, distinguishing re-infections from relapses may be impossible; interpretation may be aided by the use of epidemiological data, examining minor variants and deep sequencing. Conversely, when within-host diversity is large, due to genetic hitchhiking or co-infection of two dissimilar strains, it is critical to understand how it arose. Greater understanding of microevolution and mixed infection will enhance interpretation of WGS data. Conclusions: As sequencing studies have sampled more intensely and integrated multiple sources of information, the understanding of TB transmission and diversity has grown, but there is still much to be learnt about the origins of diversity that will affect inferences from these data.
Public health teams and researchers should comb
AU - Hatherell,H-A
AU - Colijn,C
AU - Stagg,HR
AU - Jackson,C
AU - Winter,JR
AU - Abubakar,I
DO - 10.1186/s12916-016-0566-x
PY - 2016///
SN - 1741-7015
TI - Interpreting whole genome sequencing for investigating tuberculosis transmission: a systematic review
T2 - BMC Medicine
VL - 14
ER -