Citation

BibTeX format

@article{Ndovie:2025:10.1128/msystems.01661-24,
author = {Ndovie, W and Havránek, J and Leconte, J and Koszucki, J and Chindelevitch, L and Adriaenssens, EM and Mostowy, RJ},
doi = {10.1128/msystems.01661-24},
journal = {mSystems},
title = {Exploration of the genetic landscape of bacterial dsDNA viruses reveals an ANI gap amid extensive mosaicism},
url = {http://dx.doi.org/10.1128/msystems.01661-24},
volume = {10},
year = {2025}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB  - Average nucleotide identity (ANI) is a widely used metric to estimate genetic relatedness, especially in microbial species delineation. While ANI calculation has been well optimized for bacteria and closely related viral genomes, accurate estimation of ANI below 80%, particularly in large reference data sets, has been challenging due to a lack of accurate and scalable methods. To bridge this gap, we introduce MANIAC, an efficient computational pipeline optimized for estimating ANI and alignment fraction (AF) in viral genomes with divergence around ANI of 70%. Using a rigorous simulation framework, we demonstrate MANIAC's accuracy and scalability compared to existing approaches, even to data sets of hundreds of thousands of viral genomes. Applying MANIAC to a curated data set of complete bacterial dsDNA viruses revealed a multimodal ANI distribution, with a distinct gap around 80%, akin to the bacterial ANI gap (~90%) but shifted, likely due to viral-specific evolutionary processes such as recombination dynamics and mosaicism. We then evaluated ANI and AF as predictors of genus-level taxonomy using a logistic regression model. We found that this model has strong predictive power (PR-AUC = 0.981), but that it works much better for virulent (PR-AUC = 0.997) than temperate (PR-AUC = 0.847) bacterial viruses. This highlights the complexity of taxonomic classification in temperate phages, known for their extensive mosaicism, and cautions against over-reliance on ANI in such cases. MANIAC can be accessed at https://github.com/bioinf-mcb/MANIAC. IMPORTANCE: We introduce a novel computational pipeline called MANIAC, designed to accurately assess average nucleotide identity (ANI) and alignment fraction (AF) between diverse viral genomes, scalable to data sets of over 100k genomes. Using computer simulations and real data analyses, we show that MANIAC could accurately estimate genetic relatedness between pairs of viral genomes of around 60%-70% ANI. 
We applied MANIAC to investiga
AU  - Ndovie,W
AU  - Havránek,J
AU  - Leconte,J
AU  - Koszucki,J
AU  - Chindelevitch,L
AU  - Adriaenssens,EM
AU  - Mostowy,RJ
DO  - 10.1128/msystems.01661-24
PY  - 2025///
TI  - Exploration of the genetic landscape of bacterial dsDNA viruses reveals an ANI gap amid extensive mosaicism.
T2  - mSystems
UR  - http://dx.doi.org/10.1128/msystems.01661-24
UR  - https://www.ncbi.nlm.nih.gov/pubmed/39878503
VL  - 10
ER  -
