Citation

BibTeX format

@article{Penn:2025:sysbio/syae030,
author = {Penn, MJ and Scheidwasser, N and Khurana, MP and Duchêne, DA and Donnelly, CA and Bhatt, S},
doi = {10.1093/sysbio/syae030},
journal = {Syst Biol},
pages = {250--266},
title = {Phylo2Vec: A Vector Representation for Binary Trees},
url = {http://dx.doi.org/10.1093/sysbio/syae030},
volume = {74},
year = {2025}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB  - Binary phylogenetic trees inferred from biological data are central to understanding the shared history among evolutionary units. However, inferring the placement of latent nodes in a tree is computationally expensive. State-of-the-art methods rely on carefully designed heuristics for tree search, using different data structures for easy manipulation (e.g., classes in object-oriented programming languages) and readable representation of trees (e.g., Newick-format strings). Here, we present Phylo2Vec, a parsimonious encoding for phylogenetic trees that serves as a unified approach for both manipulating and representing phylogenetic trees. Phylo2Vec maps any binary tree with n leaves to a unique integer vector of length n-1. The advantages of Phylo2Vec are 4-fold: (i) fast tree sampling, (ii) compressed tree representation compared to a Newick string, (iii) quick and unambiguous verification if 2 binary trees are identical topologically, and (iv) systematic ability to traverse tree space in very large or small jumps. As a proof of concept, we use Phylo2Vec for ML inference on 5 real-world datasets and show that a simple hill-climbing-based optimization scheme can efficiently traverse the vastness of tree space from a random to an optimal tree.
AU  - Penn,MJ
AU  - Scheidwasser,N
AU  - Khurana,MP
AU  - Duchêne,DA
AU  - Donnelly,CA
AU  - Bhatt,S
DO  - 10.1093/sysbio/syae030
EP  - 266
PY  - 2025///
SP  - 250
TI  - Phylo2Vec: A Vector Representation for Binary Trees.
T2  - Syst Biol
UR  - http://dx.doi.org/10.1093/sysbio/syae030
UR  - https://www.ncbi.nlm.nih.gov/pubmed/38935520
VL  - 74
ER  - 
