In this paper, we systematically assessed the sensitivity of an alignment-free phylogenetic approach, based on D2 statistics and neighbour-joining, to various biological scenarios. Compared with the standard phylogenetic approach based on multiple sequence alignment, this approach is more robust to insertions/deletions and rearrangements, and is more scalable, with speed-ups reaching more than 1000-fold. This study focuses mainly on gene-scale sequences; it would be interesting to see how alignment-free approaches perform on genome-scale sequences.
Our implementation of D2 statistics for generating pairwise distances among a set of sequences is available as a Java package: jD2Stat.
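To make the idea of a D2-based pairwise distance concrete, the following is a minimal sketch in Java. It is not the jD2Stat code or its API: the class and method names (`D2Sketch`, `kmerCounts`, `d2`, `distance`) are hypothetical, it uses only the plain D2 score (the sum over shared k-mers of the product of their counts in the two sequences), and the transformation from score to distance, the absolute log of the score normalised by the geometric mean of the two self-scores, is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; names and the distance transform are assumptions,
// not the jD2Stat implementation.
class D2Sketch {

    // Count every overlapping k-mer (word of length k) in a sequence.
    static Map<String, Integer> kmerCounts(String seq, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + k <= seq.length(); i++) {
            counts.merge(seq.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    // Plain D2 score: sum over shared k-mers of (count in x) * (count in y).
    static long d2(String x, String y, int k) {
        Map<String, Integer> cx = kmerCounts(x, k);
        Map<String, Integer> cy = kmerCounts(y, k);
        long score = 0;
        for (Map.Entry<String, Integer> e : cx.entrySet()) {
            Integer other = cy.get(e.getKey());
            if (other != null) {
                score += (long) e.getValue() * other;
            }
        }
        return score;
    }

    // Assumed transform: |ln(D2(x,y) / sqrt(D2(x,x) * D2(y,y)))|.
    // Identical k-mer profiles give 0; no shared k-mers gives infinity.
    static double distance(String x, String y, int k) {
        double xy = d2(x, y, k);
        if (xy == 0) {
            return Double.POSITIVE_INFINITY;
        }
        double norm = Math.sqrt((double) d2(x, x, k) * d2(y, y, k));
        return Math.abs(Math.log(xy / norm));
    }

    public static void main(String[] args) {
        // Toy sequences, made up for this example.
        String[] seqs = {"ACGTACGTAC", "ACGTTCGTAC", "TTTTGGGGCC"};
        int k = 3;
        // Print the upper triangle of the pairwise distance matrix.
        for (int i = 0; i < seqs.length; i++) {
            for (int j = i + 1; j < seqs.length; j++) {
                System.out.printf("d(%d,%d) = %.4f%n", i, j, distance(seqs[i], seqs[j], k));
            }
        }
    }
}
```

A matrix of such pairwise distances is the kind of input a neighbour-joining step takes to infer a tree. Because each pair needs only k-mer counting rather than an alignment, the cost per pair grows roughly linearly with sequence length.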
Inferring phylogenies of evolving sequences without multiple sequence alignment
- Evolutionary biologists grow “tree of life” faster than ever before (UQ IMB, 2 Oct 2014)
- A mathematical perspective for evolutionary biologists (Australian LifeScientist, 6 Oct 2014)