Arbalet

Table of Contents

What is Arbalet ?

Arbalet is a Java tool to view and manipulate phylogenetic trees and compute their Parsimony score.

With Arbalet, given a file of taxa and a tree in Newick format you can:

Download Arbalet

You can download the following version:

Examples with one file of sequences (zilla) and two trees in newick format:

Source code is available on sourceforge.net

Tutorial

Please follow this link toget access to the tutorial.

File formats

For sequences you can use one of the following format:

For trees, use the Newick format:

How can we compare trees ?

Ribeiro and Vianna implementation

We use a derivative form of Path-Relinking (PR) introduced by Glover and Laguna (see CiteSeerX). PR is a Metaheuristic (see Wikipedia for a definition) with an intensification strategy used to explore elite solutions.

PR consider two solutions called source and guiding solutions and consists in transforming the source solution into the guiding solution by applying a series of modifications. After each modification the current solution is duplicated and an exploration phase is performed on a copy of the solution.

If we remove the exploration phase we then have a modification algorithm from one tree to another.

For the Maximum Parsimony problem an implementation was given by Ribeiro and Vianna. This implementation can be called top-down implementation as it starts from the root of the tree and recursively explore each left and right subtrees. For each iteration we compare the taxa in the left and right subtrees of the source and guiding solution. All taxa of the source left (resp. right) subtree that are in the right (resp. left) subtree of the guiding solution are moved to the right (resp. left) subtree of the of the source solution.

The major drawback of this implementation is that it requires a lot of modifications and moves of the taxa from left to right (resp. right to left) subtrees. When a taxon is degraphed it is then regraphed on a branch of the sibling subtree that minimizes the parsmony score of the tree. We can also put the degraphed taxon at the top of the sibling subtree to avoid the search of a minimum tree.

Richer and Vazquez-Ortiz implementation

We have implemented a bottom-up implementation which compares the subtrees present in the source and guiding solutions and starts with subtrees of size 1, size 2, then size 3 and so on. When a subtree (X,Y) of the guiding solution is not found in the source solution then we modify the source tree by degraphing Y and regraphing it on X.

For example, with two trees of zilla (500 taxa) of score 16218 and 16219 we have:

Here are some more results:

 guiding / source   BU
Transform. 
 BU
Time (s) 
 TD1
Transform. 
 TD1
Time (s) 
 TD2
Transform. 
 TD2
Time (s) 
 16218 / 162182   24   0.10   118   0.33   74   0.85 
 16218 / 16219    32   0.14   462   1.36   343   4.08 
 16218 / 16250    97   0.40   1779   4.79   1519   25.32 
 16218 / 16401   151   0.60   1891   5.16   1474   25.36 
 16218 / 21727   446   1.68   1881   5.17   1241   18.70 
Comparison of Path-Relinking implementations in terms of number of transformations and execution time on an Intel i5-4570 CPU at 3.20GHz

As we can see the Bottom-Up (BU) implementation requires a smaller number of transformations to transform one tree into another. The execution time is the most important for TD2 because we need to find the score of each regraph on every branch and keep the one that minimizes the parsimony score of the tree.

More information

Arbalet is an acronym that comme from Arbre (the french term for tree) and alet (the contraction of applet) as it was initially supposed to be an Applet.

The contributors to this project are:

Kévin, Florian, Clément and Marc did the main job, we have added some minor (but interesting) capabilities to the final project (like tree comparison with Path-Relinking, search and visualisation of a taxon in the tree) and killed some bugs.

Bug report