MacVector offers two modes of phylogenetic analysis, depending on the type of output required.
- The Best Tree mode calculates the best tree using a given method.
- The Bootstrap mode repeatedly resamples the data, generating phylogenies from each of the new data sets. A consensus tree of these phylogenies indicates how reproducible the Best Tree analysis is.
This section describes how you can generate a phylogenetic tree from any alignment of four or more nucleotide or protein sequences. You can limit the analysis to a subset of the alignment, if required. It then describes how you can check the reliability of this tree, by running a bootstrap analysis to display a bootstrap consensus tree for the same data. It also describes how to display an existing tree.
There are two stages in the generation of trees: calculation of the distances between pairs of sequences, and reconstruction of the phylogeny using the distance information. All tree building methods proceed by joining the two most similar sequences first, and then adding the other sequences one by one, in order of decreasing similarity.
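The first stage, calculating pairwise distances, can be illustrated with a minimal Python sketch. It is not MacVector's implementation. The function names (drop_gapped_columns, p_distance, jukes_cantor) are hypothetical, and the sketch assumes that ignoring sites with gaps means removing every alignment column in which any sequence has a gap. The Jukes-Cantor formula used is the standard one for nucleotide data.

```python
import math


def drop_gapped_columns(seqs, gap="-"):
    """Remove every alignment column in which any sequence has a gap.

    Assumption: this is one plausible reading of "ignore all sites
    containing gaps"; the real program may treat gaps differently.
    """
    columns = [col for col in zip(*seqs) if gap not in col]
    if not columns:
        return ["" for _ in seqs]
    return ["".join(chars) for chars in zip(*columns)]


def p_distance(seq_a, seq_b):
    """Uncorrected "p": the fraction of aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        return None  # nothing to compare
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for nucleotides.

    Returns None when the correction is undefined (p >= 0.75),
    because the logarithm's argument is then zero or negative.
    """
    if p is None or p >= 0.75:
        return None
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    aligned = drop_gapped_columns(["ACGTAC-T", "ACGAACGT", "ACGAACGA"])
    for i in range(len(aligned)):
        for j in range(i + 1, len(aligned)):
            p = p_distance(aligned[i], aligned[j])
            print(i, j, p, jukes_cantor(p))
```

The corrected distance is always at least as large as the uncorrected one, because the correction estimates substitutions hidden by multiple changes at the same site.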
1. With the multiple alignment in an active MSA Editor window, select the phylogenetic tree button on the toolbar. Alternatively, choose Analyze | Phylogenetic Analyses | Reconstruct Phylogeny. The Phylogenetic Reconstruction dialog box is displayed.
2. Use the Tree Building Method drop-down menu to choose a method of building trees.
- Neighbor Joining is the default. This method makes no assumptions about rates of divergence in different lineages.
- The UPGMA method assumes that sequences have diverged at a constant rate.
3. To specify the method for resolving ties, select the Options button in the Tree Building Method panel. The Tree Building Method Options dialog box is displayed; select either the Systematic or Random radio button. When there are two nodes or sequences equidistant from the current node in the tree:
- the Systematic method resolves the tie by adding the nodes in the order of the aligned sequences. This is the default method.
- the Random method chooses the order randomly.
4. Use the Distance drop-down menu to choose a method of calculating the pairwise distances between sequences.
For nucleotide sequences the options are:
- Absolute (# differences)
- Uncorrected ("p")
- Jukes-Cantor
- Tajima-Nei
- Kimura 2-parameter
- Tamura-Nei (default)
- LogDet/Paralinear
For protein sequences the options are:
- Absolute (# differences)
- Uncorrected ("p") (default)
- Poisson-correction
5. To set the distance options, select the Options button in the Distance panel.
The Distance Options dialog box is displayed. Options that are not relevant for your chosen Distance method will be grayed out and unavailable.
- In the Gamma correction panel, use the radio buttons to turn gamma correction Off or On, and enter a Gamma Shape Parameter value in the text box if required. Where gamma correction is available, the default value is 0.5.
- In the Transition:Transversion Ratio panel, select the Estimate radio button to have the ratio estimated from your sequence data, or select User defined value and enter the required value in the text box. Where available, the default value is 1.0.
- In the Treatment of gaps panel, select the radio button either to Ignore all sites containing gaps or to Distribute proportionally. The default is to Ignore all sites containing gaps.
Select OK to confirm the settings and close the dialog box.
You can use the Defaults button to restore the default settings.
6. In the Mode panel, select Best tree.
7. By default, all sequences will be included in the tree, and they are all highlighted in the Sequences to Include list. If necessary, you can edit this list:
- To exclude a highlighted sequence from the tree, hold down the Shift key and click on the sequence in the list.
- To include a sequence that is not highlighted, hold down the Shift key and click on the sequence.
- To include all sequences, select the Select All button.
- To exclude all sequences, select the Select None button.
8. Select OK to confirm the settings and perform the analysis.
Alternatively, you can use the Defaults button to restore the default settings, or the Cancel button to close the dialog without retaining the settings. During the analysis an information box is displayed, showing a progress bar for each calculation stage. When it is complete, the resulting tree is displayed in a Tree Viewer window. You can control its appearance interactively (see The Tree Viewer window help topic).
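The second stage, building a tree from a distance matrix, can be sketched in Python for the UPGMA method, which is simpler to show than the default Neighbor Joining method. This is an illustration only, not MacVector's code. The function name upgma and the nested-tuple tree representation are hypothetical, and the tie rule (keep the first closest pair found in input order) is only an assumption about how a systematic tie-break might behave.

```python
def upgma(names, dist):
    """Cluster sequences by UPGMA and return the topology as nested tuples.

    names: list of sequence names, in alignment order.
    dist:  symmetric matrix (list of lists) of pairwise distances.
    """
    clusters = [(name, 1) for name in names]  # (subtree, number of leaves)
    d = [row[:] for row in dist]              # work on a copy

    while len(clusters) > 1:
        # Find the closest pair. The strict "<" keeps the first pair
        # encountered in input order when two pairs are equidistant.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if best is None or d[i][j] < d[best[0]][best[1]]:
                    best = (i, j)
        i, j = best
        (tree_i, n_i), (tree_j, n_j) = clusters[i], clusters[j]

        # Distance from the merged cluster to each remaining cluster is
        # the average over all leaves (weighted by cluster size).
        keep = [k for k in range(len(clusters)) if k not in (i, j)]
        new_row = [
            (d[i][k] * n_i + d[j][k] * n_j) / (n_i + n_j) for k in keep
        ]

        # Rebuild the matrix: remaining clusters first, merged one last.
        d = [[d[a][b] for b in keep] for a in keep]
        for row, value in zip(d, new_row):
            row.append(value)
        d.append(new_row + [0.0])
        clusters = [clusters[k] for k in keep]
        clusters.append(((tree_i, tree_j), n_i + n_j))

    return clusters[0][0]


if __name__ == "__main__":
    names = ["A", "B", "C", "D"]
    dist = [
        [0, 2, 6, 10],
        [2, 0, 6, 10],
        [6, 6, 0, 10],
        [10, 10, 10, 0],
    ]
    print(upgma(names, dist))
```

In this example A and B are joined first because they are the closest pair, then C is added, and D, the most distant sequence, is added last.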
INVALID DISTANCE warning
In some situations, no valid distances can be calculated for a pair of sequences. When this happens, a warning dialog box is displayed. If you continue, then for each distance that cannot be determined, MacVector will assign the largest distance that was calculated in the matrix.
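The substitution described above can be sketched in Python as follows. The function name fill_invalid_distances is hypothetical, and marking an undetermined distance with None is an assumption made for this illustration. One common cause of an undefined distance is a correction formula that breaks down for very divergent pairs; the Jukes-Cantor correction, for example, is undefined when the proportion of differing sites reaches 0.75.

```python
def fill_invalid_distances(matrix):
    """Replace undetermined distances with the largest valid distance.

    matrix: list of lists; None marks a distance that could not be
    calculated (an assumption of this sketch).
    Returns a new matrix and leaves the input unchanged.
    """
    valid = [value for row in matrix for value in row if value is not None]
    if not valid:
        raise ValueError("no valid distances available to substitute")
    largest = max(valid)
    return [
        [largest if value is None else value for value in row]
        for row in matrix
    ]


if __name__ == "__main__":
    distances = [
        [0.0, 0.3, None],
        [0.3, 0.0, 0.9],
        [None, 0.9, 0.0],
    ]
    for row in fill_invalid_distances(distances):
        print(row)
```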
1. With the tree displayed, select the recalculate tree button in the Tree Viewer window.
The Phylogenetic Reconstruction dialog box is displayed.
2. In the Mode panel, select the Bootstrap radio button.
The Number of Replications text box becomes active.
3. Type in the required Number of Replications.
The default value is 1000. Reducing this number gives faster but less reliable results.
4. Select OK to perform the analysis.
After the analysis is complete, the consensus bootstrap tree is displayed in the Tree Viewer window. The consensus bootstrap tree only includes nodes that appear consistently when the data is repeatedly resampled. Nodes that occur very frequently in the resampled data are labeled with their percentage occurrence. You can adjust the cutoff points for displaying and labeling nodes (see Setting the tree display options). The topology of the bootstrap consensus tree will not always match that of the best tree. For more details of the method and its interpretation, see Chapter 17, "Understanding Sequence Comparisons", in the MacVector User Guide.
5. To display the original tree, select the recalculate tree button and repeat the calculation using Best tree mode.
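The resampling behind a bootstrap analysis can be sketched in Python. This is a simplified illustration, not MacVector's procedure: a full analysis rebuilds a tree from every resampled data set and then forms a consensus tree, whereas this sketch only tracks how often one chosen pair of sequences remains the closest pair. The function names (resample_columns, closest_pair, pair_support) and the seed parameter are hypothetical.

```python
import random


def resample_columns(seqs, rng):
    """Draw alignment columns with replacement to build a new data set
    of the same length as the original alignment."""
    length = len(seqs[0])
    picks = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in picks) for seq in seqs]


def closest_pair(seqs):
    """Return the indices (i, j) of the pair with the fewest differences.
    On ties, the first pair in input order is kept."""
    best_pair, best_diff = None, None
    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            diff = sum(a != b for a, b in zip(seqs[i], seqs[j]))
            if best_diff is None or diff < best_diff:
                best_pair, best_diff = (i, j), diff
    return best_pair


def pair_support(seqs, pair, replications=1000, seed=0):
    """Percentage of resampled data sets in which `pair` is the closest
    pair. This is a stand-in for node support, not a consensus tree."""
    rng = random.Random(seed)  # fixed seed makes the run repeatable
    hits = sum(
        closest_pair(resample_columns(seqs, rng)) == pair
        for _ in range(replications)
    )
    return 100.0 * hits / replications


if __name__ == "__main__":
    alignment = ["ACGTACGTAC", "ACGTACGTAA", "ACGAACTTAC", "TCGAACTTGC"]
    print(pair_support(alignment, (0, 1), replications=200))
```

Fewer replications run faster, but the percentage varies more from run to run, which is why reducing the number of replications gives less reliable results.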
When a phylogenetic tree has been calculated, it can be saved along with its associated multiple alignment file, and retrieved when the file is opened.
TO DISPLAY an existing phylogenetic tree
1. With the required alignment in an active MSA Editor window, select the phylogenetic tree button on the toolbar.
The tree is displayed in a Tree Viewer window.
The ClustalW and T-Coffee alignment algorithms each generate a guide tree, which is then used to determine the order in which sequences are aligned. This may not be a true phylogenetic tree, because it is based on local pairwise alignments. However, the tree's appearance may sometimes help you to judge the reliability of an alignment. Guide trees are saved with their associated multiple alignment file, and you can view them when the file is retrieved.
1. With the MSA in an active MSA Editor window, choose Analyze | Create Alignment.
The Alignment Views dialog box is displayed.
2. In the Picture Display panel, select the Guide Tree checkbox, and select OK.
The guide tree is displayed in a Tree Viewer window.