ClustalW, Muscle, and T-Coffee are progressive alignment algorithms. Progressive alignments generally build a guide tree that represents the pairwise relationships among all the sequences in the alignment. A multiple sequence alignment is then built sequentially using the tree as a construction guide.
You can display this guide tree for ClustalW and T-Coffee alignments. You cannot display a guide tree for a Muscle alignment.
All three algorithms are integrated into the MSA editor. This means you can try all three algorithms on the same alignment and compare the results.
ClustalW aligns sequences in two stages. The first, pairwise, stage compares each sequence individually with every other sequence. The Pairwise Alignment parameters control the speed and sensitivity of this stage. The second, multiple alignment, stage progressively merges the sequences, starting with the pair that scored highest in the pairwise stage. The Multiple Alignment parameters control this stage.
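The two stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `pairwise_scores` and `first_pair_to_merge` are hypothetical, and `difflib.SequenceMatcher.ratio()` stands in for the real pairwise alignment score, which ClustalW computes differently.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_scores(seqs):
    """Stage 1: compare each sequence with every other sequence.

    ASSUMPTION: SequenceMatcher.ratio() is a stand-in similarity measure,
    not the score that ClustalW itself computes.
    """
    return {
        (a, b): SequenceMatcher(None, seqs[a], seqs[b], autojunk=False).ratio()
        for a, b in combinations(sorted(seqs), 2)
    }


def first_pair_to_merge(scores):
    """Stage 2 starts from the pair that scored highest in stage 1."""
    return max(scores, key=scores.get)


# Hypothetical example sequences.
seqs = {
    "seqA": "ACGTACGTAC",
    "seqB": "ACGTACGTAA",
    "seqC": "TTTTGGGGCC",
}
scores = pairwise_scores(seqs)
best = first_pair_to_merge(scores)
print(best)
```

With these example sequences, seqA and seqB differ by a single residue, so they form the pair that the multiple alignment stage would merge first.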
The parameters that can be set within each panel vary depending on the type of sequences being aligned and the alignment speed that is selected. To perform your alignment, set parameters in each panel according to your sequence type and alignment requirements.
The ClustalW Alignment dialog box is displayed.
- Slow means that the initial pairwise alignments are performed using a full dynamic programming algorithm.
- Fast uses the Wilbur & Lipman (1983) method.
- Enter a value between 0 and 100 in the Open Gap Penalty text box. This is the score that is subtracted when a gap is inserted in an alignment. Increasing the gap opening penalty makes gaps less frequent.
- Enter a value between 0 and 10 in the Extend Gap Penalty text box. This is the penalty for extending the gap by 1 residue. Increasing the extend gap penalty makes gaps shorter. Terminal gaps are not penalized.
- Choose a value from the Ktuple drop-down menu. This is the number of consecutive residues that must match exactly before MacVector will attempt to score the matching region. Increasing the value speeds the alignment, but if the value is too large, significant homology between sequences may be missed. Throughout MacVector, Ktuple size is also referred to as hash size or word size.
- Enter a value between 1 and 500 in the Gap Penalty text box. This is the penalty for each gap. The parameter has little effect on speed or sensitivity unless extreme values are entered.
- Enter a value between 1 and 50 in the Top Diagonals text box. This value determines the number of Ktuple matches on each diagonal in an imaginary dot-matrix plot. A large value increases sensitivity; a small value makes the alignment faster.
- Enter a value between 1 and 50 in the Window Size text box. This value specifies the window size around each top diagonal. Diagonals that fall inside this window are used in the alignment. Increasing the window size results in a more sensitive, but slower, alignment. Decreasing the window size increases the speed, but may result in small regions of homology being missed.
- weighted to give higher weightings to transitions (A <-> G, T <-> C) compared to transversions (A <-> T, A <-> C, G <-> T, G <-> C).
- unweighted to treat transitions and transversions equally.
NOTE: This option is only present for nucleic acid alignments.
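To make the pairwise parameters above more concrete, here is a minimal Python sketch of three of the ideas: an open/extend gap cost, weighted versus unweighted nucleotide scoring, and Ktuple matches counted per diagonal of a dot-matrix plot. All function names and the default score values are hypothetical, and the gap formula (open penalty plus extend penalty for each residue) is one common convention; the exact formulas MacVector uses are not stated here.

```python
from collections import Counter

# Transitions are purine<->purine or pyrimidine<->pyrimidine changes.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def gap_cost(length, open_penalty, extend_penalty):
    """Score subtracted for one gap of `length` residues.

    ASSUMPTION: cost = open + extend * length (a common affine convention).
    """
    if length <= 0:
        return 0.0
    return open_penalty + extend_penalty * length


def substitution_score(a, b, weighted=True,
                       match=1.0, transition=0.5, transversion=0.0):
    """Score one aligned nucleotide pair.

    ASSUMPTION: the numeric defaults are illustrative only. When `weighted`
    is False, transitions and transversions are treated equally.
    """
    a, b = a.upper(), b.upper()
    if a == b:
        return match
    if weighted and frozenset((a, b)) in TRANSITIONS:
        return transition
    return transversion


def ktuple_hits_per_diagonal(seq1, seq2, k):
    """Count exact k-residue matches on each diagonal (i - j) of a dot plot."""
    positions = {}
    for j in range(len(seq2) - k + 1):
        positions.setdefault(seq2[j:j + k], []).append(j)
    hits = Counter()
    for i in range(len(seq1) - k + 1):
        for j in positions.get(seq1[i:i + k], ()):
            hits[i - j] += 1
    return hits


hits_k2 = ktuple_hits_per_diagonal("AACGT", "ACGT", 2)
hits_k5 = ktuple_hits_per_diagonal("AACGT", "ACGT", 5)
```

In this sketch a larger `k` produces fewer hits (here none at all for `k = 5`), which illustrates why a Ktuple value that is too large can miss real similarity. The diagonals with the most hits would be the candidates for a "top diagonals" selection.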
If you are aligning protein sequences, you have a number of additional parameters to set.
- put the mouse pointer over the arrow at the left of the sequence you want to move
- hold the mouse button down and drag the sequence to the required position
- release the mouse button
A dialog box is displayed, informing you of the progress of the alignment. When the alignment is complete, the ClustalW Alignment Results dialog box and the Multiple Sequence Alignment Editor window are displayed.