To run this analysis, at least two protein sequences must be open, a window containing a protein sequence must be the active window, and a protein scoring matrix file must be open or available to the program.
The output is a two-dimensional plot, with the residues of one sequence along the X-axis and the residues of the comparison sequence along the Y-axis.
1. Choose Analyze | Pustell Protein Matrix.
The Protein Matrix Analysis dialog box is displayed.
2. In the left-hand scrolling list, select a sequence to display along the X-axis.
3. In the right-hand scrolling list, select a sequence to display along the Y-axis.
The name that is highlighted in each list is the sequence assigned to that axis.
4. You can limit the analysis to a region of each of the sequences by typing in the numbers that bracket the region in the X-Region and Y-Region text boxes, or by selecting a region from the features table drop-down menu that appears to the right of the text boxes.
If you are using a very long sequence for the comparison, the procedure will need less memory if you display the shorter sequence on the X-axis.
5. Select the Scoring Matrix button to choose a scoring matrix file.
A standard dialog box appears, enabling you to search for and select the file to use.
6. Enter a window size in the window size text box.
Whenever MacVector finds an exact match, it examines the segment (or window) of the aligned sequence that surrounds the matching region. The length of the segment is the value typed in for window size.
7. Enter a minimum score in the min. % score text box.
Whenever MacVector finds an exact match, it computes a total score for the window using the match / mismatch scores in the scoring matrix. It then determines a percent score by dividing the window's score by the score that would occur if all of the residues in the window matched. If this percent score equals or exceeds the value of min. % score, the window is saved.
8. Choose a value from the hash value drop-down menu.
The hash value is a measure of how long an exact match between two sequences must be before MacVector will attempt to score and align that matching region. A hash value of 1 is the most sensitive; a hash value of 2 is the least sensitive.
TIP: For most comparisons, start with a hash value of 2, because it is unusual for two sequences to possess significant similarity without having regions of that size that match exactly.
9. Select OK to perform the analysis.
Alternatively, select Defaults to restore the default settings, or Cancel to close the dialog box without performing the analysis.
When the analysis is complete, the Protein Matrix Analysis display dialog box appears. This dialog box is described in the Displaying matrix results help topic.
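
The way the window size, min. % score, and hash value settings work together can be sketched in a few lines of Python. This is only an illustration of the filtering described in steps 6 to 8, not MacVector's actual implementation. The function names, the toy match / mismatch scores (5 and -1, standing in for a real scoring matrix file), the choice to centre the window on the start of the exact match, and the use of the X-axis residues to compute the "all residues match" score are all assumptions made for the example.

```python
def make_scorer(match=5, mismatch=-1):
    """Toy scoring function (hypothetical values); a real run reads a matrix file."""
    def score(a, b):
        return match if a == b else mismatch
    return score


def index_words(seq, k):
    """Map every k-residue word in seq to the positions where it starts."""
    table = {}
    for i in range(len(seq) - k + 1):
        table.setdefault(seq[i:i + k], []).append(i)
    return table


def window_percent(x, y, i, j, window, score):
    """Percent score of the window around the exact match at x[i], y[j]."""
    half = window // 2
    # Clip the window so it stays inside both sequences (assumption).
    back = min(half, i, j)
    fwd = min(window - half, len(x) - i, len(y) - j)
    xs = x[i - back:i + fwd]
    ys = y[j - back:j + fwd]
    total = sum(score(a, b) for a, b in zip(xs, ys))
    # Score the window would get if every residue matched (assumption:
    # computed from the X-axis residues scored against themselves).
    best = sum(score(a, a) for a in xs)
    return 100.0 * total / best


def matrix_points(x, y, window=8, min_percent=60.0, hash_value=2, score=None):
    """Return (x_pos, y_pos, percent) for every window that is saved."""
    score = score or make_scorer()
    words = index_words(y, hash_value)
    points = []
    for i in range(len(x) - hash_value + 1):
        # Only exact matches of length hash_value are examined further.
        for j in words.get(x[i:i + hash_value], []):
            pct = window_percent(x, y, i, j, window, score)
            if pct >= min_percent:
                points.append((i, j, pct))
    return points
```

With this sketch, comparing the sequences ACD and AEF at a min. % score of 20 saves one window when the hash value is 1 (the single matching A is enough to trigger scoring) and none when the hash value is 2 (no two-residue exact match exists), which is the sensitivity trade-off described in step 8.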