To run this analysis, at least one nucleic acid sequence file and one protein sequence file must be open, and a window containing a sequence must be the active window. A protein scoring matrix file must be open or available on disk.
When performing this analysis, MacVector first translates the nucleic acid sequence in the three reading frames of a single DNA strand (if the ++ or +- strand option is chosen) or in all six reading frames (if both strands are chosen). The comparison is then performed on the resulting amino acid sequences. The output is a two-dimensional plot, with the residues of the protein sequence along the X-axis and the residues of the translated DNA sequence along the Y-axis. The aligned sequence display shows the aligned sequences from every reading frame that was analyzed.
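The translation step can be illustrated with a short Python sketch. This is not MacVector's code; it assumes the standard genetic code, and the function names (`reverse_complement`, `translate`, `reading_frames`) and the frame labels (`+1` to `-3`) are hypothetical names chosen for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the "standard" code is selected; other genetic codes need another table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement strand of a DNA sequence."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, offset):
    """Translate one reading frame; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(offset, len(dna) - 2, 3)
    )


def reading_frames(dna, strand="both"):
    """Translate three frames ('++' or '+-') or all six frames ('both')."""
    frames = {}
    if strand in ("++", "both"):
        for k in range(3):
            frames[f"+{k + 1}"] = translate(dna, k)
    if strand in ("+-", "both"):
        rc = reverse_complement(dna)
        for k in range(3):
            frames[f"-{k + 1}"] = translate(rc, k)
    return frames


frames = reading_frames("ATGGCCTAA", strand="both")
for name, protein in frames.items():
    print(name, protein)
```

Each translated frame is then compared against the protein sequence separately, which is why choosing both strands produces six comparisons instead of three.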
- Choose Analyze | Pustell Protein - DNA.
- In the left-hand scrolling list, select a protein sequence to display along the X-axis. Only protein sequences appear in this list.
- In the right-hand scrolling list, select a DNA sequence to display along the Y-axis. Only DNA sequences appear in this list.
- You can limit the analysis to a region of each of the sequences by typing in the numbers that bracket the region in the X-Region and Y-Region text boxes, or by selecting a region from the features table drop-down menu that appears to the right of the text boxes.
- Select the Scoring Matrix button to choose a scoring matrix file.
- Enter a window size in the window size text box. Whenever MacVector finds an exact match, it examines the segment (or window) of the aligned sequence that surrounds the matching region. The length of the segment is the value typed in for window size.
- Enter a minimum score in the min. % score text box. Whenever MacVector finds an exact match, it computes a total score for the window using the match/mismatch scores in the scoring matrix. It then determines a percent score by dividing the window's score by the score that would occur if all of the residues in the window matched. If this percent score equals or exceeds the value of min. % score, the window is saved.
- Choose a value from the hash value drop-down menu. The hash value is the length an exact match between the two sequences must reach before MacVector will attempt to score and align that matching region. A hash value of 1 is the most sensitive; a hash value of 2 is the least sensitive.
- Use the strand drop-down menu to choose whether to use the plus strand of the DNA sequence (++), the reverse complement strand (+-), or both. For the initial analysis, we recommend that you choose both.
- Choose the genetic code for the translation of the DNA from the genetic code drop-down menu.
- Select OK to perform the analysis.
Alternatively, select Defaults to restore the default settings, or Cancel to close the dialog box without performing the analysis.
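The window size and min. % score settings described in the steps above can be pictured with the following Python sketch. It is an illustration only, not MacVector's implementation: `toy_score` is a hypothetical stand-in for a real scoring matrix file, and the sketch assumes that the "all residues match" score is the sum of each X-axis residue scored against itself.

```python
def toy_score(a, b):
    """Hypothetical scoring matrix lookup: +5 for a match, -1 for a mismatch."""
    return 5 if a == b else -1


def window_percent_score(seq_x, seq_y, x, y, window, score=toy_score):
    """Percent score of a window centered on the match at (x, y).

    The window runs along the diagonal; positions that fall off either
    sequence are skipped (an assumption made for this sketch).
    """
    start = -(window // 2)
    total = 0
    perfect = 0
    for d in range(start, start + window):
        i, j = x + d, y + d
        if 0 <= i < len(seq_x) and 0 <= j < len(seq_y):
            total += score(seq_x[i], seq_y[j])
            perfect += score(seq_x[i], seq_x[i])  # score if this pair matched
    return 100.0 * total / perfect if perfect > 0 else 0.0


min_percent_score = 60  # value typed into the min. % score text box
pct = window_percent_score("MKTAY", "MKSAY", x=2, y=2, window=5)
print(pct, "saved" if pct >= min_percent_score else "discarded")
```

With a real scoring matrix, mismatches between similar amino acids can still score positively, so a window may pass the threshold even when few residues are identical.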
TIP: For most comparisons, start with a hash value of 2, because it is unusual for two sequences to possess significant similarity without having regions of that size that match exactly.
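To see why a smaller hash value is more sensitive, consider this Python sketch of exact-match seeding. It assumes a simple lookup table of all substrings of the hash length; the function name `exact_matches` is hypothetical, and MacVector's actual search may differ.

```python
from collections import defaultdict


def exact_matches(seq_x, seq_y, hash_value):
    """Return (x, y) start positions where hash_value residues match exactly."""
    # Index every substring of length hash_value in the X-axis sequence.
    index = defaultdict(list)
    for i in range(len(seq_x) - hash_value + 1):
        index[seq_x[i:i + hash_value]].append(i)

    # Look up every substring of the Y-axis sequence in that index.
    hits = []
    for j in range(len(seq_y) - hash_value + 1):
        for i in index.get(seq_y[j:j + hash_value], ()):
            hits.append((i, j))
    return hits


protein = "MKTAY"
translated_frame = "AYMK"
print(len(exact_matches(protein, translated_frame, 1)))  # seeds with hash value 1
print(len(exact_matches(protein, translated_frame, 2)))  # seeds with hash value 2
```

In this toy case a hash value of 1 yields four seed positions and a hash value of 2 yields two, so the smaller value sends more regions on to window scoring at the cost of more computation.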
When the analysis is complete, the Protein & DNA Matrix Analysis display dialog box opens.
Related Topics.
Analyze Menus
Pustell matrix
Performing a matrix analysis
Displaying matrix results