Open reading frames

Open reading frame (ORF) analysis enables you to find:

- open reading frames using user-designated start and stop codons

- regions of a nucleic acid that are likely to code for protein according to Fickett's TESTCODE algorithm.

To run the analysis, a nucleic acid sequence window must be the active window. If you want to define your own start and stop codons, you must edit the genetic code that you are using.

FINDING OPEN reading frames and coding regions

The ORF analysis and Fickett's coding region algorithm are initiated on the same dialog, and can be performed as a single analysis.

TO FIND an open reading frame

  1. Choose Analyze | Open Reading Frames.
  2. Select the start/stop codons check box.
  3. Specify the minimum amino acid length that you will consider to be a valid open reading frame in the min. # of amino acids text box. TIP: The majority of proteins are larger than 75 amino acids. However, eukaryotic exons can be as small as 20 to 25 amino acids.
  4. Select the check boxes that control whether the sequence ends are treated as start and stop codons. The options are: 5' ends are starts, 3' ends are stops, and codons after stops are starts. Normally you would want all three boxes checked, because MacVector defines an open reading frame as a region flanked by a start codon and a stop codon. For example, if your sequence fragment lies in the middle of a coding region and contains no start or stop codons, MacVector will not report any open reading frames unless you have checked both 5' ends are starts and 3' ends are stops. The codons after stops are starts option is recommended for eukaryotic sequences, where introns are common. Where there are internal coding exons, it finds the longest possible ORFs, ensuring that the entire exon is captured.
  5. Select the genetic code that contains the start and stop codons you want to use from the genetic code drop-down menu.
  6. You can restrict the search to a certain region of the sequence by typing in the sequence numbers that bracket the region in the Region panel, or by selecting a region from the features table pop-up menu at the right of the text boxes.
  7. You can restrict the analysis to one or both strands by selecting the appropriate item from the strand drop-down menu.
  8. Select OK to perform the analysis.

After the analysis is complete, the Open Reading Frame Analysis display dialog box appears.
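The effect of the three sequence-end options can be illustrated with a minimal Python sketch. This is not MacVector's implementation: the function name find_orfs, its parameters, the start and stop codon sets, and the reading of "codons after stops are starts" as "the codon immediately following a stop opens a new ORF" are all assumptions made for illustration. The sketch scans the forward strand only.

```python
# Illustrative sketch only; not MacVector's implementation.
STARTS = {"ATG"}                  # assumed start codons
STOPS = {"TAA", "TAG", "TGA"}     # assumed stop codons (standard code)


def find_orfs(seq, min_aa=75, ends_are_starts=True, ends_are_stops=True,
              after_stop_is_start=True):
    """Return (start, end, frame) tuples for the forward strand.

    Coordinates are 0-based and end-exclusive; the stop codon itself
    is not included in the reported span.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        # An ORF may already be open at the 5' end if that option is on.
        open_at = frame if ends_are_starts else None
        last = frame  # end of the last complete codon seen in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            last = i + 3
            if codon in STOPS:
                if open_at is not None and (i - open_at) // 3 >= min_aa:
                    orfs.append((open_at, i, frame))
                # Optionally treat the codon after a stop as a start.
                open_at = i + 3 if after_stop_is_start else None
            elif open_at is None and codon in STARTS:
                open_at = i
        # An ORF still open at the 3' end counts only if ends are stops.
        if ends_are_stops and open_at is not None and last > open_at:
            if (last - open_at) // 3 >= min_aa:
                orfs.append((open_at, last, frame))
    return orfs
```

With a fragment such as "AAACCCGGGAAACCC", which contains no start or stop codon in any frame, this sketch reports nothing unless both end options are on, which mirrors the behavior described in step 4.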

TO FIND a coding region using Fickett's method

1. Choose Analyze | Open Reading Frames.

The Open Reading Frame Analysis dialog box is displayed.

2. Select the Fickett's method check box.

3. Type in the min. DNA length.

This should be at least 200 for best results.

4. Select a value for the coding probability from the min. coding probability drop-down menu.

Fickett's method assigns any region that has a coding probability below 0.29 as "noncoding" and any region with a coding probability greater than 0.92 as "coding." Regions with intermediate probabilities are assigned "no opinion." You can include these borderline regions with the "coding" regions by setting the minimum coding probability to successively lower values.

Note that Fickett's method is independent of start and stop codons, and it cannot distinguish between frames. It simply reports regions where the cutoff value is exceeded.

5. You can restrict the search to a certain region of the sequence by typing in the sequence numbers that bracket the region in the Region panel, or by selecting a region from the features table drop-down menu at the right of the text boxes.

6. You can restrict the analysis to one or both strands by selecting the appropriate item from the strand drop-down menu.

7. Select OK to perform the analysis.

NOTE: You can find open reading frames at the same time (see the procedure above).
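The thresholds described in step 4 can be sketched in a few lines of Python. The sketch does not compute the TESTCODE coding probability itself; it only shows how a probability that has already been calculated maps to a verdict, and how lowering the cutoff admits "no opinion" regions. The function names fickett_verdict and passes_cutoff are hypothetical and do not correspond to anything in MacVector.

```python
# Illustrative sketch only; the probability p is assumed to have been
# computed elsewhere by Fickett's TESTCODE algorithm.

def fickett_verdict(p):
    """Map a coding probability to Fickett's three-way verdict."""
    if p < 0.29:
        return "noncoding"
    if p > 0.92:
        return "coding"
    return "no opinion"


def passes_cutoff(p, min_coding_probability=0.92):
    """True if a region would be reported at the chosen cutoff."""
    return p > min_coding_probability


# A borderline region is skipped at the default cutoff but is
# reported once the cutoff is lowered.
p = 0.50
print(fickett_verdict(p), passes_cutoff(p), passes_cutoff(p, 0.40))
```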

After the analysis is complete, the Open Reading Frame Analysis display dialog box is displayed. To use this dialog box, see the following procedure.

DISPLAYING ORF and coding region search results

The results of a reading frame analysis or coding region search can be displayed in several ways:

- as a text listing

- as a linear map

- as an annotated sequence.

If you have performed both types of analysis, the results appear together in the same display window.

Each display type can be saved to disk or printed when its window is active.

1. The Open Reading Frame Analysis display dialog box is displayed on completion of each analysis. To display this dialog box at other times, choose Analyze | Open Reading Frames when any open reading frame display result window is active.

2. Select List ORFs by to display a text window that contains a list of the open reading frames / coding regions found. The drop-down menu at the end of the line enables you to specify whether the list should be ordered according to position in the sequence or according to the lengths of the open reading frames / coding regions.

3. Select ORF map to display a linear map of the positions of the ORFs / coding regions.

4. Select ORF-annotated sequence to display an annotated sequence window that shows the nucleic acid sequence and the translated sequences of any ORFs or coding regions found by the analysis.

5. Select OK to display the requested results.
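The two orderings offered by the List ORFs by option can be pictured with a short Python sketch. The (start, end) tuple layout, the function name list_orfs, and the choice to put the longest region first when ordering by length are assumptions for illustration; MacVector's actual sort direction is not stated here.

```python
# Illustrative sketch only; regions are hypothetical (start, end) pairs.

def list_orfs(regions, order_by="position"):
    """Return regions ordered by start position or by length."""
    if order_by == "position":
        return sorted(regions, key=lambda r: r[0])
    if order_by == "length":
        # Assumption: longest region first.
        return sorted(regions, key=lambda r: r[1] - r[0], reverse=True)
    raise ValueError("order_by must be 'position' or 'length'")


regions = [(400, 700), (10, 1210), (250, 340)]
print(list_orfs(regions, "position"))
print(list_orfs(regions, "length"))
```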

Related Topics.

Analyze Menus