The Contig Editor is used both for editing contigs in Assembler and as the main editor in Align to Reference. However, the Editor's interface changes depending on which function it is being used for. This page deals with the Contig Editor as used in the Align to Reference context. Please see Contig Editor - Assembler to see how it is used for editing contigs produced by Assembler.
As well as aligning sequencing reads against a template sequence, Align to Reference may also be used for aligning cDNAs against a genomic template. Where no trace file data is present in any of the sequences, the lower panel will be hidden.
The Align to Reference window is highly interactive. All of the sequence residues can be edited, although there are some restrictions that simplify the use of the editor for sequence confirmation and SNP identification purposes. As well as viewing the sequence in the main view of the window, this editor has multiple views in the same way that the Single Sequence editor and the MSA Editor have. These are accessible using the various tabs below the toolbar. The Map View will show the Features table of the main sequence. It will also show aligned and unaligned reads. This view is connected to the sequence view: with two Replica windows open, clicking a feature will move to and show the selected sequence in the sequence window. To select a single feature of a multi-segmented feature, hold down the OPTION key whilst clicking on that feature.
There are two ways to open an Align to Reference window:
- with a nucleic acid window frontmost, choose ANALYZE - ALIGN TO REFERENCE
- open a previously saved alignment
The window is organized into two distinct panes:
- the upper pane has the reference sequence displayed across the top. Immediately under that is the consensus sequence of the aligned sample sequences. If no sequences align at a position, the consensus will be blank. The sample sequences themselves are displayed below the consensus, with each sequence on a separate line. Unaligned sequences are displayed in italics.
- the lower pane is used to display the sequences that have been aligned. If you have chromatogram sequences, then it will also show the traces for those sample sequences that came from automated sequencing machines. Only those sequences that overlap the current selection position are displayed in the pane.
As with the Single Sequence editor there are a number of different views for presenting the alignment. These are accessible from the tabs along the top of each window below the toolbar or by clicking the replica button:
- The Map view tab displays an annotated view of the features table along the length of the sequence. The map can be edited in many ways to give a tailored map for on-screen or printed presentation. In addition to any features inherited from the reference sequence, the sample sequences are shown in the map as "Read" features.
- The Features tab button opens an editable window that contains the Features table for the reference sequence. This is exactly equivalent to the Features table in the normal single sequence editor. Changes you make here will be saved in the assembly, and also in the reference sequence if you choose to save it in MacVector single nucleic acid format. Sample sequences added to the assembly are shown here as "Read" (assembled) or "Read*" (unassembled) features.
- The Annotations tab is exactly equivalent to the Annotations tab window in the normal single sequence editor. Changes you make here will be saved in the assembly, and also in the reference sequence if you choose to save it in MacVector single nucleic acid format.
The toolbar displays various buttons depending on which view is shown:
- A Sequence Type indicator. This is simply an icon to indicate that the assembly represents a DNA sequence. It cannot be clicked.
- The blocking control is a slider that controls the horizontal scale of the displayed chromatogram. Drag the control with the mouse to adjust the horizontal resolution of the trace display. By default, MacVector uses a scaling of one pixel per sample time point.
- The plus button lets you add trace files or MacVector sequence files to the assembly. This is a shortcut to the main menu item "Edit | Add Sequences from File". Newly added sequences will appear in italics in the upper pane, indicating they have not been assembled.
- Clicking the annotated sequence button opens the Annotated Sequence window, which contains the sequence text with features marked along its length. The format is controlled by the Format annotated display in the MacVector Preferences. The information in the window cannot be changed, but it can be highlighted and copied to the Clipboard. The residues of the aligned sample sequences are NOT displayed in this window.
- Clicking on the circular arrow button opens the assembly parameters dialog. Use this to automatically assemble the sample sequences with the reference sequence. See the "Automatic assembly" topic for more details. This button is a shortcut to the "Analyze | Sequence Confirmation..." main menu item.
- The button with the small "AGCT" characters is the "show dots" toggle. Clicking on this button toggles the display mode so that sample sequence and consensus residues that match the reference sequence are displayed as dots. Clicking a second time turns the "show dots" display off. Use this to quickly scan an assembly visually for mismatched residues.
- Clicking on the find mismatches button (represented by a small pair of binoculars) opens a floating Find Mismatches window. You can use this to quickly find and jump to mismatches between the reference and consensus sequences.
- Click on the preferences button to open the Assembly Preferences dialog. There are only two parameters that can be changed:
- The Threshold setting controls how the consensus sequence is calculated. The algorithm simply counts up each of the overlapping sample residues at each position (taking the IUPAC ambiguity code into account) and assigns the consensus only if one of the bases exceeds the threshold percentage. Otherwise, an appropriate ambiguity code is assigned. Unassembled and masked sequences are not counted in the consensus calculation.
- The Allow ambiguous matches checkbox determines if IUPAC ambiguity codes should be used when the "show dots" display toggle is active.
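
The Threshold consensus calculation and the "show dots" display described above can be sketched in Python as follows. This is a minimal illustration and not MacVector's actual implementation. The function names (consensus_residue, matches, show_dots) are hypothetical, and two details that this page does not specify are assumptions: an ambiguous residue is assumed to split its single vote equally among the bases it can represent, and an "ambiguous match" is assumed to mean that the two codes share at least one possible base.

```python
from collections import defaultdict

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: a set of bases -> the one code that covers exactly that set.
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC.items()}


def consensus_residue(column, threshold=50.0):
    """Return the consensus for one alignment column.

    `column` holds the residues of the assembled, unmasked reads that
    overlap the position. An empty column gives a blank consensus.
    """
    weights = defaultdict(float)
    for residue in column:
        bases = IUPAC.get(residue.upper())
        if bases is None:
            continue  # skip gaps and anything that is not an IUPAC code
        for base in bases:
            # Assumption: an ambiguity code splits its vote equally.
            weights[base] += 1.0 / len(bases)
    total = sum(weights.values())
    if total == 0:
        return " "
    best = max(weights, key=weights.get)
    if 100.0 * weights[best] / total > threshold:
        return best
    # Assumption: otherwise use the code covering every base observed.
    return CODE_FOR[frozenset(weights)]


def matches(residue, reference, allow_ambiguous=False):
    """Decide whether a residue counts as matching the reference."""
    a, b = residue.upper(), reference.upper()
    if a == b:
        return True
    if not allow_ambiguous:
        return False
    # Assumption: codes match if they share at least one possible base.
    return bool(set(IUPAC.get(a, "")) & set(IUPAC.get(b, "")))


def show_dots(read, reference, allow_ambiguous=False):
    """Replace residues that match the reference with dots."""
    return "".join(
        "." if matches(r, ref, allow_ambiguous) else r
        for r, ref in zip(read, reference)
    )
```

For example, with a 50% threshold a column of three A residues and one G gives a consensus of A, while a column of one A and one G gives the ambiguity code R because neither base exceeds the threshold. Calling show_dots("ACGR", "ACTA") leaves both the G and the R visible, whereas passing allow_ambiguous=True hides the R because it can represent A.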