The Contig Editor: Editing Align To Reference Alignments

By default, all editing uses "overwrite" mode rather than "insertion" mode. However, if you hold down the OPTION key, gaps or residues are inserted instead.

- The numbering over the reference sequence is the original numbering of the sequence. Gaps are not included in the numbering or in the locations shown in the Features Table, so these remain directly comparable to the original sequence.

- You can edit the reference sequence at any time. To delete a base, simply type the <space> character. Visually, this converts it into a gap character but behind the scenes the base is physically deleted from the reference. If all of the sequences that overlap a position have gaps, the position will be "closed up" and the shared gaps deleted.

- You can edit any of the sample sequences (in either the upper or lower pane) in a similar way. As you type, the consensus sequence is updated in real time. If you delete the last non-gap character at a position, the gap will "close up" as with the reference sequence.

- You can copy any selection of more than one residue. There are important differences in the way the reference sequence is copied compared to the consensus or sample sequences:

  - When the reference sequence has the primary highlight, the reference sequence and all overlapping features are copied to the clipboard. In this case, all gaps are stripped out so that you see the "real" sequence if you then paste the copied sequence into another window.

  - When a sample sequence or the consensus is copied to the clipboard, only the actual sequence characters are copied. In addition, all gaps are retained in the sequence. The reason for this is...

- If you want to change the reference sequence to match the consensus sequence, select the region of the consensus that you want to copy. Choose Copy from the Edit menu, carefully place the cursor on the first residue in the reference sequence that you wish to replace, then choose Paste. The first time you do this you will get a warning message, but when you click "Paste Anyway" the reference sequence residues will be replaced by the copied consensus sequence.

- To remove a sequence from the alignment, click on the title button to select the entire sequence and press the <delete> key.

- Most actions are "undoable" to a single level. Note that if you are editing very large assemblies, certain actions (e.g. deleting one or more sequences) may be sluggish while a copy is made of the assembly to allow a later undo.

- To delete a residue, overwrite it with either a <space> or a "-" character. The consensus will be updated dynamically. If all overlapping sequences have a gap at that location, the gap will be closed up.

- Deletions at either end of a sequence are considered "true" deletions: the residues are actually deleted rather than replaced by gaps.
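
The numbering, gap "close up" and copy behavior described above can be illustrated with a small Python sketch. This is not MacVector code. It assumes the alignment is held as equal-length strings with "-" as the gap character, and the function names `ungapped_position`, `close_up_shared_gaps` and `strip_gaps` are hypothetical, chosen only for this example.

```python
GAP = "-"  # assumed gap character for this sketch


def ungapped_position(aligned_ref, column):
    """Return the 1-based position in the original (ungapped) reference
    for a 0-based alignment column, or None if that column is a gap."""
    if aligned_ref[column] == GAP:
        return None
    # Count only real residues up to and including this column,
    # so gaps never advance the numbering.
    return sum(1 for ch in aligned_ref[: column + 1] if ch != GAP)


def close_up_shared_gaps(rows):
    """Remove every column in which all rows hold a gap.

    Assumes all rows have the same length and overlap every column."""
    width = len(rows[0])
    keep = [i for i in range(width) if any(row[i] != GAP for row in rows)]
    return ["".join(row[i] for i in keep) for row in rows]


def strip_gaps(seq):
    """Mimic copying the reference: gaps are removed from the copy."""
    return seq.replace(GAP, "")


if __name__ == "__main__":
    rows = ["AC-GT", "AC-GA", "A--GT"]
    print(ungapped_position(rows[0], 3))
    print(close_up_shared_gaps(rows))
    print(strip_gaps(rows[2]))
```

In this sketch, column 2 is a gap in every row, so it is the only column removed. The gap at column 1 of the third row is kept because the other rows still have a residue there.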

Editing the sequence

The following keys may be used to edit the alignment.

- Any IUPAC letter replaces the base.

- OPTION + any IUPAC letter inserts that base.

- Space bar or '-' replaces the base with a gap.

- OPTION + space bar or OPTION + '-' inserts a gap.

- Double-click on a read to select the entire sequence; the left and right cursor keys then "nudge" the read left or right. This only works when the entire sequence is selected, and will not nudge exons or introns.

- [delete] or [backspace] replaces the base with a gap.

- OPTION + [delete] or [backspace] in a read physically deletes the base.

Edits in the template work the same way, with the exception of OPTION + [delete]. This inserts gaps at that position in all the reads to maintain the alignment. If you need to remove the gaps, select them and press OPTION + [delete] to force a one-residue shift in the alignment.
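
As a rough illustration of how the key actions above differ for a single read, the following Python sketch models each one as a string operation. It is not MacVector code. The function names are hypothetical, positions are 0-based, and the assumption that an inserted character lands before the current position is made only for this example.

```python
GAP = "-"  # assumed gap character for this sketch


def overwrite(read, pos, ch):
    """Default mode: the typed character replaces the one at pos."""
    return read[:pos] + ch + read[pos + 1:]


def insert(read, pos, ch):
    """OPTION held: the typed character is inserted at pos
    (assumed to go before the existing character), so later
    residues shift one place to the right."""
    return read[:pos] + ch + read[pos:]


def delete_to_gap(read, pos):
    """[delete] or [backspace]: the base is replaced by a gap,
    so the read keeps its length."""
    return overwrite(read, pos, GAP)


def delete_physically(read, pos):
    """OPTION + [delete] in a read: the base is removed and later
    residues shift one place to the left."""
    return read[:pos] + read[pos + 1:]


if __name__ == "__main__":
    read = "ACGT"
    print(overwrite(read, 1, "T"))
    print(insert(read, 1, GAP))
    print(delete_to_gap(read, 2))
    print(delete_physically(read, 2))
```

The point of the sketch is that overwrite and delete-to-gap preserve the length of the read, and therefore the alignment downstream, while the OPTION variants change the length and shift everything after the edit.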

Editing Limitations

The contig editor window is highly interactive. All of the sequence residues can be edited, although there are some restrictions that simplify the use of the editor:

- Although you can copy any number of residues, pasting is disabled for all sequences.

- You cannot perform a block selection of the sample sequences: only one sample sequence can have residues selected at any one time (although you can select multiple sample sequences to remove them from the assembly).

- You cannot edit the Consensus sequence because it is always calculated dynamically. However, you can select and copy residues on the consensus line.

Trimming assemblies by quality

You can trim reads based on their quality. The reads (typically Sanger sequencing reads in ABI or SCF format) should not be aligned before running the algorithm. A typical workflow is therefore to add .ab1 or .scf files to a project, base call with phred if needed, then run Trim by Quality before aligning or assembling.

  1. Add Sequencing reads
  2. ANALYZE | BASECALL (PHRED)
  3. ANALYZE | TRIM BY QUALITY
  4. ALIGN
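
This page does not describe the algorithm behind Trim by Quality, so the following Python sketch shows only one generic way to trim a read by quality. It removes bases from each end until it reaches a base whose score meets a threshold. The function name `trim_by_quality`, the `min_q` parameter and its default of 20 are assumptions for this example and are not MacVector settings.

```python
def trim_by_quality(bases, quals, min_q=20):
    """Trim low-quality bases from both ends of a read.

    bases: the read as a string
    quals: one integer quality score per base (e.g. phred scores)
    min_q: hypothetical threshold; end bases scoring below it are removed

    Returns (trimmed_bases, start, end), where start and end are the
    0-based, end-exclusive coordinates of the kept region.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")

    start, end = 0, len(bases)
    # Walk in from the 5' end past low-quality bases.
    while start < end and quals[start] < min_q:
        start += 1
    # Walk in from the 3' end past low-quality bases.
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return bases[start:end], start, end


if __name__ == "__main__":
    read = "ACGTACGT"
    scores = [5, 10, 30, 35, 40, 30, 12, 8]
    print(trim_by_quality(read, scores))
```

Trimming before alignment, as in the workflow above, means that unreliable bases at the ends of each read never take part in the alignment in the first place.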

Related Topics

Assembler

SNP Report

Align to Reference

Map View

Consensus