Assembler

MacVector includes an integrated contig assembly tool, Assembler. Assembler lets you create de novo assemblies and reference assemblies using Velvet, Bowtie and Phrap.

To determine whether this module is installed, open the File | New menu and look for an item called Assembly Project at the end of the sub-menu. If it is not present, contact MacVector sales and support for information on how to enable this functionality.

Start with Assembling Sequences and Contig Assembly Quick Start for more details about Assembler.

Assembler tasks

De novo assembly of Sanger trace reads

Reads can be assembled to find as many contigs as possible amongst them. This works for traditional Sanger sequencing as well as NGS reads.

De novo assembly of next generation reads

Paired and unpaired reads in FASTQ format can be assembled quickly with Velvet. Velvet is ideal for assembling Illumina sequencing reads of bacterial genomes on a mid-range Mac. For example, an assembly of a 7 Mb genome with an N50 of 103 kbp was produced from 16 million paired Illumina reads in one hour on a 16 GB 2007 Mac Pro. With paired read data, Velvet produces very good contigs.
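The N50 figure quoted above is a standard summary of assembly contiguity: the length of the contig at which the contigs, taken from longest to shortest, first cover at least half of the total assembled bases. A minimal sketch of that calculation follows. The function name and the example lengths are hypothetical and are not taken from MacVector or from the assembly described above.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bases)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk the contigs from longest to shortest until at least half
    # of the total assembled bases have been covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical contig lengths, for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20]
example_n50 = n50(example_lengths)
```

A higher N50 generally indicates that more of the assembly sits in long contigs, although it says nothing about whether those contigs are correct.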

Reference assembly to identify SNPs in a bacterial/viral isolate

Add a Reference sequence and one or more NGS files representing the sequence of an individual isolate. Run Bowtie, with or without paired-end reads. View a report listing all the potential SNPs based on the differences between the consensus and the reference. You can quickly identify the genes that the SNPs lie in and drill down to view the nucleotide and amino acid changes.
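To illustrate the idea of listing potential SNPs from the differences between a consensus and a reference, here is a minimal sketch. It assumes the two sequences are already aligned to the same length, with "-" marking gaps, and it only reports substitutions. The function name and example sequences are hypothetical. This is not how MacVector's SNP report is implemented, and it ignores read depth and base quality.

```python
def list_snps(reference, consensus):
    """List substitutions between two pre-aligned sequences.

    Returns tuples of (1-based reference position, reference base,
    consensus base). Gap and N columns are skipped.
    """
    if len(reference) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    ref_pos = 0
    for ref_base, con_base in zip(reference.upper(), consensus.upper()):
        # Only non-gap reference columns advance the reference coordinate.
        if ref_base != "-":
            ref_pos += 1
        # Ignore gaps (indels) and ambiguous calls in this simple sketch.
        if ref_base in "-N" or con_base in "-N":
            continue
        if ref_base != con_base:
            snps.append((ref_pos, ref_base, con_base))
    return snps


# Hypothetical aligned sequences, for illustration only.
example_snps = list_snps("ACGTACGT", "ACGAACCT")
```

Reporting positions in reference coordinates is what makes it possible to look up which annotated gene each difference falls in.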

De novo bacterial assembly assisted by a Reference scaffold

In this approach, a reference assembly using Bowtie is first run against all the Reads in one or more NGS files. The individual Contig consensus sequences are then assembled with a de novo assembler along with all the input NGS Reads that did not assemble.

Reference assembly of a new strain

Take reads from a new strain of an organism and align them against the template of a related strain to look for rearrangements and other variants (not just SNPs). For example, finding the differences can help identify 'new genes' (transferred from another organism) that cause toxicity.

Assembly to multiple similar References

In this approach, you might have a series of closely related strains of virus or bacteria. The Reads come from a single isolate and are assembled against the collection of Reference sequences to determine which of the References is most closely related (or identical) to the isolate.
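One simple way to think about which Reference is closest is to count how many reads are assigned to each one. The sketch below does only that. It assumes you already have a list of (read name, reference name) pairs exported from an alignment, with None for unaligned reads. That input format and the function name are hypothetical, and read count alone is a crude proxy that ignores mismatches and coverage.

```python
from collections import Counter


def rank_references(read_assignments):
    """Rank references by the number of reads assigned to each.

    read_assignments is an iterable of (read_name, reference_name)
    pairs; reference_name is None for reads that did not align.
    """
    counts = Counter(
        ref for _read, ref in read_assignments if ref is not None
    )
    # most_common() sorts from the highest count to the lowest.
    return counts.most_common()


# Hypothetical assignments, for illustration only.
example_ranking = rank_references([
    ("read1", "strainA"),
    ("read2", "strainB"),
    ("read3", "strainA"),
    ("read4", None),
])
```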

Assembly to multiple dissimilar References to identify SNPs

Essentially similar to the bacterial SNP assembly, but using yeast or some other organism with multiple chromosomes (or even some bacteria that have multiple chromosomes or large plasmids).

Exome or transcriptome sequencing

Align the coding regions of an organism, for example those obtained by sequencing mRNA/cDNA, against the full genomic sequence.

Cancer sequencing

Sequence the transcriptome (i.e. mRNA) of a cancer tumour and align it against the reference sequence of the non-cancerous genome to find what changes (i.e. mutations) have happened to the tumour to make it cancerous.

Ribosomal RNA sequencing

Align all sequenced rRNA in an organism against a single gene to find variants amongst the different copies.

Related Topics

Align to Reference

Assembling sequences

Automatic Assembly of Sub-projects with Phrap

Short Read Assembly

Assemble reads against a reference sequence with Bowtie2

Assemble reads against a reference sequence with Minimap2

SNP Report