De novo assembly with Flye

Flye is a de novo assembler tuned for error-prone long reads such as those produced by Pacific Biosciences and Oxford Nanopore Technologies sequencers. It is remarkably fast and can often produce complete, full-length assemblies of smaller genomes (e.g. less than 20 Mbp). However, in the absence of additional high-quality reads, the final consensus sequences tend to have a fairly high error rate, particularly in homopolymer runs. Nonetheless, Flye can be a very valuable tool for analyzing genomes, as the long reads can span typical repeat sequences (Insertion Sequences, rRNA operons, etc.) to generate the correct structure of a genome.

The general steps to run Flye are:

  1. Create a new Assembly Project (FILE | NEW | ASSEMBLY PROJECT).
  2. Click on the Add Reads button to add PacBio or Oxford Nanopore reads in FASTA, FASTQ or gzipped (.gz) format.
  3. Double-click on the Status column of the imported read file(s) and set the data type to "PacBio" or "Oxford Nanopore" as appropriate.
  4. Select the file(s) you want to assemble and click on the Flye toolbar button.
  5. Set the Expected genome size to the approximate size of the genome you are assembling. If in doubt, err on the small side.
  6. Choose a suitable Initial minimum coverage. This, along with Expected genome size, is one of the two most critical parameters. The default is 50, but you may try values as low as 20 or as high as 500. Depending on your input data, even the difference between 50 and 60 can sometimes be important. Larger values are typically better, at the expense of longer processing time, but there are many occasions where smaller values work best.
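
When experimenting with the Expected genome size and Initial minimum coverage settings, it can be useful to know roughly how deeply your reads cover the genome (total bases read divided by genome size). The following is a minimal Python sketch for that sanity check, run outside MacVector. It is not part of MacVector or Flye: the function names are hypothetical, and it assumes FASTQ files with unwrapped four-line records or standard FASTA files, either plain or gzipped.

```python
import gzip
from pathlib import Path


def open_reads(path):
    """Open a plain-text or gzipped (.gz) read file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def total_bases(path):
    """Return the total number of bases in a FASTA or FASTQ file.

    Assumption: FASTQ records are exactly four lines each
    (header, sequence, '+', qualities) with no line wrapping.
    """
    total = 0
    with open_reads(path) as handle:
        first = handle.readline()
        if not first:
            return 0  # empty file
        if first.startswith("@"):
            # FASTQ: the header was line 0, so sequences sit on lines 1, 5, 9, ...
            for i, line in enumerate(handle, start=1):
                if i % 4 == 1:
                    total += len(line.strip())
        elif first.startswith(">"):
            # FASTA: every non-header line is sequence
            for line in handle:
                if not line.startswith(">"):
                    total += len(line.strip())
        else:
            raise ValueError(f"{path} does not look like FASTA or FASTQ")
    return total


def estimated_coverage(paths, genome_size_bp):
    """Estimate read depth as total bases divided by expected genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return sum(total_bases(p) for p in paths) / genome_size_bp
```

For example, calling `estimated_coverage(["reads.fastq.gz"], 5_000_000)` on a hypothetical read file for a 5 Mbp bacterial genome returns the approximate depth of coverage. If that figure is well below the Initial minimum coverage you intend to use, a lower setting may be worth trying.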

Expect Flye to take anywhere from less than 5 minutes (a small bacterial genome with minimal options) to overnight for larger genomes with multiple rounds of polishing. It is not currently realistic to expect to assemble large mammalian or plant genomes using Flye on a single Macintosh computer.

See the Flye help topic for more details.

Related Topics

Assembler

Quick Start

Assembling sequences

Short Read Assembly

Base calling

Importing existing assemblies to an Assembly Project

Bowtie

SPAdes

Velvet

Importing Fastq data