IDBA is a practical iterative de Bruijn graph de novo assembler for sequence assembly in bioinformatics. Most assemblers based on the de Bruijn graph build the graph with a single, specific k to perform the assembly. For all of them, choosing that value of k is crucial. If k is too large, the graph will contain many gaps. If k is too small, it will contain many branches. Instead of one specific k, IDBA uses a range of k values to build an iterative de Bruijn graph, which lets it keep the information from the graphs built with different k values. This is intended to give it better results than assemblers that rely on a single k.
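The trade-off in the choice of k can be illustrated with a minimal Python sketch. It is not IDBA's implementation: it only builds a plain de Bruijn graph for several k values and counts branching nodes and dead ends. The function names and the two toy reads are assumptions made for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def count_branches(graph):
    """Count nodes with more than one successor (ambiguous extensions)."""
    return sum(1 for successors in graph.values() if len(successors) > 1)


def count_dead_ends(graph):
    """Count nodes with no successor (ends of the sequence or gaps)."""
    nodes = set(graph) | {s for successors in graph.values() for s in successors}
    return sum(1 for node in nodes if not graph.get(node))


if __name__ == "__main__":
    # Two toy reads (hypothetical data) that overlap by four bases.
    reads = ["ACGTACG", "TACGATT"]
    for k in (3, 5, 6):
        g = build_de_bruijn(reads, k)
        print(k, count_branches(g), count_dead_ends(g))
```

On these toy reads, k = 3 gives one branching node, k = 5 gives a single unbranched path, and k = 6 splits the graph into two pieces because the reads overlap by fewer than k - 1 bases.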
IDBA-UD is an iterative de Bruijn graph de novo assembler for short-read sequencing data with highly uneven sequencing depth. It is an extension of the IDBA algorithm. Like IDBA, IDBA-UD iterates from a small k to a large k. In each iteration, short and low-depth contigs are removed iteratively, with the cutoff threshold raised from low to high, to reduce errors in both low-depth and high-depth regions. Paired-end reads are aligned to contigs and assembled locally to generate some of the k-mers missing in low-depth regions. With these techniques, IDBA-UD can iterate the k value of the de Bruijn graph up to a very large value with fewer gaps and fewer branches, forming long contigs in both low-depth and high-depth regions.
Yu Peng, Henry Leung, S.M. Yiu, Francis Y.L. Chin. IDBA – A Practical Iterative de Bruijn Graph De Novo Assembler
The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010.
DNA Dragon Contig Assembler assembles sequences, trace data (ABI, SCF, AB1), and Illumina and Roche 454 flowgrams into contigs. It is fast and accurate DNA sequence assembly software. The DNA sequences are assembled into contigs, and the trace data can be compared directly with the nucleotide data. It also supports proofreading and base editing.
VICUNA is a de novo assembly program targeting populations with high mutation rates. It creates a single linear representation of the mixed population on which intra-host variants can be mapped. For clinical samples rich in contamination (e.g., >95%), VICUNA can leverage existing genomes, if available, to assemble only target-alike reads. After initial assembly, it can also use existing genomes to perform guided merging of contigs. For each data set (e.g., Illumina paired read, 454), VICUNA outputs consensus sequence(s) and the corresponding multiple sequence alignment of constituent reads.
Xiao Yang, Patrick Charlebois, Sante Gnerre, Matthew G. Coole, Niall J. Lennon, Joshua Z. Levin, James Qu, Elizabeth M. Ryan, Michael C. Zody, and Matthew R. Henn (2012) De novo assembly of highly diverse viral populations.
BMC Genomics 13:475.
QSRA (Quality-value-guided Short Read Assembler) is a de novo short read assembler guided by base quality values. In its authors' comparisons, QSRA generally produced the highest genomic coverage while running faster than VCAKE. It is competitive in longest contig and in N50/N80 contig lengths, producing results of similar quality to those of EDENA and VELVET. QSRA is a step toward de novo assembly of complex genomes: it improves on the original VCAKE algorithm by drastically reducing runtimes and by making the assembly algorithm more robust through additional error-handling capabilities.
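For readers unfamiliar with the N50/N80 metrics mentioned above, the following Python sketch computes them under the usual definition: Nx is the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of the total assembly length. The function name and the contig lengths are hypothetical and are not taken from any QSRA benchmark.

```python
def nx(contig_lengths, x):
    """Return the Nx contig length (for example x=50 for N50)."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length
    # Not reached for 0 <= x <= 100; kept as a guard for out-of-range x.
    raise ValueError("x must be between 0 and 100")


if __name__ == "__main__":
    lengths = [80, 70, 50, 40, 30, 20, 10]  # hypothetical contig lengths
    print(nx(lengths, 50), nx(lengths, 80))
```

With the hypothetical lengths above (total 300), N50 is 70 and N80 is 40, so a higher x always gives an equal or smaller value.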
PASQUAL (PArallel SeQUence AssembLer) is designed for shared-memory parallelism and uses OpenMP for its good trade-off between performance and programmer productivity. Shared-memory parallelism has become mainstream with the widespread production of multicore commodity processors. PASQUAL follows the overlap-layout-consensus (OLC) approach and uses a careful combination of tailored algorithms and data structures to obtain high-quality solutions.
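The overlap step of an OLC (overlap-layout-consensus) assembler can be sketched in a few lines of Python. This is a naive, sequential illustration and not PASQUAL's method, which relies on its own tailored data structures and OpenMP parallelism. The function names, the `min_len` parameter and the reads are assumptions made for the example.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len characters exists.
    """
    start = 0
    while True:
        # Find the next place in a where the first min_len bases of b occur.
        start = a.find(b[:min_len], start)
        if start == -1:
            return 0
        # Accept only if the whole suffix of a from here is a prefix of b.
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1


def merge(a, b, olen):
    """Join two reads that overlap by olen characters."""
    return a + b[olen:]


if __name__ == "__main__":
    a, b = "ACGTACG", "TACGATT"  # hypothetical reads
    olen = overlap(a, b, 3)
    print(olen, merge(a, b, olen))
```

A full assembler would compute such overlaps for all read pairs, arrange the reads into a layout, and then call a consensus sequence. The sketch covers only the first of those steps for a single pair.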
Xing Liu, Pushkar R. Pande, Henning Meyerhenke, and David A. Bader.
PASQUAL: A Parallel de novo Assembler for Next Generation Genome Sequencing.
Submitted for journal publication, 2011.
SHORTY is targeted at de novo assembly of microreads with mate pair information and sequencing errors. It introduces novel approaches and features for addressing the short read assembly problem.
VCAKE (Verified Consensus Assembly by K-mer Extension) is a genetic sequence assembler capable of assembling millions of small nucleotide reads even in the presence of sequencing error. This software is currently geared towards de novo assembly of Illumina’s Solexa Sequencing data.
SHARCGS is a DNA assembly program designed for de novo assembly of 25-40mer input fragments at deep sequence coverage. Large numbers of such short reads are generated on advanced sequencing instruments.