PacBio sequencers produce two characteristic types of reads: CCS (short, low error rate) and CLR (long, high error rate), both of which can be useful for de novo genome assembly. PBSIM simulates these PacBio reads using either a model-based or a sampling-based simulation.
harp: Maximum likelihood estimation of frequencies of known haplotypes from pooled sequence data. harp implements an EM algorithm to calculate the frequencies of known haplotypes from pooled sequence data.
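As a rough illustration of the kind of EM iteration described for harp, the sketch below estimates haplotype frequencies from a per-read likelihood matrix. It is not harp's implementation: the function name `em_haplotype_frequencies`, the input layout (one row per read, one column per known haplotype, each entry being the probability of that read given that haplotype), and the stopping parameters are all assumptions made for this example. How harp actually computes read likelihoods is not reproduced here.

```python
def em_haplotype_frequencies(likelihoods, n_iter=200, tol=1e-9):
    """Estimate mixture frequencies of known haplotypes with a plain EM loop.

    likelihoods[r][h] is assumed to hold P(read r | haplotype h); this input
    format is hypothetical and chosen only for illustration.
    """
    n_hap = len(likelihoods[0])
    freqs = [1.0 / n_hap] * n_hap  # start from uniform frequencies

    for _ in range(n_iter):
        totals = [0.0] * n_hap
        for row in likelihoods:
            # E-step: posterior probability that this read came from each haplotype
            weights = [f * lik for f, lik in zip(freqs, row)]
            norm = sum(weights)
            if norm == 0.0:
                continue  # read is incompatible with every haplotype; skip it
            for h, w in enumerate(weights):
                totals[h] += w / norm

        used = sum(totals)
        if used == 0.0:
            break  # no informative reads at all

        # M-step: new frequencies are the average posterior memberships
        new_freqs = [t / used for t in totals]
        delta = max(abs(a - b) for a, b in zip(new_freqs, freqs))
        freqs = new_freqs
        if delta < tol:
            break

    return freqs


# Toy data: three reads match only haplotype 0, one read matches only
# haplotype 1, and one read is equally compatible with both.
toy = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
estimated = em_haplotype_frequencies(toy)
```

The ambiguous read is split between the two haplotypes in proportion to the current frequency estimate, which is why the estimate settles on the ratio set by the unambiguous reads.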
RAUR re-aligns reads that cannot be mapped by alignment tools. It takes advantage of the base quality scores (reported by the sequencer) to find the longest segment of a read with at most K low-quality bases. Combined with an alignment tool (such as bwa or bowtie2), RAUR then re-aligns the trimmed reads.
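The search for the longest segment of a read with at most K low-quality bases can be sketched with a sliding window. This is only an illustration of the idea, not RAUR's code: the function name `longest_segment`, the `min_q` threshold used to define a low-quality base, and the representation of qualities as a list of integer Phred scores are assumptions.

```python
def longest_segment(quals, k, min_q=20):
    """Return (start, end) of the longest half-open slice quals[start:end]
    containing at most k scores below min_q.

    min_q is a hypothetical cutoff for calling a base low quality.
    """
    best_start, best_end = 0, 0
    left = 0  # left edge of the current window
    low = 0   # number of low-quality bases inside the window

    for right, q in enumerate(quals):
        if q < min_q:
            low += 1
        # Shrink from the left until the window is valid again
        while low > k:
            if quals[left] < min_q:
                low -= 1
            left += 1
        # Keep the first window that reaches the maximum length
        if right + 1 - left > best_end - best_start:
            best_start, best_end = left, right + 1

    return best_start, best_end


# Toy read: low-quality bases at positions 1, 4 and 8
quals = [30, 10, 30, 30, 5, 30, 30, 30, 8, 30]
start, end = longest_segment(quals, k=1)
trimmed = quals[start:end]
```

Each base enters and leaves the window at most once, so the scan is linear in the read length. The trimmed slice is what would then be passed to the aligner.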
absee is a Ruby gem that reads ABIF files (DNA sequencing chromatograms). The software extracts the peak indexes, the called sequence, and the ACGT values from sequencing files.
Pathoscope takes next-generation sequencing reads from a mixed sample containing multiple genome strains and predicts which genomes are likely to be present. Unlike most approaches, which rely on composition methods or similarity searches and involve the daunting task of de novo assembly, the software propagates evidence within a Bayesian framework over an initial alignment result and reassigns each mapping to its correct membership using the expectation-maximization algorithm.
Clinical Pathoscope is a program to identify pathogens/commensals/contaminants in unassembled sequencing reads.
Btrim is a fast, lightweight tool for trimming adapters and low-quality regions from reads produced by ultra-high-throughput next-generation sequencing machines. It can also reliably identify barcodes and assign reads to their original samples.