Pathoscope takes next-generation sequencing reads from a mixed sample containing multiple genomes or strains and predicts which genomes are likely present. Unlike most approaches, which rely on composition methods or similarity searches combined with the daunting task of de novo assembly, the software propagates evidence within a Bayesian framework over an initial alignment result and reassigns each ambiguously mapped read to its most likely source genome using the expectation-maximization (EM) algorithm.
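The reassignment idea can be illustrated with a minimal EM sketch. This is not Pathoscope's actual model or code: the function name `em_reassign`, the input layout (one dict of genome-to-alignment-likelihood per read), and the fixed iteration count are all assumptions made for illustration.

```python
def em_reassign(read_scores, n_iter=50):
    """Toy EM over ambiguous read mappings (illustrative only).

    read_scores: list with one dict per read, mapping a genome name to a
    positive alignment likelihood for that read (hypothetical input format).
    Returns (genome proportions, per-read assignment probabilities).
    """
    genomes = sorted({g for scores in read_scores for g in scores})
    # Start from uniform genome proportions.
    pi = {g: 1.0 / len(genomes) for g in genomes}
    resp = []
    for _ in range(n_iter):
        # E-step: probability that each read came from each candidate genome.
        resp = []
        for scores in read_scores:
            weighted = {g: pi[g] * s for g, s in scores.items()}
            total = sum(weighted.values())
            resp.append({g: w / total for g, w in weighted.items()})
        # M-step: proportions are the average assignment probabilities.
        pi = {
            g: sum(r.get(g, 0.0) for r in resp) / len(read_scores)
            for g in genomes
        }
    return pi, resp


# Three reads map only to genome A; one read maps equally well to A and B.
reads = [{"A": 1.0}, {"A": 1.0}, {"A": 1.0}, {"A": 0.5, "B": 0.5}]
proportions, assignments = em_reassign(reads)
```

In this toy input the ambiguous read is pulled toward genome A, because the uniquely mapped reads provide evidence that A is present while nothing supports B on its own.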
Clinical Pathoscope is a program to identify pathogens/commensals/contaminants in unassembled sequencing reads.
RAUR re-aligns reads that alignment tools fail to map. It uses the base quality scores reported by the sequencer to find the longest segment of a read containing at most K low-quality bases. Combined with an alignment tool (such as bwa or bowtie2), RAUR then re-aligns the trimmed reads.
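Finding the longest segment with at most K low-quality bases can be sketched with a sliding window. This is an illustration of the idea, not RAUR's implementation: the function name `longest_segment` and the quality cutoff `min_q=20` that defines a "low-quality" base are assumptions.

```python
def longest_segment(quals, k, min_q=20):
    """Return (start, end) of the longest half-open window of `quals`
    that contains at most `k` scores below `min_q` (assumed cutoff)."""
    best_start, best_end = 0, 0
    low = 0      # number of low-quality bases in the current window
    start = 0    # left edge of the current window
    for end, q in enumerate(quals):
        if q < min_q:
            low += 1
        # Shrink from the left until the window is valid again.
        while low > k:
            if quals[start] < min_q:
                low -= 1
            start += 1
        if end + 1 - start > best_end - best_start:
            best_start, best_end = start, end + 1
    return best_start, best_end


quals = [30, 10, 30, 30, 10, 30, 30, 30, 10, 30]
start, end = longest_segment(quals, k=1)
trimmed = quals[start:end]  # the segment that would be passed to the aligner
```

The window visits each base at most twice, so the scan is linear in the read length.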
harp: Maximum likelihood estimation of frequencies of known haplotypes from pooled sequence data. harp implements an EM algorithm to calculate these frequencies.
Gk-arrays are provided as a simple-to-use C++ library dedicated to queries on large collections of sequences as produced by high-throughput sequencers (e.g. HiSeq 2000 from Illumina, 454 from Roche). Gk-arrays index the k-mers of reads and answer different queries on that read collection.
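To make the notion of k-mer queries on a read collection concrete, here is a small sketch using a plain dictionary. It does not reproduce the Gk-arrays data structure or its C++ API; the function names and the two example queries are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_kmer_index(reads, k):
    """Map every k-mer to the list of (read_id, position) where it occurs."""
    index = defaultdict(list)
    for read_id, seq in enumerate(reads):
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((read_id, pos))
    return index


def reads_containing(index, kmer):
    """Sorted ids of the reads that contain `kmer` at least once."""
    return sorted({read_id for read_id, _ in index.get(kmer, [])})


def occurrence_count(index, kmer):
    """Total number of occurrences of `kmer` across all reads."""
    return len(index.get(kmer, []))


reads = ["ACGTAC", "GTACGT", "TTTT"]
index = build_kmer_index(reads, k=3)
shared = reads_containing(index, "GTA")   # reads 0 and 1 both contain GTA
repeats = occurrence_count(index, "TTT")  # TTT occurs twice in read 2
```

A dictionary of lists is far less memory-efficient than a dedicated index, which is the motivation for a specialized structure on collections of millions of reads.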
PacBio sequencers produce two characteristic types of reads: CCS (short, with a low error rate) and CLR (long, with a high error rate), both of which can be useful for de novo genome assembly. PBSIM simulates these PacBio reads using either model-based or sampling-based simulation.
Btrim is fast, lightweight software for trimming adapters and low-quality regions from reads produced by ultra-high-throughput next-generation sequencing machines. It can also reliably identify barcodes and assign reads to their original samples.