FastQC 0.11.9 – Quality Control Tool for High Throughput Sequence Data

FastQC 0.11.9

:: DESCRIPTION

FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.
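To give a feel for the kind of check such a tool runs, here is a minimal Python sketch of a per-position mean quality calculation. It is not FastQC's implementation: the function name `per_base_mean_quality`, the Phred+33 offset, and the strict four-line FASTQ layout are all assumptions made for illustration.

```python
from collections import defaultdict

def per_base_mean_quality(fastq_lines, offset=33):
    """Return the mean Phred score at each read position.

    Assumes four-line FASTQ records (no wrapped sequences) and
    Phred+33 quality encoding; both are illustrative assumptions.
    """
    sums = defaultdict(int)
    counts = defaultdict(int)
    for i, line in enumerate(fastq_lines):
        # The fourth line of every record holds the quality string.
        if i % 4 == 3:
            for pos, ch in enumerate(line.rstrip("\n")):
                sums[pos] += ord(ch) - offset
                counts[pos] += 1
    return [sums[p] / counts[p] for p in sorted(counts)]

# Two toy reads of different lengths.
records = ["@r1", "ACGT", "+", "IIII",
           "@r2", "ACG", "+", "5I#"]
means = per_base_mean_quality(records)
print(means)
```

A sharp drop in the mean toward the end of the reads is the sort of pattern that would suggest trimming before further analysis.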

:: DEVELOPER

Babraham Bioinformatics

:: SCREENSHOTS

:: REQUIREMENTS

  • Linux / Windows / MacOsX
  • Java 

:: DOWNLOAD

 FastQC

:: MORE INFORMATION

BM-BC 1.0 – Bayesian method of Base Calling for Solexa Sequence data

BM-BC 1.0

:: DESCRIPTION

BM-BC is a Bayesian method of base calling for Solexa-GA sequencing data. The Bayesian method builds on a hierarchical model that accounts for three sources of noise in the data, which are known to affect the accuracy of the base calls: fading, phasing, and cross-talk between channels.

:: DEVELOPER

Yuan Ji Lab

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux / Windows / MacOsX
  • R package

:: DOWNLOAD

 BM-BC

:: MORE INFORMATION

Citation

BMC Bioinformatics. 2012;13 Suppl 13:S6. doi: 10.1186/1471-2105-13-S13-S6. Epub 2012 Aug 24.
BM-BC: a Bayesian method of base calling for Solexa sequence data.
Ji Y, Mitra R, Quintana F, Jara A, Mueller P, Liu P, Lu Y, Liang S.

diCal 1.3 / diCal-IBD 1.0 – Predicts Identical by Descent Tracts using Sequence data

diCal 1.3 / diCal-IBD 1.0

:: DESCRIPTION

diCal is a scalable demographic inference method based on the sequentially Markov conditional sampling distribution framework.

diCal-IBD can be used for predicting identical by descent (IBD) tracts in sequence data. It provides means for calculating the accuracy of the prediction if the true tracts are available, for plotting the predicted tracts, their TMRCA (time to the most recent common ancestor) and the corresponding posterior probabilities, and for identifying putative recent positive selection through investigation of average IBD sharing.

:: DEVELOPER

The Biophysics Graduate Group, Yun S. Song

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux/MacOsX
  • Python

:: DOWNLOAD

  diCal-IBD , diCal

:: MORE INFORMATION

Citation

Genetics. 2013 Jul;194(3):647-62. doi: 10.1534/genetics.112.149096. Epub 2013 Apr 22.
Estimating variable effective population sizes from multiple genomes: a sequentially Markov conditional sampling distribution approach.
Sheehan S, Harris K, Song YS.

diCal-IBD: demography-aware inference of identity-by-descent tracts in unrelated individuals.
Tataru P, Nirody JA, Song YS.
Bioinformatics. 2014 Aug 21. pii: btu563

KGGSeq 1.2 – Genomic and Genetic studies using Sequence data

KGGSeq 1.2

:: DESCRIPTION

KGGSeq (Genomic and Genetic studies using Sequence data) is a software platform that combines bioinformatics and statistical genetics functions with biological resources and knowledge for sequencing-based genetic mapping of the variants and genes responsible for human diseases and traits. Put simply, KGGSeq is like a fishing rod that helps geneticists fish for the genetic determinants of human diseases and traits in the big sea of DNA sequences. Compared with other genetic tools such as plink/seq, KGGSeq pays more attention to the downstream analysis of genetic mapping. It currently implements a comprehensive and efficient framework to filter and prioritize genetic variants from whole exome sequencing data.

:: DEVELOPER

Precision Medicine Genomics Laboratory

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows / Linux / MacOSX
  • Java

:: DOWNLOAD

 KGGSeq

:: MORE INFORMATION

Citation

Hum Mutat. 2015 Feb 10. doi: 10.1002/humu.22766.
wKGGSeq: A Comprehensive Strategy-Based and Disease-Targeted Online Framework to Facilitate Exome Sequencing Studies of Inherited Disorders.
Li MJ, Deng J, Wang P, Yang W, Ho SL, Sham PC, Wang J, Li M.

Li MX, Gui HS, Kwan JS, Bao SY, Sham PC.
A comprehensive framework for prioritizing variants in exome sequencing studies of Mendelian diseases.
Nucleic Acids Res. 2012 Jan 12

SeqPop – Compute Population Genetics Statistics on Sequence Data

SeqPop

:: DESCRIPTION

SeqPop is a program for computing population genetics statistics on sequence data, including Pn, Theta, Pi(i,j), Kst(*), Fst(*), and their Monte Carlo significance for population subdivision.
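As an illustration of two statistics of this kind, the following Python sketch computes nucleotide diversity (Pi) and Watterson's Theta per site for a set of aligned sequences. It assumes that Theta refers to Watterson's estimator and that the sequences are aligned, equal in length, and free of gaps; SeqPop's own definitions and handling of missing data may differ, and the function names are hypothetical.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Count alignment columns that contain more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (Pi)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * len(seqs[0]))

def watterson_theta(seqs):
    """Watterson's estimator per site: S / (a_n * L),
    where a_n = sum of 1/i for i = 1 .. n-1."""
    n = len(seqs)
    a_n = sum(1 / i for i in range(1, n))
    return segregating_sites(seqs) / (a_n * len(seqs[0]))

# Toy alignment: four sequences, eight sites, two of them variable.
alignment = ["ACGTACGT", "ACGTACGA", "ACCTACGT", "ACGTACGT"]
pi = nucleotide_diversity(alignment)
theta = watterson_theta(alignment)
print(pi, theta)
```

For this toy alignment Pi is 0.125 and Theta is 3/22 (about 0.136) per site. Significance for population subdivision would additionally need a permutation step over population labels, which is not shown here.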

:: DEVELOPER

the Townsend Lab

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Mac

:: DOWNLOAD

  SeqPop

:: MORE INFORMATION

PRINSEQ 0.20.4 – Preprocess and Generate Statistics about Sequence data

PRINSEQ 0.20.4

:: DESCRIPTION

PRINSEQ (PReprocessing and INformation of SEQuence data) is a tool that generates summary statistics of sequence and quality data and that is used to filter, reformat and trim next-generation sequence data. It is particularly designed for 454/Roche data, but can also be used for other types of sequence data. PRINSEQ is available through a user-friendly web interface or as a standalone version. The standalone version is primarily designed for data preprocessing and does not generate summary statistics in graphical form.
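The trim-then-filter idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions (Phred+33 qualities, a simple 3'-end trim, hypothetical function names and thresholds); it does not reproduce PRINSEQ's options or algorithms.

```python
def trim_right_by_quality(seq, qual, min_q=20, offset=33):
    """Drop trailing bases whose Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]

def gc_content(seq):
    """Fraction of G and C bases; 0.0 for an empty sequence."""
    if not seq:
        return 0.0
    return sum(base in "GCgc" for base in seq) / len(seq)

def preprocess(records, min_len=3, min_q=20):
    """Trim each (name, seq, qual) record, then keep reads of at least min_len."""
    kept = []
    for name, seq, qual in records:
        seq, qual = trim_right_by_quality(seq, qual, min_q)
        if len(seq) >= min_len:
            kept.append((name, seq, qual))
    return kept

reads = [("r1", "ACGTAC", "IIII##"),   # low-quality tail of two bases
         ("r2", "GGCC", "I###")]       # only one good base, so it is dropped
kept = preprocess(reads)
print(kept, [gc_content(seq) for _, seq, _ in kept])
```

Here `r1` is trimmed to four bases and kept, while `r2` falls below the minimum length after trimming and is filtered out.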

PRINSEQ Online Version

:: DEVELOPER

the Edwards Lab

:: SCREENSHOTS

:: REQUIREMENTS

  • Windows / Mac OS X / Linux
  • Perl

:: DOWNLOAD

 PRINSEQ

:: MORE INFORMATION

Citation:

Schmieder R and Edwards R
Quality control and preprocessing of metagenomic datasets.
Bioinformatics 2011, 27:863-864.

scan-x 1.1 – Find Motifs within any Sequence data set

scan-x 1.1

:: DESCRIPTION

scan-x is a software tool designed to find motifs within any sequence data set. The first large scale scan was performed using all available human, mouse, fly and yeast phosphorylation and acetylation data to perform a scan for undiscovered modification sites.

:: DEVELOPER

Schwartz Lab

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Web Browser

:: DOWNLOAD

  NO

:: MORE INFORMATION

Citation

Mol Cell Proteomics. 2009 Feb;8(2):365-79. doi: 10.1074/mcp.M800332-MCP200. Epub 2008 Oct 28.
Predicting protein post-translational modifications using meta-analysis of proteome scale data sets.
Schwartz D, Chou MF, Church GM.

Curr Protoc Bioinformatics. 2011 Dec;Chapter 13:Unit 13.16. doi: 10.1002/0471250953.bi1316s36.
Using the scan-x Web site to predict protein post-translational modifications.
Chou MF, Schwartz D.

ReadTools 1.5.2 – Universal Toolkit for Handling Sequence data from different Sequencing Platforms

ReadTools 1.5.2

:: DESCRIPTION

ReadTools provides a consistent and highly tested set of tools for processing sequencing data from any kind of source and focusing on raw reads, while including tools for mapped reads as well.

:: DEVELOPER

Institute of Population Genetics, University of Veterinary Medicine Vienna

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux / MacOsX / Windows
  • Java

:: DOWNLOAD

ReadTools

:: MORE INFORMATION

Citation:

Mol Ecol Resour. 2018 May;18(3):676-680. doi: 10.1111/1755-0998.12741. Epub 2017 Dec 8.
ReadTools: A universal toolkit for handling sequence data from different sequencing platforms.
Gómez-Sánchez D, Schlötterer C.

verifyBamID 1.1.3 – Identify Contamination or Sample Swap in Sequence data

verifyBamID 1.1.3

:: DESCRIPTION

verifyBamID is software that verifies whether the reads in a particular file match previously known genotypes for an individual (or group of individuals), and checks whether the reads are contaminated as a mixture of two samples.

:: DEVELOPER

Abecasis Group

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

 verifyBamID

:: MORE INFORMATION

Citation

G. Jun, M. Flickinger, K. N. Hetrick, J. M. Romm, K. F. Doheny, G. Abecasis, M. Boehnke, and H. M. Kang,
Detecting and Estimating Contamination of Human DNA Samples in Sequencing and Array-Based Genotype Data,
Am J Hum Genet. 2012 Nov 2;91(5):839-48. doi: 10.1016/j.ajhg.2012.09.004.

CS23D 2.0 – Protein Structure generation using NMR Chemical Shifts and Sequence data

CS23D 2.0

:: DESCRIPTION

CS23D (Chemical Shift to 3D Structure) is a web server for rapidly generating accurate 3D protein structures using only assigned NMR chemical shifts as input.

:: DEVELOPER

the Wishart Research Group, University of Alberta

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Web Browser

:: DOWNLOAD

  NO

:: MORE INFORMATION

Citation

Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W496-502. doi: 10.1093/nar/gkn305. Epub 2008 May 30.
CS23D: a web server for rapid protein structure generation using NMR chemical shifts and sequence data.
Wishart DS, Arndt D, Berjanskii M, Tang P, Zhou J, Lin G.