Q5 – Classification of Complete Mass Spectra of a Complex Protein Mixture

Q5

:: DESCRIPTION

Q5 is a closed-form, exact solution to the problem of classifying complete mass spectra of complex protein mixtures. It uses a probabilistic classification algorithm built on dimension-reduced linear discriminant analysis. The method is computationally efficient: it is non-iterative and computes the optimal linear discriminant from closed-form equations. The discriminant was computed and verified on datasets of complete, complex SELDI spectra of human blood serum, with replicate experiments over different training/testing splits of each dataset used to check robustness. Q5 achieves sensitivity, specificity, and positive predictive values above 97% on three ovarian cancer datasets and one prostate cancer dataset. It outperforms previous full-spectrum classification techniques for complex samples and can provide clues to the molecular identities of differentially expressed proteins and peptides.
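To illustrate what "closed-form linear discriminant plus probabilistic classification" can look like, here is a minimal Python sketch. It is not the Q5 Matlab code: it assumes the spectra have already been reduced to two features per sample (the dimension-reduction step is skipped), uses made-up toy data, assumes equal class priors, and all function names are hypothetical.

```python
import math


def mean_vector(rows):
    """Component-wise mean of a list of 2-D points."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter_matrix(rows, mu):
    """2x2 scatter matrix: sum of outer products of deviations from mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - mu[0], r[1] - mu[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Closed-form discriminant w = Sw^-1 (mu1 - mu0); no iteration needed."""
    mu0, mu1 = mean_vector(class0), mean_vector(class1)
    s0, s1 = scatter_matrix(class0, mu0), scatter_matrix(class1, mu1)
    a, b = s0[0][0] + s1[0][0], s0[0][1] + s1[0][1]
    c, d = s0[1][0] + s1[1][0], s0[1][1] + s1[1][1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = mu1[0] - mu0[0], mu1[1] - mu0[1]
    return [(d * dx - b * dy) / det, (a * dy - c * dx) / det]


def fit(class0, class1):
    """Return the discriminant and a 1-D Gaussian (mean, variance) per class."""
    w = fisher_direction(class0, class1)
    stats = []
    for rows in (class0, class1):
        proj = [w[0] * r[0] + w[1] * r[1] for r in rows]
        m = sum(proj) / len(proj)
        v = sum((p - m) ** 2 for p in proj) / (len(proj) - 1)
        stats.append((m, v))
    return w, stats


def log_gauss(x, mu, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def posterior_class1(x, w, stats):
    """P(class 1 | x) under equal priors, computed stably in log space."""
    z = w[0] * x[0] + w[1] * x[1]
    d = log_gauss(z, *stats[1]) - log_gauss(z, *stats[0])
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)


# Made-up, already dimension-reduced training samples (two features each).
control = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (2.0, 2.5)]
disease = [(5.0, 6.0), (6.0, 5.0), (5.5, 6.5), (6.5, 5.5)]
w, stats = fit(control, disease)
print(posterior_class1((6.0, 6.0), w, stats))
```

The posterior gives a confidence for each call instead of a bare label, which is the point of a probabilistic classifier.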

:: DEVELOPER

Donald Lab at Duke University

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows / Linux / MacOSX
  • Matlab

:: DOWNLOAD

Q5

:: MORE INFORMATION

Citation

Ryan H. Lilien, Hany Farid and Bruce R. Donald
Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum.
Journal of Computational Biology, 2003; 10(6): 925-946.

ganon 1.0.0 – Read Classification tool based on Interleaved Bloom Filters

ganon 1.0.0

:: DESCRIPTION

ganon is a k-mer-based DNA classification tool that uses Interleaved Bloom Filters to classify short reads.
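As a rough illustration of the idea behind an Interleaved Bloom Filter (IBF), the Python sketch below stores one bit per reference bin at every filter position, so a single k-mer lookup reports all bins that may contain it. This is a toy model, not ganon's implementation: the class, the function names, the parameters (`n_slots`, `n_hashes`, `k`, `min_fraction`) and the sequences are all made up for the example.

```python
import hashlib


class InterleavedBloomFilter:
    """Toy IBF: every slot holds one bit per bin, packed into a Python int."""

    def __init__(self, n_bins, n_slots=4096, n_hashes=3):
        self.n_bins = n_bins
        self.n_slots = n_slots
        self.n_hashes = n_hashes
        self.slots = [0] * n_slots

    def _positions(self, kmer):
        # Derive n_hashes deterministic slot indices by salting BLAKE2b.
        for seed in range(self.n_hashes):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.n_slots

    def add(self, kmer, bin_id):
        """Set the bit of bin_id at every hash position of the k-mer."""
        for pos in self._positions(kmer):
            self.slots[pos] |= 1 << bin_id

    def query(self, kmer):
        """Bitmask of bins that may contain the k-mer (AND over positions)."""
        mask = (1 << self.n_bins) - 1
        for pos in self._positions(kmer):
            mask &= self.slots[pos]
        return mask


def kmers(seq, k):
    """All overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def classify_read(read, ibf, bin_names, k=8, min_fraction=0.5):
    """Assign the read to the bin matching most of its k-mers, or None."""
    words = kmers(read, k)
    if not words:
        return None
    hits = [0] * ibf.n_bins
    for word in words:
        mask = ibf.query(word)
        for b in range(ibf.n_bins):
            if (mask >> b) & 1:
                hits[b] += 1
    best = max(range(ibf.n_bins), key=lambda b: hits[b])
    if hits[best] / len(words) < min_fraction:
        return None
    return bin_names[best]


# Two made-up reference sequences, one bin each.
ref_a = "ACGTTGCAAGGCTTAACCGGTATCGATCGGCTAAGCTTGCA"
ref_b = "TTGACCATGGTACCAGTTCAGGATCCAAGTCTGAATTCCGA"
names = ["refA", "refB"]
ibf = InterleavedBloomFilter(n_bins=2)
for bin_id, ref in enumerate((ref_a, ref_b)):
    for word in kmers(ref, 8):
        ibf.add(word, bin_id)

print(classify_read(ref_a[5:25], ibf, names))
```

Like any Bloom filter, the structure can report false positives but never false negatives, so a read drawn from a reference always matches that reference's bin for all of its k-mers.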

:: DEVELOPER

Vitor C. Piro

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

ganon

:: MORE INFORMATION

Citation

Piro VC, Dadi TH, Seiler E, Reinert K, Renard BY.
ganon: precise metagenomics classification against large and up-to-date sets of reference sequences.
Bioinformatics. 2020 Jul 1;36(Suppl_1):i12-i20. doi: 10.1093/bioinformatics/btaa458. PMID: 32657362; PMCID: PMC7355301.

pathClass 0.9.4 – Classification using Biological Pathways as prior knowledge

pathClass 0.9.4

:: DESCRIPTION

pathClass is a collection of classification methods that use information about feature connectivity in a biological network as an additional source of information. This additional knowledge is incorporated into the classification a priori. Several authors have shown that this approach significantly increases the classification performance.

:: DEVELOPER

pathClass team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows / Linux / MacOSX
  • R

:: DOWNLOAD

 pathClass

:: MORE INFORMATION

Citation

Johannes M, Fröhlich H, Sültmann H, Beissbarth T.
pathClass: an R-package for integration of pathway knowledge into support vector machines for biomarker discovery.
Bioinformatics. 2011 May 15;27(10):1442-3. doi: 10.1093/bioinformatics/btr157. Epub 2011 Mar 30.

EFFECT 2013 – Automated Construction and Extraction of Features for Classification of Biological Sequences

EFFECT 2013

:: DESCRIPTION

EFFECT is an algorithmic framework for automated detection of functional signals in biological sequences.

:: DEVELOPER

Computational Biology lab, George Mason University

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows / Linux / MacOSX
  • Java
  • BioJava

:: DOWNLOAD

 EFFECT

:: MORE INFORMATION

Citation

Kamath U, De Jong K, Shehu A.
Effective automated feature construction and selection for classification of biological sequences.
PLoS One. 2014 Jul 17;9(7):e99982. doi: 10.1371/journal.pone.0099982. PMID: 25033270; PMCID: PMC4102475.

BASiNET 0.0.4 – Classification of RNA Sequences using Complex Network Theory

BASiNET 0.0.4

:: DESCRIPTION

BASiNET is an alignment-free tool that classifies biological sequences using features extracted from complex network measurements.
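One way to picture network-based feature extraction is sketched below in Python: a sequence is turned into a graph whose nodes are short words and whose edges link consecutive words, and simple topological measures of that graph become the feature vector. This is only an illustration, not BASiNET's R code; the word size, step, edge-weight threshold and the chosen measures are assumptions made for the example.

```python
from collections import defaultdict


def build_word_network(seq, word=3, step=1):
    """Map a sequence to undirected weighted edges between consecutive words."""
    words = [seq[i:i + word] for i in range(0, len(seq) - word + 1, step)]
    edges = defaultdict(int)
    for a, b in zip(words, words[1:]):
        if a != b:  # ignore self-loops such as AAA -> AAA
            edges[frozenset((a, b))] += 1
    return edges


def network_features(edges, threshold=0):
    """Topological measures of the network, keeping edges with weight > threshold."""
    kept = [e for e, weight in edges.items() if weight > threshold]
    degree = defaultdict(int)
    for edge in kept:
        a, b = tuple(edge)
        degree[a] += 1
        degree[b] += 1
    n_nodes, n_edges = len(degree), len(kept)
    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "avg_degree": 2 * n_edges / n_nodes if n_nodes else 0.0,
        "max_degree": max(degree.values(), default=0),
    }


toy_edges = build_word_network("ACGACGACG")
print(network_features(toy_edges))
```

Feature dictionaries like this one, computed per sequence (and possibly at several thresholds), could then be passed to any standard classifier.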

:: DEVELOPER

Eric Augusto Ito <ericaugustoito at hotmail.com>

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows / Linux / MacOS
  • R

:: DOWNLOAD

BASiNET

:: MORE INFORMATION

Citation

Ito EA, Katahira I, Vicente FFDR, Pereira LFP, Lopes FM.
BASiNET-BiologicAl Sequences NETwork: a case study on coding and non-coding RNAs identification.
Nucleic Acids Res. 2018 Sep 19;46(16):e96. doi: 10.1093/nar/gky462. PMID: 29873784; PMCID: PMC6144827.

Bis-class v2 – Bisulfite-sequencing data Classification

Bis-class v2

:: DESCRIPTION

Bis-class is a tool for classifying methylation status from bisulfite sequencing (BS-seq) data. It works best when both the overall methylation level and the sequencing coverage are low.

:: DEVELOPER

Bioinformatics and Biostatistics Lab, Seoul National University

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux / Windows

:: DOWNLOAD

 Bis-Class

:: MORE INFORMATION

Citation

Huh I, Yang X, Park T, Yi SV.
Bis-class: a new classification tool of methylation status using bayes classifier and local methylation information.
BMC Genomics. 2014 Jul 18;15:608. doi: 10.1186/1471-2164-15-608.

eCAMI – Simultaneous Classification and Motif Identification for enzyme/CAZyme annotation

eCAMI

:: DESCRIPTION

eCAMI is a Python package for simultaneous classification and motif identification in CAZyme and enzyme annotation. Its authors report that (i) eCAMI has the best performance in terms of accuracy and memory use for CAZyme and enzyme EC classification and annotation, and (ii) k-mer-based tools (including PPR-Hotpep, CUPP and eCAMI) perform better than homology-based and deep-learning tools in enzyme EC prediction.

:: DEVELOPER

YIN LAB @ UNL & ZHANG LAB @ NKU

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux
  • Python

:: DOWNLOAD

eCAMI

:: MORE INFORMATION

Citation

Xu J, Zhang H, Zheng J, Dovoedo P, Yin Y.
eCAMI: simultaneous classification and motif identification for enzyme annotation.
Bioinformatics. 2020 Apr 1;36(7):2068-2075. doi: 10.1093/bioinformatics/btz908. PMID: 31794006.

RNA-CODE – Noncoding RNA Classification tool

RNA-CODE

:: DESCRIPTION

RNA-CODE is a comprehensive ncRNA classification tool for very short reads. It is specifically designed for ncRNA identification in NGS data that lack quality reference genomes.

:: DEVELOPER

RNA-CODE team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows / Linux / MacOSX
  • Python

:: DOWNLOAD

 RNA-CODE

:: MORE INFORMATION

Citation

Yuan C, Sun Y.
RNA-CODE: a noncoding RNA classification tool for short reads in NGS data lacking reference genomes.
PLoS One. 2013 Oct 25;8(10):e77596. doi: 10.1371/journal.pone.0077596. eCollection 2013.

HMM-FRAME 20140724 – Protein Domain Classification for Sequencing Reads with Frameshift Errors

HMM-FRAME 20140724

:: DESCRIPTION

HMM-FRAME is a protein domain classification tool based on an augmented Viterbi algorithm that can incorporate error models from different sequencing platforms. HMM-FRAME corrects sequencing errors and classifies putative gene fragments into domain families. It achieved high error detection sensitivity and specificity in a data set with annotated errors.
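For readers unfamiliar with the Viterbi algorithm that HMM-FRAME extends, the Python sketch below shows the standard, unaugmented version: dynamic programming over hidden states to find the most probable state path for an observation sequence. It does not include HMM-FRAME's sequencing-error models, and the two states, the observation symbols and all probabilities are made up for illustration. It assumes every probability is strictly positive so that logarithms are defined.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path and its joint probability."""
    if not observations:
        raise ValueError("need at least one observation")
    # scores[t][s]: best log-probability of any path ending in state s at time t.
    scores = [{
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }]
    back = [{}]  # back[t][s]: best predecessor of state s at time t
    for obs in observations[1:]:
        prev = scores[-1]
        column, pointers = {}, {}
        for s in states:
            best_prev = max(
                states, key=lambda p: prev[p] + math.log(trans_p[p][s])
            )
            column[s] = (
                prev[best_prev]
                + math.log(trans_p[best_prev][s])
                + math.log(emit_p[s][obs])
            )
            pointers[s] = best_prev
        scores.append(column)
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: scores[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, math.exp(scores[-1][last])


# Toy two-state model: is a position inside a domain or in background sequence?
states = ["background", "domain"]
start_p = {"background": 0.6, "domain": 0.4}
trans_p = {
    "background": {"background": 0.7, "domain": 0.3},
    "domain": {"background": 0.4, "domain": 0.6},
}
# Observations are a made-up discretised match score per position.
emit_p = {
    "background": {"low": 0.5, "mid": 0.4, "high": 0.1},
    "domain": {"low": 0.1, "mid": 0.3, "high": 0.6},
}
path, prob = viterbi(["low", "mid", "high"], states, start_p, trans_p, emit_p)
print(path, prob)
```

Working in log space avoids numerical underflow on long sequences, which matters for real profile HMMs with many states.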

:: DEVELOPER

Yanni Sun

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux
  • G++

:: DOWNLOAD

 HMM-FRAME

:: MORE INFORMATION

Citation

Zhang Y, Sun Y.
HMM-FRAME: accurate protein domain classification for metagenomic sequences containing frameshift errors.
BMC Bioinformatics. 2011 May 24;12:198. doi: 10.1186/1471-2105-12-198.

MetaDomain 20140716 – HMM-based Protein Domain Classification tool for Short Sequences

MetaDomain 20140716

:: DESCRIPTION

MetaDomain can accurately align short reads generated by new sequencing technologies to protein domains and annotate domain expression levels.

:: DEVELOPER

Dr. Yanni Sun

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux
  • G++

:: DOWNLOAD

 MetaDomain

:: MORE INFORMATION

Citation

Zhang Y, Sun Y.
Metadomain: a profile HMM-based protein domain classification tool for short sequences.
Pac Symp Biocomput. 2012:271-82.