UProC 1.2.0 – Tools for Ultra-fast Protein Sequence Classification

UProC 1.2.0

:: DESCRIPTION

The UProC (ultrafast protein classification) toolbox implements a novel algorithm (“Mosaic Matching”) for large-scale sequence analysis and is available as an open source C library. UProC is up to three orders of magnitude faster than profile-based methods and achieved up to 80% higher sensitivity on unassembled short reads (100 bp) from simulated metagenomes. UProC does not depend on a multiple alignment of family-specific sequences. Therefore, in addition to protein domain classification according to the Pfam database, UProC can, in principle, also detect KEGG Orthologs.

:: DEVELOPER

Dr. Peter Meinicke

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Windows/Linux/MacOsX
C Compiler

:: DOWNLOAD

UProC

:: MORE INFORMATION

Citation

Bioinformatics. 2014 Dec 23. pii: btu843.
UProC: tools for ultra-fast protein domain classification.
Meinicke P

PISCES 20210711 – Protein Sequence Culling Server

PISCES 20210711

:: DESCRIPTION

PISCES (Protein Sequence Culling Server) is a database server for producing lists of sequences from the Protein Data Bank (PDB) using a number of entry- and chain-specific criteria and mutual sequence identity.
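
The culling idea described above can be sketched as a greedy filter. This is only an illustration under stated assumptions, not the PISCES implementation: the chain identifiers and sequences are made up, the resolution cutoff stands in for the chain-specific criteria, and difflib's similarity ratio is a crude substitute for alignment-based sequence identity.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Crude stand-in for alignment-based sequence identity, in the range 0..1."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cull(chains, max_identity=0.5, max_resolution=2.5):
    """Greedy culling of (chain_id, resolution_in_angstrom, sequence) tuples.

    Chains failing the resolution cutoff are dropped first. The rest are
    visited from best to worst resolution, and a chain is kept only if it
    is not too similar to any chain already kept.
    """
    candidates = sorted(
        (c for c in chains if c[1] <= max_resolution),
        key=lambda c: c[1],
    )
    kept = []
    for chain in candidates:
        if all(identity(chain[2], k[2]) <= max_identity for k in kept):
            kept.append(chain)
    return kept


# Hypothetical input: identifiers and sequences are invented for illustration.
chains = [
    ("1abcA", 1.8, "MKTAYIAKQRQISFVKSHFSRQ"),
    ("2defA", 2.0, "MKTAYIAKQRQISFVKSHFSRA"),  # near-duplicate of 1abcA
    ("3ghiB", 2.2, "GSHMLEDPVTGWQNNCCRRAAA"),
    ("4jklA", 3.5, "AAAAAAAAAAAAAAAAAAAAAA"),  # fails the resolution cutoff
]
kept_ids = [c[0] for c in cull(chains)]
print(kept_ids)
```

Sorting by resolution before filtering means that, of two near-identical chains, the better-resolved one survives.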

:: DEVELOPER

Dunbrack Lab

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Windows / Mac / Linux
Perl

:: DOWNLOAD

PISCES

:: MORE INFORMATION

Citation

Guoli Wang and Roland L. Dunbrack Jr.
“PISCES: a protein sequence culling server”,
Bioinformatics 19:1589-1591, 2003.

C-HMM 1.0 – Program to Identify Remote Homologues from Protein Sequence Database

C-HMM 1.0

:: DESCRIPTION

C-HMM is software for detecting remote (distant) homologues in protein sequence databases. It uses HMMs (Hidden Markov Models) to identify deep evolutionary relationships between protein sequences.

:: DEVELOPER

C-HMM team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux / MacOsX
Java
HMMER3

:: DOWNLOAD

C-HMM

:: MORE INFORMATION

Citation

Rapid and Enhanced Remote Homology Detection by Cascading Hidden Markov Model Searches in Sequence Space.
Kaushik S, Nair AG, Mutt E, Prasanna H, Sowdhamini R.
Bioinformatics. 2015 Oct 10. pii: btv538.

CS-PSeq-Gen 1.0 – Simulation of Protein Sequences under Constraints

CS-PSeq-Gen 1.0

:: DESCRIPTION

CS-PSeq-Gen is derived from PSeq-Gen, a program developed by Nick C. Grassly and Andrew Rambaut to simulate the evolution of protein sequences along evolutionary trees. The CS-PSeq-Gen modifications aim at simulating the evolution of protein sequences under the constraints of a particular reconstructed phylogeny: the “root sequence” that initiates the simulation and the rate heterogeneity among sites are specific to each protein family, and CS-PSeq-Gen allows simulations to take such information into account. In addition, exploring the evolution of a protein family and testing hypotheses often requires some control over the variability of the parameters, so CS-PSeq-Gen allows some control over the simulated tree and branch lengths around an average value. Finally, a particular category of applications for such simulations is the search for significant co-evolution of sites. CS-PSeq-Gen offers some facilities to generate sequences under such hypotheses and proposes a basic scheme for their detection that programmers can easily adapt.
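
A minimal sketch of simulating from a fixed root sequence with site-specific rates, to make the idea above concrete. It is not CS-PSeq-Gen's code: the equal-rates (Poisson) substitution model, the nested-tuple tree format, the example tree and the rate values are all assumptions chosen to keep the example short.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def evolve(seq, branch_length, site_rates, rng):
    """Evolve a sequence along one branch under an equal-rates (Poisson) model."""
    out = []
    for residue, rate in zip(seq, site_rates):
        # Probability that the site ends the branch in a different residue.
        p_change = (19 / 20) * (1 - math.exp(-(20 / 19) * rate * branch_length))
        if rng.random() < p_change:
            residue = rng.choice([a for a in AMINO_ACIDS if a != residue])
        out.append(residue)
    return "".join(out)


def simulate(node, parent_seq, site_rates, rng, leaves=None):
    """Walk a (name, branch_length, children) tree and return {leaf_name: sequence}."""
    leaves = {} if leaves is None else leaves
    name, branch_length, children = node
    seq = evolve(parent_seq, branch_length, site_rates, rng)
    if not children:
        leaves[name] = seq
    for child in children:
        simulate(child, seq, site_rates, rng, leaves)
    return leaves


# Hypothetical inputs: the root sequence, the rates and the tree are made up.
root_seq = "MKTAYIAKQR"
site_rates = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 0.5, 0.5, 1.0, 1.0]  # rate 0 = invariant site
tree = ("root", 0.0, [("A", 0.3, []), ("B", 0.1, [("C", 0.2, []), ("D", 0.2, [])])])

leaves = simulate(tree, root_seq, site_rates, random.Random(42))
for name, seq in sorted(leaves.items()):
    print(name, seq)
```

Sites with rate 0 never change, so the first two residues of every simulated leaf match the root sequence; larger rates make a site more variable across the tree.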

:: DEVELOPER

P. Tufféry (tuffery@ebgm.jussieu.fr)

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux

:: DOWNLOAD

CS-PSeq-Gen

:: MORE INFORMATION

Citation

Tufféry, P.
CS-PSeq-Gen: Simulating the evolution of protein sequence under constraints
Bioinformatics, Volume 18, Number 7, July 2002 , pp. 1015-1016(2)

tantan 22 – Find Cryptic Repeats in DNA, RNA, and Protein Sequences

tantan 22

:: DESCRIPTION

tantan is a tool to mask simple regions (low complexity and short-period tandem repeats) in DNA, RNA, and protein sequences. The aim of tantan is to prevent false predictions when searching for homologous regions between two sequences. Simple repeats often align strongly to each other, causing false homology predictions.
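
To illustrate what masking short-period tandem repeats means, here is a naive sketch that lowercases exact repeats. It is not tantan's method, and the function name and the `max_period` and `min_copies` parameters are hypothetical choices for this example.

```python
def mask_tandem_repeats(seq, max_period=4, min_copies=3):
    """Lowercase exact tandem repeats with a period of at most max_period.

    A region is masked when a unit of length `period` repeats so that the
    whole region spans at least period * min_copies residues.
    """
    seq = seq.upper()
    masked = [False] * len(seq)
    for period in range(1, max_period + 1):
        run_start = None  # first index i of the current run where seq[i] == seq[i - period]
        # Iterate one step past the end so a run reaching the last residue is closed.
        for i in range(period, len(seq) + 1):
            matches = i < len(seq) and seq[i] == seq[i - period]
            if matches and run_start is None:
                run_start = i
            elif not matches and run_start is not None:
                start = run_start - period  # the repeat includes its first unit
                if i - start >= period * min_copies:
                    for j in range(start, i):
                        masked[j] = True
                run_start = None
    return "".join(c.lower() if m else c for c, m in zip(seq, masked))


print(mask_tandem_repeats("GGTCATACACACACGTTCGA"))  # GGTCATacacacacGTTCGA
print(mask_tandem_repeats("MKQQQQQQLV"))            # MKqqqqqqLV
```

Lowercasing rather than deleting the repeat keeps sequence coordinates intact, so a downstream aligner can choose to ignore the masked letters.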

:: DEVELOPER

Martin C. Frith

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux
C++ compiler

:: DOWNLOAD

tantan

:: MORE INFORMATION

Citation

MC Frith,
A new repeat-masking method enables specific detection of homologous sequences,
Nucleic Acids Research 2011 39(4):e23.

AMS 4.0-1.5 – Consensus Prediction of Post-translational Modifications in Protein Sequences

AMS 4.0-1.5

:: DESCRIPTION

AMS (AutoMotif Service) predicts 88 different types of single-amino-acid post-translational modifications (PTMs) in protein sequences.

:: DEVELOPER

AMS team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux
C Compiler

:: DOWNLOAD

AMS

:: MORE INFORMATION

Citation

Amino Acids. 2012 Aug;43(2):573-82. doi: 10.1007/s00726-012-1290-2.
AMS 4.0: consensus prediction of post-translational modifications in protein sequences.
Plewczynski D, Basu S, Saha I.

MMseqs2 R13 – Ultra Fast and Sensitive Protein Search and Clustering Suite

MMseqs2 R13

:: DESCRIPTION

MMseqs2 (Many-against-Many sequence searching) is a software suite for very fast protein sequence searches and clustering of huge protein sequence data sets.

:: DEVELOPER

Söding Lab

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux
C++ Compiler

:: DOWNLOAD

MMseqs2

:: MORE INFORMATION

Citation

MMseqs software suite for fast and deep clustering and searching of large protein sequence sets.
Hauser M, Steinegger M, Söding J.
Bioinformatics. 2016 Jan 6. pii: btw006.

Steinegger M, Söding J.
MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets.
Nature Biotechnology, doi: 10.1038/nbt.3988 (2017).

OMiMa – Identify Functional Motifs in DNA or Protein Sequences

OMiMa

:: DESCRIPTION

The OMiMa (Optimized Mixture Markov model) System is a computational tool for identifying functional motifs in DNA or protein sequences. It is based on an optimized mixture of Markov models that can incorporate most dependencies within a motif. Most importantly, OMiMa can adjust model complexity according to the motif's dependency structure, so it can minimize model complexity without compromising prediction accuracy. OMiMa uses the developers' fast Markov chain optimization method, Directed Neighbor-Joining (DNJ), which makes it more computationally efficient.
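
To show what modelling dependencies within a motif buys, here is a minimal sketch of a single position-specific first-order Markov chain. It is an assumption-laden illustration only: it implements neither OMiMa's optimized mixture nor DNJ, and the training instances, function names and pseudocount are invented for the example.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def train(instances, pseudocount=1.0):
    """Fit a position-specific first-order Markov model to aligned motif instances."""
    width = len(instances[0])
    first = Counter(s[0] for s in instances)
    total = len(instances) + pseudocount * len(ALPHABET)
    start = {a: (first[a] + pseudocount) / total for a in ALPHABET}
    transitions = []  # one table per position 1..width-1, keyed by (previous, current)
    for pos in range(1, width):
        pairs = Counter((s[pos - 1], s[pos]) for s in instances)
        table = {}
        for prev in ALPHABET:
            row_total = sum(pairs[(prev, a)] for a in ALPHABET) + pseudocount * len(ALPHABET)
            for a in ALPHABET:
                table[(prev, a)] = (pairs[(prev, a)] + pseudocount) / row_total
        transitions.append(table)
    return start, transitions


def log_likelihood(model, seq):
    """Log-probability of seq under the model returned by train()."""
    start, transitions = model
    score = math.log(start[seq[0]])
    for pos, table in enumerate(transitions, start=1):
        score += math.log(table[(seq[pos - 1], seq[pos])])
    return score


# Hypothetical training set with two motif variants whose positions are coupled.
model = train(["ACGT", "ACGT", "TGCA", "TGCA"])
print(log_likelihood(model, "ACGT"))  # a variant seen in training
print(log_likelihood(model, "ACCA"))  # mixes the two variants, scores lower
```

A model that treats positions independently would score "ACGT" and "ACCA" the same here, because every column of the training set is an even split between two letters. Conditioning each position on the previous one is what separates them.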

:: DEVELOPER

OMiMa team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux

:: DOWNLOAD

OMiMa

:: MORE INFORMATION

Citation

Weichun Huang, David M Umbach, Uwe Ohler, Leping Li.
Optimized mixed Markov models for motif identification.
BMC Bioinformatics 2006, 7:279

FoldAmyloid – Prediction of Amyloidogenic Regions from Protein Sequence

FoldAmyloid

:: DESCRIPTION

FoldAmyloid is a web server for the prediction of amyloidogenic regions in protein chains.

:: DEVELOPER

the BioInformatics Group

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Web Browser

:: DOWNLOAD

:: MORE INFORMATION

Citation

Bioinformatics. 2010 Feb 1;26(3):326-32. doi: 10.1093/bioinformatics/btp691. Epub 2009 Dec 17.
FoldAmyloid: a method of prediction of amyloidogenic regions from protein sequence.
Garbuzynskiy SO, Lobanov MY, Galzitskaya OV.

pviz 0.1.12 – Dynamic JavaScript & SVG library for Visualization of Protein Sequence Features

pviz 0.1.12

:: DESCRIPTION

pViz.js is a visualization library for displaying protein sequence features in a web browser.

:: DEVELOPER

pviz team

:: SCREENSHOTS

:: REQUIREMENTS

Windows/Linux/MacOsX
JavaScript

:: DOWNLOAD

pviz

:: MORE INFORMATION

Citation

Visualization of protein sequence features using JavaScript and SVG with pViz.js.
Mukhyala K, Masselot A.
Bioinformatics. 2014 Aug 21. pii: btu567.