PhyloCSF 20121028 / PhyloCSF++ v1.1.0 – Distinguish Protein-coding and Non-coding Regions

PhyloCSF 20121028 / PhyloCSF++ v1.1.0

:: DESCRIPTION

PhyloCSF is a method to determine whether a multi-species nucleotide sequence alignment is likely to represent a protein-coding region. PhyloCSF does not rely on homology to known protein sequences; instead, it examines evolutionary signatures characteristic of alignments of conserved coding regions, such as the high frequencies of synonymous codon substitutions and conservative amino acid substitutions, and the low frequencies of other missense and nonsense substitutions (CSF = Codon Substitution Frequencies).
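The substitution categories named above can be illustrated with a minimal Python sketch. It only counts synonymous, missense, and nonsense codon differences between two aligned, in-frame sequences using the standard genetic code; it is not the statistical model that PhyloCSF itself applies, and the function name `classify_substitutions` is a hypothetical helper introduced for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitutions(ref, other):
    """Count codon differences between two aligned, in-frame sequences.

    Hypothetical helper: codons containing gaps or ambiguous bases are skipped.
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have the same length")
    counts = Counter()
    for i in range(0, len(ref) - 2, 3):
        a = ref[i:i + 3].upper()
        b = other[i:i + 3].upper()
        if a == b or a not in CODON_TABLE or b not in CODON_TABLE:
            continue
        aa_a, aa_b = CODON_TABLE[a], CODON_TABLE[b]
        if aa_a == aa_b:
            counts["synonymous"] += 1
        elif "*" in (aa_a, aa_b):
            counts["nonsense"] += 1  # a stop codon gained or lost
        else:
            counts["missense"] += 1
    return counts


if __name__ == "__main__":
    print(classify_substitutions("ATGGCTTTAAAAGAT", "ATGGCCCTATAAGAA"))
```

In a conserved coding region one would expect the synonymous count to dominate; a real analysis weighs these events across a phylogeny rather than counting them pairwise.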

PhyloCSF++ is an efficient and parallelized C++ implementation of the popular PhyloCSF method to distinguish protein-coding and non-coding regions in a genome based on multiple sequence alignments.

:: DEVELOPER

Mike Lin / Christopher Pockrandt

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

PhyloCSF / PhyloCSF++

:: MORE INFORMATION

Citation

Pockrandt C, Steinegger M, Salzberg SL.
PhyloCSF++: A fast and user-friendly implementation of PhyloCSF with annotation tools.
Bioinformatics. 2021 Nov 4:btab756. doi: 10.1093/bioinformatics/btab756. Epub ahead of print. PMID: 34734986.

Lin MF, Jungreis I, and Kellis M (2011).
PhyloCSF: a comparative genomics method to distinguish protein-coding and non-coding regions.
Bioinformatics (2011) 27 (13): i275-i282.

ISOGO – Functional Annotation of Protein-coding Splice Variants

ISOGO

:: DESCRIPTION

ISOGO (ISOform + GO function imputation) is a prediction model of isoform functions based on correlation of isoform expression and protein domains.

:: DEVELOPER

ISOGO team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows / Linux / MacOsX
  • R

:: DOWNLOAD

ISOGO

:: MORE INFORMATION

Citation

Ferrer-Bonsoms JA, Cassol I, Fernández-Acín P, Castilla C, Carazo F, Rubio A.
ISOGO: Functional annotation of protein-coding splice variants.
Sci Rep. 2020 Jan 23;10(1):1069. doi: 10.1038/s41598-020-57974-z. PMID: 31974522; PMCID: PMC6978412.

PaPI – Pseudo Amino Acid Composition to Score human Protein-coding Variants

PaPI

:: DESCRIPTION

PaPI is a machine-learning approach that classifies and scores human coding variants by estimating the probability that they damage protein-related function.

:: DEVELOPER

laboratorio di Bioinformatica e Biologia Sintetica – Univ. of Pavia

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Web browser

:: DOWNLOAD

 NO

:: MORE INFORMATION

Citation

PaPI: pseudo amino acid composition to score human protein-coding variants.
Limongelli I, Marini S, Bellazzi R.
BMC Bioinformatics. 2015 Apr 19;16(1):123. doi: 10.1186/s12859-015-0554-8.

CisRNA-SVM – Identification of Structured Cis-regulatory Elements in the 3’UTRs of human Protein-coding mRNAs

CisRNA-SVM

:: DESCRIPTION

CisRNA-SVM is software for genome-wide prediction of novel structured RNA cis-regulatory elements in human 3′ UTRs.

:: DEVELOPER

Dr Chris Brown’s Research Group

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows/ Linux
  • Perl

:: DOWNLOAD

CisRNA-SVM

:: MORE INFORMATION

Citation:

Nucleic Acids Res. 2012 Oct;40(18):8862-73. doi: 10.1093/nar/gks684. Epub 2012 Jul 20.
Computational identification of new structured cis-regulatory elements in the 3′-untranslated region of human protein coding genes.
Chen XS, Brown CM.

CNCI v2 – Distinguish Protein-coding and Non-coding Sequences

CNCI v2

:: DESCRIPTION

CNCI (Coding-Non-Coding Index) is a signature tool that profiles adjoining nucleotide triplets to distinguish protein-coding from non-coding sequences independently of known annotations.
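As a rough illustration of what profiling adjoining nucleotide triplets can mean, the Python sketch below counts how often each pair of neighbouring triplets occurs in one reading frame. It is an assumption-laden toy, not CNCI's actual scoring procedure, and `adjoining_triplet_counts` is a hypothetical name.

```python
from collections import Counter


def adjoining_triplet_counts(seq, frame=0):
    """Count pairs of neighbouring triplets in the given reading frame.

    Hypothetical helper for illustration; trailing bases that do not
    fill a complete triplet are ignored.
    """
    seq = seq.upper()
    triplets = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # Pair each triplet with the one that immediately follows it.
    return Counter(zip(triplets, triplets[1:]))


if __name__ == "__main__":
    for pair, n in adjoining_triplet_counts("ATGAAATTTAAATTT").items():
        print(pair, n)
```

A classifier could compare such pair frequencies between known coding and non-coding training sequences; how CNCI turns them into a score is not described here.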

:: DEVELOPER

CNCI team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

 CNCI

:: MORE INFORMATION

Citation

Nucleic Acids Res. 2013 Sep;41(17):e166. doi: 10.1093/nar/gkt646. Epub 2013 Jul 27.
Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts.
Sun L, Luo H, Bu D, Zhao G, Yu K, Zhang C, Liu Y, Chen R, Zhao Y.

Prodigal 2.6.3 – Protein-coding Gene Prediction for Prokaryotic Genomes

Prodigal 2.6.3

:: DESCRIPTION

Prodigal (Prokaryotic Dynamic Programming Genefinding Algorithm) is a microbial (bacterial and archaeal) gene finding program.

:: DEVELOPER

Doug Hyatt

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows/Linux/MacOsX

:: DOWNLOAD

 Prodigal

:: MORE INFORMATION

Citation

BMC Bioinformatics. 2010 Mar 8;11:119. doi: 10.1186/1471-2105-11-119.
Prodigal: prokaryotic gene recognition and translation initiation site identification.
Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ.

CPC 2 – Assess Protein-coding Potential of Transcripts

CPC 2

:: DESCRIPTION

CPC (Coding Potential Calculator) is software that assesses the protein-coding potential of a transcript (i.e., whether a cDNA/RNA transcript could encode a peptide) based on six biologically meaningful sequence features. Tenfold cross-validation on the training dataset and further testing on several large datasets showed that CPC can discriminate coding from noncoding transcripts with high accuracy. Furthermore, CPC runs an order of magnitude faster than a previous state-of-the-art tool and has higher accuracy.
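One sequence feature that coding-potential tools commonly draw on is the open reading frame (ORF). The Python sketch below finds the longest ATG-to-stop ORF on the forward strand and reports the fraction of the transcript it covers. It is a simplified illustration under that assumption, not CPC's feature extraction, and the helper names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest forward-strand ORF, end exclusive.

    Hypothetical helper: an ORF runs from the first ATG in a frame through
    the next in-frame stop codon (stop included). Returns (0, 0) if none.
    """
    seq = seq.upper().replace("U", "T")
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > best[1] - best[0]:
                    best = (start, i + 3)
                start = None
    return best


def orf_coverage(seq):
    """Fraction of the transcript covered by its longest ORF."""
    start, end = longest_orf(seq)
    return (end - start) / len(seq) if seq else 0.0


if __name__ == "__main__":
    transcript = "CCATGAAATTTTAGGG"
    print(longest_orf(transcript), orf_coverage(transcript))
```

Long ORFs with high coverage are more typical of coding transcripts, which is why features of this kind are useful inputs to a classifier.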

:: DEVELOPER

Gao Lab, Peking University.

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

 CPC

:: MORE INFORMATION

Citation:

Kang YJ, Yang DC, Kong L, Hou M, Meng YQ, Wei L, Gao G.
CPC2: a fast and accurate coding potential calculator based on sequence intrinsic features.
Nucleic Acids Res. 2017 Jul 3;45(W1):W12-W16. doi: 10.1093/nar/gkx428. PMID: 28521017; PMCID: PMC5793834.

Kong, L., Y. Zhang, Z.Q. Ye, X.Q. Liu, S.Q. Zhao, L. Wei, and G. Gao. 2007.
CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine.
Nucleic Acids Res 36: W345-349.

Crann 1.04 – Detect Adaptive Evolution in Protein-coding DNA Sequences

Crann 1.04

:: DESCRIPTION

Crann (pronounced ‘crown’) is the Irish word for ‘tree’. Crann has been developed to provide fast heuristic methods of detecting adaptive evolution in protein-coding genes. It is important that the user understands the advantages and limitations of these methods. It is also important for the user to know that the software is designed to perform a number of different tasks; however, the interpretation of the results is left entirely to the user.

:: DEVELOPER

McInerney lab

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Windows / Linux / MacOS

:: DOWNLOAD

Crann

:: MORE INFORMATION

Citation:

Creevey, C. and J. O. McInerney (2003).
CRANN: Detecting adaptive evolution in protein-coding DNA sequences
Bioinformatics (2003) 19: 1726.

RNAcode 0.3 – Analysis of Protein Coding Potential in Multiple Sequence Alignments

RNAcode 0.3

:: DESCRIPTION

RNAcode predicts protein-coding regions in a set of homologous nucleotide sequences. RNAcode relies on evolutionary signatures including synonymous/conservative mutations and conservation of the reading frame. It does not use any species-specific sequence characteristics whatsoever and does not use any machine learning techniques.
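Conservation of the reading frame can be illustrated with a very small Python check: in an aligned sequence, gap runs whose length is a multiple of three leave the downstream frame intact, while other gap lengths shift it. This is a crude sketch of the idea only, not RNAcode's scoring scheme, and `gaps_preserve_frame` is a hypothetical name.

```python
import re


def gaps_preserve_frame(aligned_seq):
    """Return True if every run of '-' has a length divisible by three.

    Hypothetical helper for illustration; a sequence without gaps
    trivially preserves the frame.
    """
    return all(
        len(match.group()) % 3 == 0
        for match in re.finditer(r"-+", aligned_seq)
    )


if __name__ == "__main__":
    for row in ("ATG---AAATAA", "ATG--AAATAA"):
        print(row, gaps_preserve_frame(row))
```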

:: DEVELOPER

Goldman Group

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

 RNAcode

:: MORE INFORMATION

Citation:

RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data.
Washietl S, Findeiss S, Müller SA, Kalkhof S, von Bergen M, Hofacker IL, Stadler PF, Goldman N.
RNA. 2011;17(4):578-94.
