NLProt 2.0 – Mining Natural Language text for PROTein names and their UniProt-IDs

NLProt 2.0

:: DESCRIPTION

NLProt is a tool for finding protein names in natural-language text. It is based on Support Vector Machines (SVMs) trained on contextual features of named entities in scientific language. Additionally, simple filtering rules and a protein-name dictionary are used to increase performance. NLProt reached a precision (accuracy) of 70% at a recall (coverage) of 85% when run on the 166 most recent abstracts of EMBL and Cell.
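To make the two evaluation figures concrete, the following is a minimal sketch of how precision and recall are computed for a name tagger. It is not part of NLProt; the function name `precision_recall` and the toy name sets are assumptions made up for illustration.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of tagged names."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted names that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of true names that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the NLProt evaluation.
gold = {"p53", "BRCA1", "insulin", "Ras"}
predicted = {"p53", "BRCA1", "insulin", "DNA", "PCR"}
precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the five predicted names are correct, and three of the four true names are found, so the tagger trades some precision for recall.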

:: DEVELOPER

Rost Lab

:: SCREENSHOTS

:: REQUIREMENTS

  • Linux / Mac OS X

:: DOWNLOAD

  NLProt

:: MORE INFORMATION

Citation

NLProt: extracting protein names and sequences from papers.
Mika S, Rost B.
Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W634-7.

RenBio 0.7d – Identify Gene and Protein Name in Textual Document

RenBio 0.7d

:: DESCRIPTION

RenBio is a program that identifies gene and protein names in a textual document using machine learning techniques. RenBio searches for named entities in a document according to a decision tree. The attributes of the tree nodes may be regex matches, dictionary matches or signal words.
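The three kinds of node attributes can be illustrated with a minimal sketch. This is not RenBio's code: the pattern, the dictionary, the signal words, and the function names below are all hypothetical, and the tree is written by hand here, whereas the description says RenBio relies on machine learning techniques.

```python
import re

# Hypothetical resources; RenBio's real patterns and word lists are not given here.
NAME_PATTERN = re.compile(r"^[A-Za-z]+[0-9]+[A-Za-z0-9]*$")  # e.g. "p53", "BRCA1"
DICTIONARY = {"insulin", "actin", "p53"}
SIGNAL_WORDS = {"protein", "gene", "kinase", "receptor"}


def features(tokens, i):
    """Compute the three attribute types for the token at position i."""
    token = tokens[i]
    neighbours = tokens[max(0, i - 1):i] + tokens[i + 1:i + 2]
    return {
        "regex": bool(NAME_PATTERN.match(token)),
        "dictionary": token.lower() in DICTIONARY,
        "signal": any(n.lower() in SIGNAL_WORDS for n in neighbours),
    }


def is_name(f):
    """A tiny hand-written decision tree over the attributes."""
    if f["dictionary"]:
        return True
    if f["regex"]:
        # A name-like shape only counts next to a signal word.
        return f["signal"]
    return False


def tag(text):
    """Return the tokens that the tree classifies as gene or protein names."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    return [tokens[i] for i in range(len(tokens)) if is_name(features(tokens, i))]


print(tag("The BRCA1 gene interacts with insulin in 2004"))
```

In a learned tree, the order of the tests and the outcome at each leaf would be chosen from annotated training text rather than fixed by hand.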

:: DEVELOPER

Robert Bossy <Robert.Bossy@jouy.inra.fr>

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

  RenBio

:: MORE INFORMATION