EPC (Evolutionary Feature Construction) is a method for predicting antimicrobial peptides. It constructs more complex sequence-based features that capture information about local and distal patterns within a peptide.
ChemSpot is a set of tools for named entity recognition and classification of chemicals in natural language texts, including trivial names, abbreviations, molecular formulas and IUPAC entities.
Neji is an innovative and powerful framework for faster biomedical concept recognition. It is open source and built around four key characteristics: modularity, scalability, speed, and usability. Neji integrates modules of various state-of-the-art methods for biomedical natural language processing (e.g., sentence splitting, tokenization, lemmatization, part-of-speech tagging, chunking and dependency parsing) and concept recognition (e.g., dictionaries and machine learning). The most popular input and output formats, such as PubMed XML, IeXML, CoNLL and A1, are also supported.
BANNER is a named entity recognition system, primarily intended for biomedical text. It is a machine-learning system based on conditional random fields and contains a wide survey of the best features in recent literature on biomedical named entity recognition (NER). BANNER is portable and is designed to maximize domain independence by not employing semantic features or rule-based processing steps. It is therefore useful to developers as an extensible NER implementation, to researchers as a standard for comparing innovative techniques, and to biologists requiring the ability to find novel entities in large amounts of text.
SEGMER is a segmental threading algorithm designed to recognize substructure motifs from the Protein Data Bank (PDB) library. It first splits target sequences into segments that consist of 2-4 consecutive or non-consecutive secondary structure elements (alpha-helix, beta-strand). The sequence segments are then threaded through the PDB to identify conserved substructures. It often identifies better-conserved structure motifs than whole-chain threading methods, especially when no similar global fold exists in the PDB.
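The segment-splitting step can be illustrated with a minimal sketch. It assumes the secondary structure is given as a three-state string (H = helix, E = strand, C = coil) and enumerates only windows of consecutive elements; non-consecutive combinations are not covered. The function names are hypothetical and are not taken from SEGMER itself.

```python
from itertools import groupby


def secondary_structure_elements(ss):
    """Return (type, start, end) for each helix (H) or strand (E) run.

    `end` is exclusive; coil residues (any other symbol) are skipped.
    """
    elements = []
    pos = 0
    for state, run in groupby(ss):
        length = len(list(run))
        if state in ("H", "E"):
            elements.append((state, pos, pos + length))
        pos += length
    return elements


def consecutive_segments(ss, min_elems=2, max_elems=4):
    """Enumerate residue ranges spanning 2-4 consecutive elements."""
    elems = secondary_structure_elements(ss)
    segments = []
    for size in range(min_elems, max_elems + 1):
        for i in range(len(elems) - size + 1):
            window = elems[i:i + size]
            # Range runs from the first element's start to the last element's end.
            segments.append((window[0][1], window[-1][2]))
    return segments


# Hypothetical secondary-structure string, used only for illustration.
ss = "CCHHHHHCCEEEECCCHHHHHHCCEEEC"
segments = consecutive_segments(ss)
```

Each returned range could then be cut out of the target sequence and threaded on its own.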
SPRING is a template-based algorithm for protein-protein structure prediction. It first threads one chain of the protein complex through the PDB library, with the binding partners retrieved from the original oligomer entries. The complex models associated with the other chain are deduced from a pre-calculated look-up table, with the best orientation selected by the SPRING-score, which is a combination of threading Z-score, interface contacts, and TM-align match between monomer-to-dimer templates.
SAXSTER is a new algorithm that combines small-angle X-ray scattering (SAXS) data and threading for high-resolution protein structure determination. Given a query sequence, SAXSTER first generates a list of template alignments from the PDB library using the MUSTER threading program. The SAXS data are then used to prioritize the best template alignments based on the SAXS profile match, and the selected alignments are finally used for full-length atomic protein structure construction.
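As a rough illustration of prioritizing templates by SAXS profile match, the sketch below ranks candidates by a scaled chi-squared between an experimental profile and a profile computed for each template. The chi-squared measure, the function names, and the sample numbers are assumptions made for illustration; the source does not specify SAXSTER's actual scoring function, and computing a profile from a template model is outside this sketch.

```python
def scaled_chi2(i_exp, i_calc, sigma):
    """Chi-squared per point after fitting the best scale factor for i_calc."""
    weights = [1.0 / (s * s) for s in sigma]
    # Least-squares scale factor c minimizing sum(w * (i_exp - c * i_calc)^2).
    num = sum(w * e * m for w, e, m in zip(weights, i_exp, i_calc))
    den = sum(w * m * m for w, m in zip(weights, i_calc))
    c = num / den
    residual = sum(w * (e - c * m) ** 2 for w, e, m in zip(weights, i_exp, i_calc))
    return residual / len(i_exp)


def rank_templates(i_exp, sigma, template_profiles):
    """Return template names sorted from best to worst profile match."""
    return sorted(
        template_profiles,
        key=lambda name: scaled_chi2(i_exp, template_profiles[name], sigma),
    )


# Hypothetical intensities sampled at the same scattering vectors q.
i_exp = [10.0, 8.0, 5.0, 2.0]
sigma = [1.0, 1.0, 1.0, 1.0]
profiles = {
    "template_A": [5.0, 4.0, 2.5, 1.0],   # same shape as i_exp, different scale
    "template_B": [10.0, 2.0, 5.0, 8.0],  # different shape
}
ranking = rank_templates(i_exp, sigma, profiles)
```

Fitting the scale factor first means templates are compared on profile shape rather than on absolute intensity.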
RW (Random-Walk) is a distance-dependent atomic potential for protein structure modeling and structure decoy recognition. It was derived from 1,383 high-resolution PDB structures using an ideal random-walk chain as the reference state. The RW potential has been extensively optimized and tested on a variety of protein structure decoy sets. It demonstrates significant power in protein structure recognition and a strong correlation with the RMSD of decoys to the native structures.
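The general idea of a distance-dependent statistical potential can be sketched as follows, assuming the usual inverse-Boltzmann form E(r) = -kT ln(N_obs(r) / N_ref(r)). The reference counts are passed in as plain numbers; how RW derives them from a random-walk chain is not reproduced here, and the function names, bin width, and counts are hypothetical.

```python
import math


def statistical_potential(observed, reference, kT=1.0):
    """Energy per distance bin: -kT * ln(observed / reference).

    Both inputs are positive counts for the same distance bins.
    """
    return [-kT * math.log(o / r) for o, r in zip(observed, reference)]


def score_distances(distances, potential, bin_width=0.5):
    """Sum bin energies over atom-pair distances; out-of-range pairs are ignored."""
    total = 0.0
    for d in distances:
        b = int(d / bin_width)
        if 0 <= b < len(potential):
            total += potential[b]
    return total


# Hypothetical counts for three distance bins.
observed = [200.0, 100.0, 50.0]
reference = [100.0, 100.0, 100.0]
potential = statistical_potential(observed, reference)
total = score_distances([0.2, 0.7, 1.4, 9.9], potential)
```

Bins that are observed more often than the reference state predicts get negative (favorable) energies, so lower totals indicate more native-like decoys.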
LIBRA is based on a graph theory approach to find the largest subset of similar residues between an input protein and a collection of known functional sites.
LIBRA+ is an upgraded version of LIBRA, a tool that, given a protein’s structural model, predicts the presence and identity of active sites and/or ligand binding sites. The algorithm implemented by LIBRA+ is based on a graph theory approach to find the largest subset of similar residues between an input protein and a collection of known functional sites. For this purpose, the algorithm makes use of two predefined databases for active sites and ligand binding sites, respectively derived from the Catalytic Site Atlas and the Protein Data Bank.
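One common way to cast "largest subset of similar residues" as a graph problem is maximum-clique detection on a correspondence graph. The sketch below shows that formulation under stated assumptions: residues are reduced to a type and one coordinate, two residue pairings are compatible when their internal distances agree within a tolerance, and the clique search is a plain Bron-Kerbosch recursion. Whether LIBRA+ uses exactly this formulation is an assumption, and all names and coordinates are hypothetical.

```python
import math
from itertools import combinations


def correspondence_graph(query, site, tol=1.0):
    """Nodes pair same-type residues (i in query, j in site).

    Two nodes are linked when the query distance and the site distance
    between the paired residues differ by at most `tol`.
    """
    nodes = [
        (i, j)
        for i, (q_type, _) in enumerate(query)
        for j, (s_type, _) in enumerate(site)
        if q_type == s_type
    ]
    adj = {n: set() for n in nodes}
    for (i1, j1), (i2, j2) in combinations(nodes, 2):
        if i1 == i2 or j1 == j2:
            continue  # a residue may be matched only once
        d_query = math.dist(query[i1][1], query[i2][1])
        d_site = math.dist(site[j1][1], site[j2][1])
        if abs(d_query - d_site) <= tol:
            adj[(i1, j1)].add((i2, j2))
            adj[(i2, j2)].add((i1, j1))
    return adj


def max_clique(adj):
    """Largest clique via Bron-Kerbosch without pivoting (small graphs only)."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best


# Hypothetical functional site and query protein (type, coordinate).
site = [("HIS", (0, 0, 0)), ("ASP", (3, 0, 0)), ("SER", (0, 4, 0))]
query = [
    ("HIS", (10, 10, 10)), ("ASP", (13, 10, 10)), ("SER", (10, 14, 10)),
    ("HIS", (30, 30, 30)), ("GLY", (50, 50, 50)),
]
match = sorted(max_clique(correspondence_graph(query, site)))
```

The resulting clique lists which query residue corresponds to which site residue; its size serves as a simple measure of how much of the known site was recovered.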
LIBRA Web Application (LIBRAWA) is an online portal where users can exploit LIBRA+’s capabilities in recognizing the presence and identity of active sites and/or ligand binding sites given a protein’s structural model. With a free registration, users are given a personal space where they can launch and schedule multiple recognitions, check the resulting three-dimensional alignments and browse ligand clusters. Results produced in LIBRAWA are backward-compatible with LIBRA+ and can thus be exported in LIBRA+’s format to be accessed offline from the desktop application.