seqjoin predicts the complete cDNA insert sequence of partially sequenced cDNA clones. Each clone’s partial experimental sequence is matched against a database of complete cDNA sequences. If a match is found, the clone’s insert sequence is predicted from the vector sequence, the matched database cDNA entry, and the partial experimental clone sequence. seqjoin works on the output of the sequence analysis programs phred, phrap and cross_match, and also uses the EMBOSS package.
Orphelia is a metagenomic ORF finding tool for the prediction of protein coding genes in short environmental DNA sequences of unknown phylogenetic origin [1]. Orphelia is based on a two-stage machine learning approach that was recently introduced by our group. After the initial extraction of open reading frames (ORFs), linear discriminants are used to extract features from those ORFs. Subsequently, an artificial neural network combines the features and computes a gene probability for each ORF in a fragment. Finally, a greedy strategy selects a likely combination of high-scoring ORFs subject to an overlap constraint.
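The final greedy step described for Orphelia can be illustrated with a minimal Python sketch. This is not Orphelia's code: the `ScoredORF` type, the `greedy_select` function, and the `min_prob` and `max_overlap` thresholds (with their default values) are assumptions made for illustration only. The sketch takes ORFs that already carry a gene probability, visits them from highest to lowest probability, and keeps an ORF only if it does not overlap any already accepted ORF by more than the allowed number of bases.

```python
from typing import List, NamedTuple


class ScoredORF(NamedTuple):
    """Hypothetical record: an ORF with its predicted gene probability."""
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    prob: float  # gene probability, e.g. from a neural network


def overlap(a: ScoredORF, b: ScoredORF) -> int:
    """Number of positions shared by two ORFs (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def greedy_select(orfs: List[ScoredORF],
                  min_prob: float = 0.5,    # assumed threshold
                  max_overlap: int = 60     # assumed overlap limit
                  ) -> List[ScoredORF]:
    """Accept ORFs in order of decreasing probability, skipping any ORF
    that overlaps an already accepted one by more than max_overlap."""
    chosen: List[ScoredORF] = []
    for orf in sorted(orfs, key=lambda o: o.prob, reverse=True):
        if orf.prob < min_prob:
            break  # all remaining candidates score lower still
        if all(overlap(orf, c) <= max_overlap for c in chosen):
            chosen.append(orf)
    # Report the accepted ORFs in fragment order.
    return sorted(chosen, key=lambda o: o.start)


if __name__ == "__main__":
    candidates = [
        ScoredORF(0, 300, 0.9),
        ScoredORF(100, 400, 0.8),   # overlaps the first ORF by 200 bases
        ScoredORF(280, 600, 0.7),   # overlaps the first ORF by 20 bases
        ScoredORF(650, 700, 0.2),   # below the probability threshold
    ]
    for orf in greedy_select(candidates):
        print(orf)
```

A greedy pass like this is simple and fast, but it does not guarantee the combination with the highest total score; it only ensures that no lower-scoring ORF displaces a higher-scoring one it conflicts with.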
NucPred (pronounced newk-pred) analyses a eukaryotic protein sequence and predicts whether the protein spends at least some time in the nucleus or no time in the nucleus. Keep in mind that proteins can have multiple functions and/or multiple subcellular locations. However, if a protein is already known to be secreted or to be an integral membrane protein, a second role as a nuclear protein is unlikely. NucPred will nevertheless make a small number of confident nuclear predictions for such proteins, contradicting what is already known, so please use all sources of biological information (both real and predicted) when interpreting the results.