mobility 0.1 – Genomic Fluidity Perl & MATLAB Scripts

mobility 0.1

:: DESCRIPTION

mobility is a collection of Perl scripts to calculate gene-level similarity among annotated genomes. The scripts can be executed from the command line and the only dependencies are BioPerl and NCBI BLAST. Separately, we include a MATLAB script to calculate fluidity and its variance directly from matrices of shared and total gene counts.
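As a rough illustration of the fluidity calculation (not the bundled MATLAB script), the Python sketch below assumes the pairwise definition from the cited paper: fluidity is the average, over all genome pairs, of the genes unique to either genome divided by the total genes in both. The function name and the input layout (a symmetric matrix of shared counts plus a per-genome vector of totals) are assumptions made for this example, and the variance estimate is omitted.

```python
from itertools import combinations

def genomic_fluidity(shared, total):
    """Average fraction of unique genes over all genome pairs.

    shared[k][l]: number of genes shared by genomes k and l (assumed symmetric)
    total[k]:     total number of genes in genome k
    """
    n = len(total)
    if n < 2:
        raise ValueError("need at least two genomes")
    ratios = []
    for k, l in combinations(range(n), 2):
        both = total[k] + total[l]
        unique = both - 2 * shared[k][l]  # genes found in only one of the two genomes
        ratios.append(unique / both)
    return sum(ratios) / len(ratios)

# Made-up counts for three genomes, for illustration only
shared = [[100, 80, 60],
          [80, 100, 70],
          [60, 70, 100]]
total = [100, 100, 100]
print(round(genomic_fluidity(shared, total), 3))
```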

:: DEVELOPER

WeitzGroup@GeorgiaTech

:: SCREENSHOTS

N/A

:: REQUIREMENTS

:: DOWNLOAD

mobility

:: MORE INFORMATION

Citation:

Kislyuk et al.
Genomic fluidity: an integrative view of gene diversity within microbial populations
BMC Genomics 12: 32 (2011).

GeneMark 2.5 – Gene Prediction Programs

GeneMark 2.5

:: DESCRIPTION

GeneMark, developed in 1993, was the first gene-finding method recognized as an efficient and accurate tool for genome projects. GeneMark was used for annotation of the first completely sequenced bacterium, Haemophilus influenzae, and the first completely sequenced archaeon, Methanococcus jannaschii. The GeneMark algorithm uses species-specific inhomogeneous Markov chain models of protein-coding DNA sequence as well as homogeneous Markov chain models of non-coding DNA. Parameters of the models are estimated from training sets of sequences of known type. The major step of the algorithm computes the a posteriori probability that a sequence fragment carries genetic code in one of six possible frames (including three frames in the complementary DNA strand) or is “non-coding”.
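The posterior step can be sketched with Bayes' rule over seven competing models (six coding frames plus non-coding). This is a minimal Python illustration, not GeneMark's code: the log-likelihoods and priors below are invented numbers, and in the real program the likelihoods would come from the trained Markov chain models.

```python
import math

def frame_posteriors(log_likelihoods, priors):
    """Return P(model | fragment) for each model using Bayes' rule."""
    log_joint = {m: log_likelihoods[m] + math.log(priors[m]) for m in log_likelihoods}
    top = max(log_joint.values())  # subtract the maximum for numerical stability
    weights = {m: math.exp(v - top) for m, v in log_joint.items()}
    norm = sum(weights.values())
    return {m: w / norm for m, w in weights.items()}

models = ["+1", "+2", "+3", "-1", "-2", "-3", "non-coding"]
# Hypothetical log-likelihoods of one fragment under each model
log_lik = dict(zip(models, [-130.0, -141.0, -140.5, -142.0, -141.5, -143.0, -138.0]))
# Hypothetical priors: half the mass on non-coding, the rest split over the frames
priors = {m: (0.5 if m == "non-coding" else 0.5 / 6) for m in models}

post = frame_posteriors(log_lik, priors)
best = max(post, key=post.get)
print(best, round(post[best], 3))
```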

GeneMark is documented as the most accurate prokaryotic gene finder.

The GeneMark.hmm-P and GeneMark.hmm-E programs predict genes and intergenic regions in a sequence as a whole. They use hidden Markov models reflecting the “grammar” of gene organization.

The GeneMark.hmm (P and E) programs identify the maximum likelihood parse of the whole DNA sequence into protein-coding genes (with possible introns) and intergenic regions.
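The idea of a most likely parse can be illustrated with the Viterbi algorithm on a toy two-state hidden Markov model. The Python sketch below is a deliberate simplification: the states ("C" for coding, "N" for intergenic) and all probabilities are made up for the example, and the actual GeneMark.hmm models (frames, gene lengths, introns) are considerably richer.

```python
import math

def viterbi(seq, states, start, trans, emit):
    """Return the most probable state path for seq under a discrete HMM."""
    score = {s: math.log(start[s]) + math.log(emit[s][seq[0]]) for s in states}
    back = []
    for symbol in seq[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            # Best predecessor state for ending in state s at this position
            best = max(states, key=lambda p: prev[p] + math.log(trans[p][s]))
            score[s] = prev[best] + math.log(trans[best][s]) + math.log(emit[s][symbol])
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state
    state = max(states, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Toy parameters (assumptions): GC-rich "coding" state, AT-rich "intergenic" state
states = ["C", "N"]
start = {"C": 0.5, "N": 0.5}
trans = {"C": {"C": 0.9, "N": 0.1}, "N": {"C": 0.1, "N": 0.9}}
emit = {"C": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4},
        "N": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1}}

seq = "ATATAT" + "GCGCGCGC" + "ATATAT"
parse = "".join(viterbi(seq, states, start, trans, emit))
print(seq)
print(parse)
```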

To analyze ESTs and cDNAs you can use GeneMark-E.

:: DEVELOPER

Mark Borodovsky, Georgia Institute of Technology, Atlanta, Georgia, USA

:: REQUIREMENTS

  • Linux / Mac OS X

:: DOWNLOAD

GeneMark

:: MORE INFORMATION

Citation

Borodovsky M. and McIninch J.
GeneMark: parallel gene recognition for both DNA strands,
Computers & Chemistry, 1993, Vol. 17, No. 2, pp. 123-133.

Besemer J., Lomsadze A. and Borodovsky M.,
GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.
Nucleic Acids Research, 2001, Vol. 29, No. 12, 2607-2618

AGMIAL – Annotate Microbial Genomes

AGMIAL

:: DESCRIPTION

AGMIAL is an integrated system for bacterial genome annotation. It is currently used at INRA for the newly sequenced bacterial genomes Lactobacillus bulgaricus, Lactobacillus sakei and Flavobacterium psychrophilum, as well as for the re-annotation of Lactococcus lactis, Enterococcus faecalis and Enterococcus faecium.

:: DEVELOPER

AGMIAL Team at Jouy-en-Josas Cedex

:: SCREENSHOTS

:: REQUIREMENTS

  • Windows / Linux / Mac OS X
  • Java

:: DOWNLOAD

AGMIAL Source Code

:: MORE INFORMATION

Citation:

K. Bryson, V. Loux, R. Bossy, P. Nicolas, S. Chaillou, M. van de Guchte, S. Penaud, E. Maguin, J.F. Gibrat.
AGMIAL: implementing an annotation strategy for prokaryote genomes as a distributed system.
Nucleic Acids Research. Jul 2006.

 

BRA – Binary Repeat Align

BRA

:: DESCRIPTION

BRA (Binary Repeat Align) is software for aligning tandem repeat regions in which the repeats can be treated as marked by mutations. Tandem repeat regions are abundant in many genomes. They are normally not very informative, except for fingerprinting techniques based on length statistics. In regions where the repeats are marked, that is, where there are slight variations of the basic repeat, remarkable patterns can occur. These patterns can be used for evolutionary analysis.

:: DEVELOPER

Lars Arvestad

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

BRA

:: MORE INFORMATION

Citation:

Savolainen P, Arvestad L, Lundeberg J (2000a) mtDNA tandem repeats in domestic dogs and wolves: mutation mechanism studied by analysis of the sequence of imperfect repeats. Mol Biol Evol 17(4), 474-478
Savolainen P, Arvestad L, Lundeberg J (2000b) A novel method for forensic DNA investigations: repeat-type sequence analysis of tandemly repeated mtDNA in domestic dogs. J Forensic Sci 45(5), 990-999

avdist 1.0 – Analyze Haplotype Differences

avdist 1.0

:: DESCRIPTION

avdist is a simple tool for bootstrap analysis of haplotype differences. It computes the Hamming distance between pairs of sequences sampled from the input sequences and, after a given number of iterations, reports the average difference and the standard deviation of the results. Indels are discarded from the distance calculation.
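The procedure described above can be sketched in a few lines of Python. This is not the avdist Perl code: the function names are hypothetical, the sequences are assumed to be pre-aligned with "-" marking indels, and pairs are drawn without replacement within each iteration, which may differ from how avdist samples.

```python
import random
import statistics

def hamming_no_indels(a, b, gap="-"):
    """Count mismatches between two aligned sequences, skipping gapped columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for x, y in zip(a, b) if x != y and gap not in (x, y))

def bootstrap_distance(seqs, iterations=1000, seed=None):
    """Mean and standard deviation of the Hamming distance over random pairs."""
    rng = random.Random(seed)
    dists = []
    for _ in range(iterations):
        a, b = rng.sample(seqs, 2)  # two distinct input sequences
        dists.append(hamming_no_indels(a, b))
    return statistics.mean(dists), statistics.stdev(dists)

# Made-up aligned haplotypes, for illustration only
haplotypes = ["ACGTAC", "ACGTTC", "AC-TTC", "ACCTTG"]
mean, sd = bootstrap_distance(haplotypes, iterations=500, seed=42)
print(f"average difference {mean:.2f}, standard deviation {sd:.2f}")
```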

:: DEVELOPER

Lars Arvestad

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux / Windows / Mac OS X
  • Perl

:: DOWNLOAD

avdist

:: MORE INFORMATION

avdist is distributed under the GNU General Public License

MRSfinder 1.0 – Find MAR Recognition Signature in DNA Sequence

MRSfinder 1.0

:: DESCRIPTION

MRSfinder is a program to find the Matrix Attachment Region (MAR) Recognition Signature (MRS) in DNA sequence. The MRS, defined by van Drunen et al. (Nucleic Acids Research 1999, 27:2924-30), has been proposed as a motif characteristic of MARs. The signature is composed of two motifs (AATAAYAA and AWWRTAANNWWGNNNC (one mismatch allowed)) which lie within 200 bp of each other, on either strand of the DNA duplex. MRSfinder.pl searches a user-defined FASTA sequence file for all instances of the MRS and reports their positions.
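A search of this kind can be sketched in Python as follows. This is an illustration, not the MRSfinder.pl implementation: the function names are hypothetical, the one allowed mismatch is applied to the longer motif only, and the 200 bp limit is measured between motif start positions. All three choices are assumptions about details the description leaves open.

```python
# IUPAC codes used by the two motifs
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "W": "AT", "N": "ACGT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_motif(seq, motif, max_mismatch=0):
    """Start positions where motif (IUPAC codes) matches seq within max_mismatch."""
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(base not in IUPAC[code] for base, code in zip(window, motif))
        if mismatches <= max_mismatch:
            hits.append(i)
    return hits

def find_both_strands(seq, motif, max_mismatch=0):
    """Forward-strand start coordinates of motif hits on either strand."""
    forward = find_motif(seq, motif, max_mismatch)
    reverse = [len(seq) - len(motif) - i
               for i in find_motif(revcomp(seq), motif, max_mismatch)]
    return sorted(set(forward + reverse))

def find_mrs(seq, max_distance=200):
    """Pairs (short_start, long_start) of the two motifs lying close together."""
    seq = seq.upper()
    short_hits = find_both_strands(seq, "AATAAYAA")
    long_hits = find_both_strands(seq, "AWWRTAANNWWGNNNC", max_mismatch=1)
    return [(s, l) for s in short_hits for l in long_hits
            if abs(s - l) <= max_distance]

# Constructed test sequence containing one copy of each motif
example = "GG" + "AATAACAA" + "CCCC" + "ATTATAAGGATGCCCC" + "GG"
print(find_mrs(example))
```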

:: DEVELOPER

The Blaxter Lab at The Institute of Evolutionary Biology, University of Edinburgh

:: SCREENSHOTS

N/A

:: REQUIREMENTS

:: DOWNLOAD

MRSfinder

:: MORE INFORMATION

annot8r 1.1.1 – BLAST based GO-EC-KEGG Annotation

annot8r 1.1.1

:: DESCRIPTION

annot8r is a tool for the annotation of protein or nucleotide sequences from non-model organisms with GO terms, EC numbers and KEGG pathways. The annotation is based on BLAST similarity searches against annotated subsets of EMBL UniProt from which sequences with non-informative entries have been removed. GO, EC and KEGG annotations are saved as flat files and in a relational PostgreSQL database to allow for more sophisticated searches within the results.

:: DEVELOPER

Ralf Schmid and Mark Blaxter

:: REQUIREMENTS

:: DOWNLOAD

annot8r

:: MORE INFORMATION

wwwPartiGene 0.1 – Building a Web Interface to Partigene Databases

wwwPartiGene 0.1

:: DESCRIPTION

WebPartiGene is a tool that generates HTML, PHP and CGI scripts that together form a web-based interface to PartiGene relational databases. WebPartiGene allows specific clusters to be retrieved from the PartiGene database by entering either a cluster identifier or an EST identifier. The BLAST annotation text can also be searched and limited by BLAST score. Selected clusters are displayed in graphical format, showing the alignment of constituent sequences. The display includes links to full BLAST reports, phrap assembly quality files and public depository database files. WebPartiGene is fully compatible with multi-species PartiGene databases.

:: DEVELOPER

The Blaxter Lab at The Institute of Evolutionary Biology, University of Edinburgh

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux
  • Web server

:: DOWNLOAD

wwwPartiGene

:: MORE INFORMATION

N/A

CLOBB 2.0 – Cluster Sequences on the Basis of BLAST

CLOBB 2.0

:: DESCRIPTION

CLOBB (Cluster on the basis of BLAST similarity) takes a set of DNA sequences and clusters them into groups which putatively derive from the same gene. To run CLOBB, the user must have BLASTALL in their path. The output is a blastable FASTA file named <cluster_id>EST, where cluster_id is given by the user, which contains a list of sequences with identifiers <cluster_id>00001 to <cluster_id>99999.
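To make the grouping and the identifier scheme concrete, here is a simplified Python sketch. It is not CLOBB's algorithm: it uses plain single-linkage clustering over a list of sequence pairs that are assumed to have already passed some BLAST similarity threshold, and the function names and the "XYC" prefix are hypothetical. Only the <cluster_id> plus five-digit numbering follows the description above.

```python
def single_linkage(seq_ids, similar_pairs):
    """Group sequences connected by any chain of similar pairs (union-find)."""
    parent = {s: s for s in seq_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in similar_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for s in seq_ids:
        groups.setdefault(find(s), []).append(s)
    return list(groups.values())

def label_clusters(groups, cluster_id):
    """Name clusters <cluster_id>00001, <cluster_id>00002, ... in order."""
    if len(groups) > 99999:
        raise ValueError("identifier scheme allows at most 99999 clusters")
    return {f"{cluster_id}{n:05d}": members
            for n, members in enumerate(groups, start=1)}

# Hypothetical ESTs and similarity pairs, for illustration only
ests = ["est1", "est2", "est3", "est4"]
pairs = [("est1", "est2"), ("est2", "est3")]
clusters = label_clusters(single_linkage(ests, pairs), "XYC")
for name, members in clusters.items():
    print(name, members)
```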

:: DEVELOPER

John Parkinson (john.parkinson@ed.ac.uk) and Mark Blaxter, Institute of Cell, Animal and Population Biology, University of Edinburgh

:: SCREENSHOTS

N/A

:: REQUIREMENTS

:: DOWNLOAD

CLOBB

:: MORE INFORMATION

Citation:

John Parkinson, David B Guiliano and Mark Blaxter
Making sense of EST sequences by CLOBBing them
BMC Bioinformatics 2002, 3:31

trace2seq 1.0.1 – Process Capillary Sequencing Traces

trace2seq 1.0.1

:: DESCRIPTION

trace2seq processes raw sequencing chromatogram trace files into quality-checked sequences. trace2seq is a Perl script, designed to be run from the command line.

:: DEVELOPER

The Blaxter Lab at The Institute of Evolutionary Biology, University of Edinburgh

:: SCREENSHOTS

N/A

:: REQUIREMENTS

:: DOWNLOAD

trace2seq

:: MORE INFORMATION