TAMO (Tools for Analysis of MOtifs) is developed around a unified motif representation of a position-specific scoring matrix (PSSM). Motif objects may be assembled from IUPAC-ambiguity codes, multiple sequence alignments, averages of other motifs, and matrices of frequencies or log-likelihood values. Motifs can be printed, concatenated, indexed and sliced like text strings, or rendered as sequence logos. They can also be randomized, reverse-complemented, and recomputed using different assumptions about background base frequencies. Motifs can also store and report information about their origin, information content, and score. Finally, motifs can scan DNA sequences for instances of matching sites.
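To make the PSSM idea concrete, here is a minimal sketch in plain Python. It does not use TAMO's API: the function names `build_pssm`, `reverse_complement`, and `scan`, the pseudocount of 1, the uniform background, and the example sites are all assumptions chosen for illustration. It builds a log2-odds matrix from aligned sites, reverse-complements it, and scans a sequence for windows scoring above a threshold.

```python
import math

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def build_pssm(sites, background=None, pseudocount=1.0):
    """Build a log2-odds matrix (one dict per position) from equal-length sites."""
    background = background or {b: 0.25 for b in BASES}
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pssm = []
    for i in range(width):
        column = [site[i] for site in sites]
        pssm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pssm


def reverse_complement(pssm):
    """Reverse the column order and swap each base for its complement."""
    return [{COMPLEMENT[b]: score for b, score in column.items()}
            for column in reversed(pssm)]


def scan(pssm, sequence, threshold):
    """Return (start, score) for every window scoring at least `threshold`."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(column[ch] for column, ch in zip(pssm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical aligned sites, used only to exercise the functions.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTCA"]
pssm = build_pssm(sites)
hits = scan(pssm, "AAATGACTCAGGG", threshold=5.0)
print(hits)
```

Changing the `background` argument corresponds to the "different assumptions about background base frequencies" mentioned above: the same counts yield different log-odds scores under a GC-rich or AT-rich background.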
Oliver C, Mallet V, Philippopoulos P, Hamilton WL, Waldispühl J. VeRNAl: A Tool for Mining Fuzzy Network Motifs in RNA. Bioinformatics. 2021 Nov 15:btab768. doi: 10.1093/bioinformatics/btab768. Epub ahead of print. PMID: 34791045.
Haystack is a suite of computational tools, implemented in a Python 2.7 package called haystack_bio, for studying epigenetic variability, cross-cell-type plasticity of chromatin states, and transcription factor (TF) motifs. It provides mechanistic insights into chromatin structure, cellular identity, and gene regulation.
eCAMI is a Python package for CAZyme and enzyme EC classification and annotation. According to its description, (i) eCAMI has the best performance in terms of accuracy and memory use for these tasks, and (ii) the k-mer-based tools (including PPR-Hotpep, CUPP and eCAMI) perform better than homology-based tools and deep-learning tools in enzyme EC prediction.
Cluster-Buster is a third-generation program for finding clusters of pre-specified motifs in nucleotide sequences. Its main application is the detection of sequences that regulate gene transcription, such as enhancers and silencers, but other types of biological regulation may also be mediated by motif clusters.
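As a rough illustration of what "clusters of motifs" means, the sketch below groups motif hit positions that fall close together. This is not Cluster-Buster's own algorithm, only a naive windowed count; the function name `find_clusters`, the 200 bp window, the minimum of three hits, and the example positions are all assumptions made for the example.

```python
def find_clusters(hit_positions, window=200, min_hits=3):
    """Return (start, end) spans where at least `min_hits` motif hits
    start within `window` bp of the first hit; overlapping spans are merged."""
    positions = sorted(hit_positions)
    clusters = []
    right = 0
    for left, start in enumerate(positions):
        right = max(right, left)
        # Extend the right edge while the next hit is still inside the window.
        while right + 1 < len(positions) and positions[right + 1] - start < window:
            right += 1
        if right - left + 1 >= min_hits:
            end = positions[right]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous cluster, so extend it instead.
                clusters[-1] = (clusters[-1][0], max(end, clusters[-1][1]))
            else:
                clusters.append((start, end))
    return clusters


# Hypothetical hit start positions pooled across several motifs.
example_hits = [10, 50, 120, 900, 2000, 2050, 2100, 2180]
clusters = find_clusters(example_hits, window=200, min_hits=3)
print(clusters)
```

A count-based window like this treats every hit equally; a real tool would normally also weigh how well each site matches its motif.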
ROVER (Relative OVER-abundance of cis-elements) is a tool for determining whether one or more of a group of transcription factors (TFs) is likely to regulate a group of genes. It was designed for use with promoters from groups of genes that are suspected of being co-regulated, such as those from a microarray study. ROVER compares two groups of promoters (a suspected co-regulated group and a non-regulated group) by determining the relative over-abundance of likely binding sites for a particular TF in one group versus the other. ROVER calculates the significance of any over-abundance of binding sites for each TF and reports the probability that it occurred by chance. A low probability suggests that the TF may regulate the group of genes in question. Likely binding sites are found by looking for high-scoring matches to a position-specific scoring matrix (PSSM), which represents known binding sites for a transcription factor. In addition to determining the significance of each TF, ROVER also provides the subset of sequences likely to be regulated by each TF and the specific significant binding sites.
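The two-group comparison can be illustrated with a simple stand-in statistic. The sketch below is not ROVER's actual scoring scheme: it swaps in a one-sided Fisher exact test (a hypergeometric tail probability) on the number of promoters in each group that contain at least one likely binding site. The function name `overabundance_p_value` and the example counts are hypothetical.

```python
from math import comb


def overabundance_p_value(fg_with_site, fg_total, bg_with_site, bg_total):
    """One-sided Fisher exact test: probability of seeing at least
    `fg_with_site` site-containing promoters in the foreground group if
    site-containing promoters were spread at random across both groups."""
    n_total = fg_total + bg_total
    n_with_site = fg_with_site + bg_with_site
    upper = min(fg_total, n_with_site)
    denominator = comb(n_total, fg_total)
    # Sum the hypergeometric probabilities for counts >= the observed count.
    numerator = sum(
        comb(n_with_site, i) * comb(n_total - n_with_site, fg_total - i)
        for i in range(fg_with_site, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: 8 of 10 suspected co-regulated promoters contain a
# site, versus 5 of 40 promoters in the non-regulated comparison group.
p_value = overabundance_p_value(8, 10, 5, 40)
print(f"p = {p_value:.2e}")
```

Because one such test is run per transcription factor, the resulting p-values would normally need a multiple-testing correction before any factor is called significant.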
Clover (Cis-eLement OVERrepresentation) is a program for identifying functional sites in DNA sequences. If you give it a set of DNA sequences that share a common function, it will compare them to a library of sequence motifs (e.g. transcription factor binding patterns) and identify which, if any, of the motifs are statistically overrepresented in the sequence set.