fuzzyClustering – K-partite Graph Clustering algorithm that allows for Overlapping (Fuzzy) Clusters

fuzzyClustering

:: DESCRIPTION

fuzzyClustering is a fast and efficient k-partite graph clustering algorithm that allows for overlapping (fuzzy) clusters. It is based on multiplicative update rules commonly used in non-negative matrix factorization.
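For orientation, the multiplicative update rules mentioned above can be sketched in the plain non-negative matrix factorization setting, using the standard Lee and Seung updates for the squared-error objective. This is not fuzzyClustering's k-partite algorithm or its MATLAB code. The function names, the toy matrix, and the row-normalization into fuzzy memberships are illustrative assumptions.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf_multiplicative(v, rank, iterations=200, eps=1e-9, seed=0):
    """Approximate non-negative v (n x m) as w (n x rank) times h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    approx = matmul(w, h)
    return sum((v[i][j] - approx[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def row_normalize(w):
    """Turn each row of w into membership degrees that sum to 1 (assumed convention)."""
    out = []
    for row in w:
        total = sum(row)
        out.append([x / total if total > 0 else 0.0 for x in row])
    return out

# Toy matrix: rows 0-1 and rows 3-4 form two groups, row 2 overlaps both.
v = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [1, 1, 1, 1],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
w0, h0 = nmf_multiplicative(v, rank=2, iterations=0)   # random start only
w, h = nmf_multiplicative(v, rank=2, iterations=200)   # same start, then updated
memberships = row_normalize(w)
print(frobenius_error(v, w0, h0), frobenius_error(v, w, h))
```

Because every update multiplies by a ratio of non-negative terms, the factors stay non-negative without any explicit projection step. This is the property that lets rows of `w` be read as overlapping cluster memberships.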

:: DEVELOPER

Institute of Computational Biology, German Research Center for Environmental Health (GmbH)

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Windows / Linux / Mac OS X
MATLAB

:: DOWNLOAD

fuzzyClustering

:: MORE INFORMATION

Citation

BMC Bioinformatics. 2010 Oct 20;11:522. doi: 10.1186/1471-2105-11-522.
Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs.
Hartsperger ML, Blöchl F, Stümpflen V, Theis FJ.

boostKCP – Boosting k-means Clustering for the Pearson correlation distance

boostKCP

:: DESCRIPTION

boostKCP is a simple but powerful heuristic method for accelerating k-means clustering of large-scale data in life science.
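As a point of reference, the baseline that such a heuristic accelerates can be sketched as plain Lloyd-style k-means under the Pearson correlation distance (1 - r). This is not boostKCP's acceleration heuristic or its C++ code. The function names and the toy profiles are illustrative assumptions. The sketch standardizes each profile first, since for zero-mean, unit-norm vectors the squared Euclidean distance equals 2(1 - r).

```python
import math

def standardize(x):
    """Shift to zero mean and scale to unit norm; constant profiles map to zeros."""
    mean = sum(x) / len(x)
    centered = [a - mean for a in x]
    norm = math.sqrt(sum(a * a for a in centered))
    return [a / norm for a in centered] if norm > 0 else [0.0] * len(x)

def pearson_distance(x, y):
    """Return 1 - r; a constant profile is treated as uncorrelated (distance 1)."""
    sx, sy = standardize(x), standardize(y)
    return 1.0 - sum(a * b for a, b in zip(sx, sy))

def kmeans_pearson(points, initial_centers, max_iter=100):
    points = [standardize(p) for p in points]
    centers = [standardize(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center under the Pearson distance.
        new_labels = [min(range(len(centers)),
                          key=lambda k: pearson_distance(p, centers[k]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: mean of the standardized members of each cluster.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy expression profiles: three rising and three falling patterns.
profiles = [[1, 2, 3], [2, 4, 6], [10, 20, 31],
            [3, 2, 1], [6, 4, 2], [30, 21, 10]]
labels, centers = kmeans_pearson(profiles, [profiles[0], profiles[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The assignment step costs one distance evaluation per point and center in every iteration, which is the part that becomes expensive on large-scale data.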

:: DEVELOPER

Morishita Laboratory

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux
C++

:: DOWNLOAD

boostKCP

:: MORE INFORMATION

Citation

Ichikawa K, Morishita S.
A Simple but Powerful Heuristic Method for Accelerating k-Means Clustering of Large-Scale Data in Life Science.
IEEE/ACM Trans Comput Biol Bioinform. 2014 Jul-Aug;11(4):681-92. doi: 10.1109/TCBB.2014.2306200. PMID: 26356339.

Gclust 3.5.5z3 – Genome-wide Clustering

Gclust 3.5.5z3

:: DESCRIPTION

Gclust clusters all predicted protein sequences from a selected set of genomes.

:: DEVELOPER

Sato Lab

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux / Mac OS X
Perl
C++ Compiler

:: DOWNLOAD

Gclust

:: MORE INFORMATION

Citation

Bioinformatics. 2009 Mar 1;25(5):599-605. doi: 10.1093/bioinformatics/btp047.
Gclust: trans-kingdom classification of proteins using automatic individual threshold setting.
Sato N.

DomClust – Hierarchical Clustering for Orthologous Domain Classification

DomClust

:: DESCRIPTION

DomClust is an effective tool for orthologous grouping in multiple genomes, which is a crucial first step in large-scale comparative genomics.

:: DEVELOPER

Ikuo Uchiyama (uchiyama@nibb.ac.jp)

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux
C Compiler

:: DOWNLOAD

DomClust

:: MORE INFORMATION

Citation

Nucleic Acids Res. 2006 Jan 25;34(2):647-58.
Hierarchical clustering algorithm for comprehensive orthologous-domain classification in multiple genomes.
Uchiyama I.

FreClu – Efficient Frequency-based De novo Short Read Clustering

FreClu

:: DESCRIPTION

FreClu is a tool for efficient frequency-based de novo clustering of short reads, used for error trimming in next-generation sequencing.

:: DEVELOPER

Morishita Laboratory

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux / Windows / Mac OS X
Java

:: DOWNLOAD

FreClu

:: MORE INFORMATION

Citation

Wei Qu, Shin-ichi Hashimoto and Shinichi Morishita
Efficient frequency-based de novo short read clustering for error trimming in next-generation sequencing.
Genome Res. 2009. 19:1309-1315

MSBE 1.0.5 – Analysis of Gene Expression data using a new bi-clustering method

MSBE 1.0.5

:: DESCRIPTION

MSBE is a tool for the analysis of gene expression data using a new bi-clustering method. It can find constant bi-clusters and additive bi-clusters.
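The two bicluster types named above can be illustrated with exact checks on a toy expression matrix. The sketch assumes the common definitions: a constant bicluster has the same value in every cell, and an additive bicluster has rows that differ from one another by a per-row offset. It is not MSBE's algorithm, which searches for maximum-similarity biclusters rather than testing a given block. All names and data are hypothetical.

```python
def submatrix(matrix, rows, cols):
    """Extract the block defined by the chosen gene rows and condition columns."""
    return [[matrix[i][j] for j in cols] for i in rows]

def is_constant(block, tol=1e-9):
    """True if every cell equals the first cell (within tol)."""
    ref = block[0][0]
    return all(abs(v - ref) <= tol for row in block for v in row)

def is_additive(block, tol=1e-9):
    """True if each row equals the first row plus one row-specific offset."""
    first = block[0]
    for row in block:
        shift = row[0] - first[0]
        if any(abs((v - f) - shift) > tol for v, f in zip(row, first)):
            return False
    return True

# Toy data: rows are genes, columns are conditions.
expr = [[1, 2, 4, 9],
        [3, 4, 6, 0],
        [5, 5, 5, 7],
        [0, 1, 3, 5],
        [5, 5, 5, 1]]

additive_block = submatrix(expr, rows=[0, 1, 3], cols=[0, 1, 2])
constant_block = submatrix(expr, rows=[2, 4], cols=[0, 1, 2])
print(is_additive(additive_block), is_constant(additive_block))  # True False
print(is_constant(constant_block))                               # True
```

Every constant bicluster also passes the additive check, with all offsets equal to zero.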

:: DEVELOPER

Lusheng Wang

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Windows / Linux
Java

:: DOWNLOAD

MSBE

:: MORE INFORMATION

Citation

Computing the maximum similarity bi-clusters of gene expression data.
Liu X, Wang L.
Bioinformatics. 2007 Jan 1;23(1):50-6.

pClust 1.0 – Parallel Identification of Dense Protein Clusters

pClust 1.0

:: DESCRIPTION

pClust is scalable parallel software for detecting dense subgraphs.
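To make "dense subgraph" concrete, the sketch below computes the edge density of a candidate vertex set: the fraction of possible pairs that are actually connected. It only illustrates the quantity such a search looks for. It is not pClust's parallel detection method, and the function name, the toy similarity graph, and the protein identifiers are hypothetical.

```python
from itertools import combinations

def edge_density(adjacency, nodes):
    """Fraction of node pairs in `nodes` joined by an edge (undirected graph)."""
    nodes = list(nodes)
    if len(nodes) < 2:
        return 0.0
    pairs = list(combinations(nodes, 2))
    present = sum(1 for u, v in pairs if v in adjacency.get(u, set()))
    return present / len(pairs)

# Toy similarity graph stored symmetrically: p1, p2, p3 form a clique,
# and p4 is attached to p3 only.
graph = {
    "p1": {"p2", "p3"},
    "p2": {"p1", "p3"},
    "p3": {"p1", "p2", "p4"},
    "p4": {"p3"},
}
print(edge_density(graph, ["p1", "p2", "p3"]))        # 1.0
print(edge_density(graph, ["p1", "p2", "p3", "p4"]))  # 4 of 6 pairs connected
```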

:: DEVELOPER

Ananth Kalyanaraman

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux
C Compiler

:: DOWNLOAD

pClust

:: MORE INFORMATION

Citation

C. Wu, A. Kalyanaraman.
An efficient parallel approach for identifying protein families in large-scale metagenomic data sets.
Proc. ACM/IEEE Supercomputing Conference (SC’08), Austin, TX, November 15-21. pp. 1-10. 2008

QDB 1.1 – Query Driven Biclustering

QDB 1.1

:: DESCRIPTION

QDB (Query Driven Biclustering) is a Bayesian query-driven biclustering framework for microarray data in which the prior distributions allow introducing knowledge from a set of seed genes (query) to guide the pattern search.

:: DEVELOPER

Kathleen Marchal

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux / Windows / Mac OS X
R package

:: DOWNLOAD

QDB

:: MORE INFORMATION

Citation

Dhollander, T. et al. (2007)
Query-driven module discovery in microarray data.
Bioinformatics, 23, 2573-2580.

EnsemblQDB – Ensembl Query-based Biclustering

EnsemblQDB

:: DESCRIPTION

Query-based biclustering techniques allow interrogating a gene expression compendium with a given gene or gene list. They do so by searching for genes in the compendium whose profiles are close to the average expression profile of the genes in the query list. Because the genes in a long query list often cannot be guaranteed to be mutually coexpressed, it is advisable to use each gene separately as a query. This approach, however, leaves the user with tedious post-processing of partially redundant biclustering results. The need to test multiple parameter settings for each query gene in order to detect the "most optimal bicluster size" adds to the redundancy problem. To aid with this post-processing, EnsemblQDB provides an ensemble approach to be used in combination with query-based biclustering.
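The query step described above, finding genes whose profile is close to the average profile of the query list, can be sketched as follows. This is an illustration only, not EnsemblQDB's ensemble method or its MATLAB code. Using Pearson correlation as the closeness measure is an assumption, as are the function names and the toy compendium.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def mean_profile(profiles):
    """Average expression profile across the given genes."""
    return [sum(col) / len(profiles) for col in zip(*profiles)]

def rank_by_query(compendium, query, top=None):
    """Rank non-query genes by correlation with the mean query profile."""
    centroid = mean_profile([compendium[g] for g in query])
    scored = [(gene, pearson(profile, centroid))
              for gene, profile in compendium.items() if gene not in query]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top]

# Toy compendium: gene name -> expression across four conditions.
compendium = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 3, 4, 6],
    "geneC": [4, 3, 2, 1],
    "geneD": [1, 2, 3, 5],
    "geneE": [2, 2, 3, 2],
}
ranked = rank_by_query(compendium, query=["geneA", "geneB"])
print([gene for gene, _ in ranked])  # ['geneD', 'geneE', 'geneC']
```

If the query genes are not mutually coexpressed, their mean profile may resemble none of them, which is why the description recommends querying with each gene separately.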

:: DEVELOPER

Prof. Dr. Kathleen Marchal

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux / Windows / Mac OS X
MATLAB

:: DOWNLOAD

EnsemblQDB

:: MORE INFORMATION

Citation

De Smet, R. and Marchal, K. (2011).
An ensemble biclustering approach for querying gene expression compendia with experimental lists.
Bioinformatics. 2011 Jul 15;27(14):1948-56.

MCRL v1.01 – Metagenomic Clustering by Reference Library

MCRL v1.01

:: DESCRIPTION

MCRL is a data mining tool that can be used to probe a metagenome for homologs of a pre-defined reference library.

:: DEVELOPER

MCRL team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

Linux
MATLAB

:: DOWNLOAD

MCRL

:: MORE INFORMATION

Citation

Tadmor AD, Phillips R.
MCRL: using a reference library to compress a metagenome into a nonredundant list of sequences, considering viruses as a case study.
Bioinformatics. 2021 Oct 12:btab703. doi: 10.1093/bioinformatics/btab703. Epub ahead of print. PMID: 34636854.