CNIT is designed for Affymetrix GeneChip arrays and analyzes the copy number of each SNP allele. It is applicable to studies of diseases involving chromosomal abnormalities, cancer, and copy number variation, and it can provide accurate copy number (CN) estimates with a low false-positive rate.
OpWise estimates the reliability of bacterial microarray experiments by using the agreement of measurements within operons to gauge the amount of systematic bias in the data. It relies on the MicrobesOnline operon predictions.
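The idea of within-operon agreement can be illustrated with a minimal sketch. It does not reproduce OpWise's statistical model or its interface. It only correlates the log-ratios of adjacent genes in the same operon, on the assumption that co-transcribed genes should change together. The function names, the `log_ratios` dictionary, and the `operons` list format are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two gene pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant values")
    return cov / math.sqrt(var_x * var_y)


def within_operon_agreement(log_ratios, operons):
    """Correlate log-ratios of adjacent genes that share an operon.

    log_ratios: dict mapping gene id -> measured log-ratio (hypothetical format)
    operons:    list of gene-id lists, each in chromosomal order (hypothetical format)
    """
    first, second = [], []
    for genes in operons:
        # Ignore genes without a measurement on this array.
        measured = [g for g in genes if g in log_ratios]
        for a, b in zip(measured, measured[1:]):
            first.append(log_ratios[a])
            second.append(log_ratios[b])
    return pearson(first, second)
```

A correlation close to 1 suggests that the measurements are internally consistent, while a value near 0 suggests that noise dominates. Turning this agreement into an estimate of systematic bias requires a proper error model, which this sketch does not attempt.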
ArrayCluster addresses one of the significant challenges in gene expression analysis: finding unknown subtypes of diseases at the molecular level. This task can be approached by grouping the gene expression patterns of the collected samples on the basis of a large number of genes. However, commonly used clustering methods are likely to fail on such a dataset due to over-learning, because the number of samples to be grouped is much smaller than the data dimension, which equals the number of genes in the dataset. To overcome this difficulty, we developed a novel model-based clustering method, referred to as mixed factors analysis.
Mixer uses a mixture model approach to analyze ChIP-chip or ChIP-seq data and also provides utility functions for processing DNA sequence data. It includes statistical methods for both data normalization and peak detection. Peak detection and quantification rely on a mixture model that separates the distribution of background signals from that of the immunoprecipitated signals. In contrast to many existing methods, Mixer is more flexible: it imposes less restrictive assumptions and allows a relatively large proportion of peak regions. Robust performance on data sets predicted to contain numerous peaks is important for studies of transcription factors with abundant binding sites and of common chromatin features or epigenetic marks.
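The general idea of separating background from enriched signal with a mixture model can be sketched as follows. This is not Mixer's actual model or API. It is a minimal illustration that assumes both components are Gaussian and fits them with a plain EM loop. The function names and the quartile-based initialization are assumptions made for the example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, n_iter=200):
    """Fit a 1-D mixture of two Gaussians by EM.

    Component 0 plays the role of background, component 1 of enriched signal.
    """
    n = len(values)
    ordered = sorted(values)
    # Assumed initialization: lower and upper quartiles as starting means.
    mean = [ordered[n // 4], ordered[(3 * n) // 4]]
    centre = sum(values) / n
    spread = math.sqrt(sum((v - centre) ** 2 for v in values) / n) or 1.0
    sd = [spread, spread]
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each value belongs to the enriched component.
        resp = []
        for v in values:
            p0 = weight[0] * normal_pdf(v, mean[0], sd[0])
            p1 = weight[1] * normal_pdf(v, mean[1], sd[1])
            resp.append(p1 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        n1 = sum(resp)
        n0 = n - n1
        if min(n0, n1) < 1e-9:
            break  # one component has vanished; keep the last valid estimate
        # M-step: re-estimate means, standard deviations, and mixing weights.
        mean = [
            sum((1 - r) * v for r, v in zip(resp, values)) / n0,
            sum(r * v for r, v in zip(resp, values)) / n1,
        ]
        sd = [
            max(math.sqrt(sum((1 - r) * (v - mean[0]) ** 2
                              for r, v in zip(resp, values)) / n0), 1e-6),
            max(math.sqrt(sum(r * (v - mean[1]) ** 2
                              for r, v in zip(resp, values)) / n1), 1e-6),
        ]
        weight = [n0 / n, n1 / n]
    return {"weight": weight, "mean": mean, "sd": sd}


def posterior_enriched(x, fit):
    """Posterior probability that signal x comes from the enriched component."""
    p0 = fit["weight"][0] * normal_pdf(x, fit["mean"][0], fit["sd"][0])
    p1 = fit["weight"][1] * normal_pdf(x, fit["mean"][1], fit["sd"][1])
    return p1 / (p0 + p1) if p0 + p1 > 0 else 0.5
```

Because the mixing weight of the enriched component is estimated from the data rather than fixed at a small value, a sketch of this kind does not presuppose that peaks are rare. Regions could then be called as peaks when `posterior_enriched` exceeds a chosen cutoff.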
TightClust applies K-means clustering as an intermediate clustering engine. Early truncation of a hierarchical clustering tree is used to overcome the local minimum problem in K-means clustering. The tightest and most stable clusters are identified in a sequential manner through an analysis of the tendency of genes to be grouped together under repeated resampling.
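The resampling idea can be sketched in a few lines. This is a simplified illustration, not TightClust's implementation. It replaces the hierarchical-tree initialization with a deterministic farthest-point initialization, and it extracts stable groups in a single greedy pass rather than sequentially with a decreasing number of clusters. All function names, parameters, and default values are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def kmeans(points, k, n_iter=50):
    """Plain K-means with farthest-point initialization (an assumed substitute)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(squared_distance(p, c) for c in centroids)))
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def comembership(points, k, n_resamples=20, fraction=0.7, seed=0):
    """Fraction of resamples in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        sample = rng.sample(points, max(k, int(fraction * n)))
        centroids = kmeans(sample, k)
        # Classify every point, including those left out of the subsample.
        labels = [nearest(p, centroids) for p in points]
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1
    return [[count / n_resamples for count in row] for row in together]


def stable_groups(points, k, threshold=0.95, min_size=2, **kwargs):
    """Greedily collect groups of points that almost always co-cluster."""
    freq = comembership(points, k, **kwargs)
    assigned, groups = set(), []
    for i in range(len(points)):
        if i in assigned:
            continue
        group = [j for j in range(len(points))
                 if j not in assigned and freq[i][j] >= threshold]
        if len(group) >= min_size:
            groups.append(group)
            assigned.update(group)
    return groups
```

Points that do not co-cluster consistently with at least `min_size - 1` other points are left unassigned, which mirrors the idea that only tight, stable clusters are reported.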