Genomizer is a platform-independent Java program for the analysis of genome-wide association experiments. The software implements the workflow of an association experiment, including data management, single-point and haplotype analysis, “lead” definition, and data visualization.
Screen & Clean is a program that identifies associations between SNP allele count data and a continuous or binary phenotype. The core function is a screen that identifies the first K SNPs to enter an L1-penalized regression of the phenotype on the allele counts, where K is chosen by a stability criterion. The program includes several optional procedures that are turned off by default, including a pre-screen using marginal regression p-values, a second screen for pairwise interaction effects, and a multivariate regression clean of the screened SNPs. K may also be chosen directly by the user.
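The core screen described above can be illustrated with a minimal sketch. It is not the Screen & Clean implementation: it assumes a continuous phenotype, polymorphic SNPs, a user-supplied K, and a simple coordinate-descent lasso over a shrinking penalty grid. The function names (`lasso_entry_order`, `standardize`, `soft_threshold`), the `shrink` and `sweeps` parameters, and the simulated data are all hypothetical.

```python
import random

def standardize(col):
    """Center a column and scale it to unit variance (assumes a polymorphic SNP)."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]

def soft_threshold(z, lam):
    """Lasso shrinkage operator."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_entry_order(columns, y, k, shrink=0.9, sweeps=50):
    """Return the indices of the first k SNPs to receive a nonzero lasso coefficient.

    columns: one list of allele counts per SNP; y: continuous phenotype.
    The penalty starts at the value where all coefficients are zero and is
    lowered geometrically until k SNPs have entered the model.
    """
    n, p = len(y), len(columns)
    x = [standardize(c) for c in columns]
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]          # residual with all coefficients at zero
    beta = [0.0] * p
    lam = max(abs(sum(a * r for a, r in zip(col, resid))) / n for col in x)
    order = []
    while len(order) < k and lam > 1e-8:
        lam *= shrink
        for _ in range(sweeps):              # warm-started coordinate descent
            for j in range(p):
                # covariance of SNP j with the residual that excludes SNP j
                rho = sum(a * r for a, r in zip(x[j], resid)) / n + beta[j]
                new = soft_threshold(rho, lam)
                if new != beta[j]:
                    delta = new - beta[j]
                    resid = [r - delta * a for r, a in zip(resid, x[j])]
                    beta[j] = new
        newly = [j for j in range(p) if beta[j] != 0.0 and j not in order]
        newly.sort(key=lambda j: -abs(beta[j]))
        order.extend(newly)
    return order[:k]

# Simulated example (hypothetical data): SNP 2 has the largest effect, SNP 0 a smaller one.
rng = random.Random(0)
n_samples, n_snps = 200, 5
geno = [[sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n_samples)]
        for _ in range(n_snps)]
pheno = [3.0 * geno[2][i] + 1.5 * geno[0][i] + rng.gauss(0, 0.5)
         for i in range(n_samples)]
first_two = lasso_entry_order(geno, pheno, k=2)
print(first_two)
```

The stability criterion for choosing K, the optional pre-screen, the interaction screen, and the clean step are not shown here.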
GenoWAP is a post-GWAS prioritization method that integrates genomic functional annotation and GWAS test statistics. After prioritization, real disease-associated loci become easier to identify.
IMPUTE is a program for estimating (“imputing”) unobserved genotypes in SNP association studies. The program is designed to work seamlessly with the output of the genotype calling program CHIAMO and the population genetic simulator HAPGEN, and it produces output that can be analyzed using the program SNPTEST.
GWApower is an R package for assessing the power of genome-wide association studies using commercially available genotyping chips. The package encapsulates extensive simulation results generated by the program HAPGEN.
QCTOOL is a command-line utility program for basic quality control of GWAS datasets. It supports the same file formats used by the WTCCC studies, as well as a binary file format, and is designed to work seamlessly with SNPTEST and related tools. QCTOOL computes per-sample and per-SNP summary statistics, and uses these to filter out samples and SNPs from the dataset (either by removing them from the files or by writing exclusion lists).
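The idea of computing per-SNP summary statistics and turning them into an exclusion list can be sketched as follows. This is an illustration only and does not reproduce QCTOOL's options, file formats, or defaults: the function names, the in-memory data layout, and the thresholds (`max_missing`, `min_maf`) are assumptions made for the example.

```python
def snp_summary(genotypes):
    """Summarize one SNP.

    genotypes: allele counts (0, 1 or 2) per sample, with None for a missing call.
    Returns the missing-call rate and the minor allele frequency (MAF).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return {"missing_rate": 1.0, "maf": 0.0}
    missing_rate = 1 - len(called) / len(genotypes)
    freq = sum(called) / (2 * len(called))   # frequency of the counted allele
    return {"missing_rate": missing_rate, "maf": min(freq, 1 - freq)}

def exclusion_list(snps, max_missing=0.05, min_maf=0.01):
    """Return the IDs of SNPs that fail either (hypothetical) threshold."""
    excluded = []
    for snp_id, genotypes in snps.items():
        stats = snp_summary(genotypes)
        if stats["missing_rate"] > max_missing or stats["maf"] < min_maf:
            excluded.append(snp_id)
    return excluded

# Toy dataset with made-up SNP IDs.
snps = {
    "rs1": [0, 1, 2, 1, 0, 1, 0, 0, 1, 2],        # well called, common variant
    "rs2": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],        # monomorphic
    "rs3": [0, 1, None, None, 1, 2, 0, 1, 0, 1],  # 20% missing calls
}
to_exclude = exclusion_list(snps)
print(to_exclude)
```

Writing the failing IDs to a list rather than deleting them mirrors the second filtering mode mentioned above, and keeps the original genotype files untouched.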
PLATO (PLatform for the Analysis, Translation, and Organization of large-scale data) is a system for the analysis of genome-wide association data. It will incorporate several analytical approaches as filters, so that a scientist can choose which methods to apply when selecting the important SNPs in a genome-wide association study. PLATO was designed to aid in retrieving, evaluating, formatting, and analyzing genotypic and clinical data from the latest large-scale genotyping studies, and it implements a battery of quality control procedures to assess the data.
SNPRuler finds epistatic interactions in GWASs. It first uses predictive rule learning to narrow down possible interactions among SNPs and then captures true interactions using a χ² test. According to its authors, the rule-based strategy of this non-parametric learning approach searches for interaction patterns more efficiently than existing methods. Their experiments on both simulated data and real genome-wide data indicate that the method handles large-scale SNP data well, both in speed and in detecting potential interactions that had not been identified before.
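The second stage, a χ² test on a candidate SNP pair, can be illustrated with a short sketch. This is not SNPRuler's code and omits the rule-learning stage entirely: it assumes a binary case/control phenotype, builds a contingency table of joint genotypes against status, and computes the Pearson χ² statistic. The function names and toy data are hypothetical.

```python
from collections import Counter

def pair_table(snp_a, snp_b, status):
    """Contingency table: one row per observed joint genotype, columns = control (0), case (1)."""
    counts = Counter(((a, b), s) for a, b, s in zip(snp_a, snp_b, status))
    combos = sorted({key[0] for key in counts})
    return [[counts[(combo, s)] for s in (0, 1)] for combo in combos]

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Toy example: 60 samples, two joint genotypes, unequal case/control split.
snp_a = [0] * 30 + [1] * 30
snp_b = [0] * 60
status = [0] * 10 + [1] * 20 + [0] * 20 + [1] * 10
table = pair_table(snp_a, snp_b, status)
stat, dof = chi_square(table)
print(table, round(stat, 3), dof)
```

Converting the statistic to a p-value requires the χ² distribution with `dof` degrees of freedom, which is left out to keep the sketch dependency-free.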
SNPHarvester detects SNP–SNP interactions in GWA studies. It creates multiple paths in which the visited SNP groups tend to be statistically associated with disease, and then harvests the significant SNP groups that pass the statistical tests. This greatly reduces the number of SNPs, so existing tools can then be applied directly to detect epistatic interactions.