ICSNPathway is a web server developed to discover candidate causal SNPs and their corresponding candidate causal pathways from genome-wide association study (GWAS) data.
::DEVELOPER
Bioinformatics Lab, Institute of Psychology, Chinese Academy of Sciences
VAAST (the Variant Annotation, Analysis and Search Tool) is a probabilistic search tool for identifying damaged genes and their disease-causing variants in personal genome sequences. VAAST builds upon existing amino acid substitution (AAS) and aggregative approaches to variant prioritization, combining elements of both into a single unified likelihood framework that allows users to identify damaged genes and deleterious variants with greater accuracy and in an easy-to-use fashion. VAAST can score both coding and non-coding variants, evaluating the cumulative impact of both types of variants simultaneously. VAAST can identify rare variants causing rare genetic diseases, and it can also use both rare and common variants to identify genes responsible for common diseases. VAAST thus has a much greater scope of use than any existing methodology.
DISTILLER (Data Integration System To Identify Links in Expression Regulation) is a data integration framework that searches for transcriptional modules by combining expression data with information on the direct interaction between a regulator and its corresponding target genes. The framework builds upon advanced itemset mining approaches that have been designed to have good scalability, efficient memory use, and a small number of user parameters. It includes a condition selection or bicluster strategy in which co-expression of genes is required in only a significant subset of the complete condition set. By including this condition selection we can apply the algorithm to large expression compendia where interesting genes are not necessarily co-expressed in all measured conditions. Our approach also makes it straightforward to include any number of data sources related to transcriptional interactions such as additional microarrays, ChIP-chip or motif data.
The MCScanX toolkit implements an adjusted MCScan algorithm for the detection of synteny and collinearity, and extends the original software by incorporating 15 utility programs for display and further analysis.
MCScanX-transposed: detecting transposed gene duplications based on multiple collinearity scans
::DEVELOPER
Haibao Tang: bao at uga dot edu and Yupeng Wang: wyp1125@gmail.com, at the Plant Genome Mapping Laboratory, University of Georgia
Socrates is a highly efficient and effective method for detecting genomic rearrangements in tumours that utilises split-read data. Socrates features single nucleotide resolution, high sensitivity, and high specificity in simulated data.
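The idea that split reads give single-nucleotide resolution can be illustrated with a minimal Python sketch. This is not Socrates' actual algorithm, which the description above does not specify. It only shows how a soft-clipped alignment, described by a SAM-style leftmost position and CIGAR string, points to a candidate breakpoint at the exact base where the alignment stops. The function name `clip_breakpoints` and the `min_clip` threshold are hypothetical.

```python
import re

# SAM CIGAR operations; M, D, N, = and X consume reference bases.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def clip_breakpoints(pos, cigar, min_clip=10):
    """Return candidate breakpoints implied by soft clips in one alignment.

    pos      : 1-based leftmost reference position of the aligned part
    cigar    : SAM-style CIGAR string, e.g. "30S70M"
    min_clip : hypothetical minimum soft-clip length to count as evidence
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips carry no sequence, so ignore them when looking at read ends.
    ops = [(n, op) for n, op in ops if op != "H"]
    if not ops:
        return []

    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    found = []
    # Clip at the start: the breakpoint sits at the first aligned base.
    if ops[0][1] == "S" and ops[0][0] >= min_clip:
        found.append((pos, "left"))
    # Clip at the end: the breakpoint sits at the last aligned base.
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        found.append((pos + ref_len - 1, "right"))
    return found


if __name__ == "__main__":
    print(clip_breakpoints(1000, "30S70M"))
    print(clip_breakpoints(1000, "70M30S"))
```

In practice, a caller would collect these positions across many reads and keep only those supported by several independent clips; that clustering step is omitted here.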
pdCSM-PPI is a machine learning approach that uses a graph-based representation of small molecules to guide identification of inhibitors modulating protein-protein interactions.
V-Phaser is a tool for calling variants in genetically heterogeneous populations from ultra-deep sequence data. To increase sensitivity, V-Phaser uses information on the covariation (i.e. phasing) between observed variants; to increase specificity, it uses an expectation-maximization algorithm that iteratively recalibrates base quality scores.
V-Profiler takes a read alignment and a list of accepted variants at each location in the alignment (such as would be generated by V-Phaser) and analyzes the intra-host diversity of a genome. This can be done at the nucleotide level over the whole sequence, at the codon level for each gene specified in a list, and at the haplotype level for any delimited region (note that the region must not exceed a read length, and is preferably shorter, such as an epitope or a loop of interest).
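As a rough illustration of what a nucleotide-level diversity profile can look like, the Python sketch below computes Shannon entropy per alignment column. The description above does not say which diversity measures V-Profiler reports, so the choice of entropy, the function names, and the input format (one string of observed bases per position, assumed to be already restricted to accepted variants) are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def column_entropy(bases):
    """Shannon entropy (in bits) of the bases observed at one position.

    Characters other than A, C, G and T (for example N or gaps) are ignored.
    """
    counts = Counter(b for b in bases if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # sum of p * log2(1/p), written with counts to avoid a negative zero
    return sum((c / total) * log2(total / c) for c in counts.values())


def diversity_profile(columns):
    """Entropy for every position of an alignment, in order."""
    return [column_entropy(col) for col in columns]


if __name__ == "__main__":
    # Three hypothetical positions: invariant, two alleles, four alleles.
    print(diversity_profile(["AAAA", "AAGG", "ACGT"]))
```

An entropy of 0 marks an invariant position, while higher values mark positions where several alleles coexist in the population. The same function could be applied to codon or haplotype strings by counting whole tokens instead of single bases.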
GenoCN is software that simultaneously identifies copy number states and genotype calls. Different strategies are implemented for the study of Copy Number Variations (CNVs) and Copy Number Aberrations (CNAs). While CNVs are naturally occurring and inheritable, CNAs are acquired somatic alterations most often observed only in tumor tissues. CNVs tend to be shorter and more sparsely located in the genome than CNAs. GenoCN consists of two components, genoCNV and genoCNA, designed for CNV and CNA studies, respectively. In contrast to most existing methods, GenoCN is more flexible in that the model parameters are estimated from the data instead of being decided a priori. genoCNA also incorporates two important strategies for CNA studies. First, the effects of tissue contamination are explicitly modeled. Second, if SNP arrays are performed for both tumor and normal tissues of one individual, the genotype calls from the normal tissue are used to study CNAs in the tumor tissue.
Meerkat is designed to identify structural variations (SVs) from paired-end high-throughput sequencing data. It predicts SVs from discordant read pairs (pairs that map to the reference genome in an unexpected way).
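What "unexpected" means for a read pair can be made concrete with a small Python sketch. This is not Meerkat's implementation. It assumes a standard forward-reverse paired-end library and a roughly normal insert-size distribution; the `ReadPair` type, the `classify_pair` function, and all default values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int        # leftmost mapped position of mate 1
    strand1: str     # "+" or "-"
    chrom2: str
    pos2: int        # leftmost mapped position of mate 2
    strand2: str
    read_len: int = 100


def classify_pair(pair, mean_insert=400, sd_insert=40, n_sd=3):
    """Label a pair as concordant or name the way in which it is discordant.

    Assumes forward-reverse orientation: the left mate maps to "+", the right
    mate to "-", and the outer span is close to the library insert size.
    """
    if pair.chrom1 != pair.chrom2:
        return "inter_chromosomal"

    # Order the mates by position so that orientation checks are symmetric.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "same_strand"      # pattern often associated with inversions
    if left_strand == "-" and right_strand == "+":
        return "everted"          # pattern often associated with tandem duplications

    span = right_pos + pair.read_len - left_pos
    if span > mean_insert + n_sd * sd_insert:
        return "span_too_large"   # pattern often associated with deletions
    if span < mean_insert - n_sd * sd_insert:
        return "span_too_small"   # pattern often associated with insertions
    return "concordant"


if __name__ == "__main__":
    print(classify_pair(ReadPair("chr1", 1000, "+", "chr1", 1300, "-")))
    print(classify_pair(ReadPair("chr1", 1000, "+", "chr1", 5000, "-")))
```

A single discordant pair is weak evidence; an SV caller would typically require a cluster of pairs that agree on the same discordance type and location before reporting a variant.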