Sampbias is a method and tool to 1) visualize the distribution of occurrence records and species in any user-provided dataset, 2) quantify the biasing effect of geographic features related to human accessibility, such as proximity to cities, rivers or roads, and 3) create publication-quality graphs of these biasing effects in space.
fitGCP is a framework for fitting mixtures of probability distributions to genome coverage profiles. Besides commonly used distributions, fitGCP uses distributions tailored to account for common artifacts. The mixture models are iteratively fitted based on the Expectation-Maximization algorithm.
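The Expectation-Maximization idea mentioned for fitGCP can be illustrated with a minimal sketch. This is not fitGCP's code or its set of distributions: it assumes a plain two-component Poisson mixture, and the function names, starting values, and toy coverage counts are all hypothetical.

```python
import math

def poisson_pmf(k, lam):
    # Evaluate in log space so large counts do not overflow the factorial.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def fit_poisson_mixture(counts, lams, weights, n_iter=100):
    """Illustrative EM for a Poisson mixture (hypothetical helper, not fitGCP's API)."""
    lams, weights = list(lams), list(weights)
    n = len(counts)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for k in counts:
            p = [w * poisson_pmf(k, lam) for w, lam in zip(weights, lams)]
            total = sum(p)
            resp.append([x / total for x in p])
        # M-step: re-estimate mixing weights and Poisson means.
        for j in range(len(lams)):
            rj = sum(r[j] for r in resp)
            weights[j] = rj / n
            lams[j] = sum(r[j] * k for r, k in zip(resp, counts)) / rj
    return lams, weights

# Toy coverage profile (assumed data): a low-coverage and a high-coverage group.
coverage = [2, 3, 1, 2, 3, 2, 20, 22, 19, 21, 18, 20]
fitted_lams, fitted_weights = fit_poisson_mixture(coverage, [1.0, 10.0], [0.5, 0.5])
print(fitted_lams, fitted_weights)
```

The sketch has no guards against a component collapsing to zero weight or zero mean; a real fit would need them, along with a convergence check instead of a fixed iteration count.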
InVEx (Introns Vs Exons) is a permutation-based method for ascertaining genes with a somatic mutation distribution showing evidence of positive selection for non-silent mutations. The method was developed for use in cancer genomics studies, with particular relevance to high mutation rate cancers. Mutations are permuted on a per-patient, per-trinucleotide-context basis across the exons, introns and UTRs of a gene, generating a null model of the distribution of mutations to which the observed distribution can be compared to determine statistical significance. Significant genes are of interest, as their somatic mutation is likely to be important in the formation of the cancer being studied. The method can operate on whole exome as well as whole genome sequencing data.
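The permutation scheme described for InVEx can be sketched in a heavily simplified form. The sketch below is not the InVEx implementation. It assumes each site in a gene carries a trinucleotide context and a flag marking whether a mutation there would be non-silent, uses the count of non-silent hits as the test statistic, and samples replacement sites with replacement. All names and the data layout are hypothetical.

```python
import random
from collections import defaultdict

def permutation_p_value(sites, mutations, n_perm=10000, seed=0):
    """sites: list of (context, is_nonsilent); mutations: list of (patient, site_index).

    Hypothetical helper: each mutation is moved to a random site in the same
    gene that shares its trinucleotide context, which preserves the
    per-patient, per-context mutation counts under the null.
    """
    rng = random.Random(seed)
    by_context = defaultdict(list)
    for idx, (context, _) in enumerate(sites):
        by_context[context].append(idx)

    def score(site_indices):
        # Assumed statistic: number of mutations landing on non-silent sites.
        return sum(1 for i in site_indices if sites[i][1])

    observed = score(i for _, i in mutations)
    hits = 0
    for _ in range(n_perm):
        permuted = [rng.choice(by_context[sites[i][0]]) for _, i in mutations]
        if score(permuted) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)

# Toy gene (assumed data): 10 non-silent exonic sites, 90 intronic or UTR sites.
gene_sites = [("ACA", True)] * 10 + [("ACA", False)] * 90
clustered = [("patient%d" % p, p) for p in range(5)]        # all on non-silent sites
scattered = [("patient%d" % p, 50 + p) for p in range(5)]   # all on background sites
print(permutation_p_value(gene_sites, clustered, n_perm=2000))
print(permutation_p_value(gene_sites, scattered, n_perm=2000))
```

A small p-value for the clustered case means the observed concentration of mutations on non-silent sites is rarely matched when the same mutations are redistributed across the gene.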
GMM (Gaussian Mixture Model) detects copy number variation from the distribution of copy number ratios. It fits one mixture component to the data for each of the following copy number states: deletion, copy-neutral, one additional copy, and two additional copies, with a constraint on the difference between the mixture means. For a given individual, it then determines the probability of each copy number state and computes the expected copy number (dosage).
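The last step of the GMM description, turning state probabilities into a dosage, can be sketched as follows. The sketch assumes the mixture has already been fitted and covers only the posterior and expected-value calculation, not the constrained fitting. The mapping of states to copy numbers 1 to 4, the parameter values, and the function names are illustrative assumptions.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def copy_number_dosage(ratio, components):
    """components: list of (copy_number, weight, mean, sd); hypothetical layout."""
    # Weighted likelihood of the observed ratio under each copy number state.
    likelihoods = [w * normal_pdf(ratio, m, s) for _, w, m, s in components]
    total = sum(likelihoods)
    # Posterior probability of each state (Bayes' rule).
    posteriors = [lik / total for lik in likelihoods]
    # Dosage is the posterior-weighted average copy number.
    dosage = sum(p * c for p, (c, _, _, _) in zip(posteriors, components))
    return posteriors, dosage

# Assumed fitted parameters: means spaced 0.5 apart on the copy number ratio scale.
fitted = [
    (1, 0.05, 0.5, 0.1),  # deletion (assumed to mean one copy)
    (2, 0.85, 1.0, 0.1),  # copy-neutral
    (3, 0.08, 1.5, 0.1),  # one additional copy
    (4, 0.02, 2.0, 0.1),  # two additional copies
]
posteriors, dosage = copy_number_dosage(1.25, fitted)
print(posteriors, dosage)
```

A ratio that falls between two component means yields a fractional dosage, which carries the uncertainty forward instead of forcing a hard call.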
MM-DIST is a computer program that calculates probability distributions for the number of loci at which individuals in a population differ. For example, the graph below shows the probability that two individuals differ (mismatch) at 0 to 10 loci, for unrelated individuals (solid line) and full siblings (dashed line) in a population of bighorn sheep.
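A distribution over the number of mismatching loci can be sketched under simplifying assumptions. This is not MM-DIST's algorithm: the sketch assumes the loci are independent and that a per-locus mismatch probability for the pair of individuals is already known (in practice it would depend on allele frequencies and relatedness). The function name and the probabilities are hypothetical.

```python
def mismatch_distribution(locus_mismatch_probs):
    """Return a list whose entry k is P(exactly k loci mismatch).

    Hypothetical helper: builds the distribution one locus at a time,
    assuming loci are independent.
    """
    dist = [1.0]  # with zero loci considered, zero mismatches is certain
    for p in locus_mismatch_probs:
        new = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            new[k] += prob * (1 - p)   # this locus matches
            new[k + 1] += prob * p     # this locus mismatches
        dist = new
    return dist

# Assumed per-locus mismatch probabilities for ten loci.
unrelated = mismatch_distribution([0.8] * 10)
siblings = mismatch_distribution([0.5] * 10)
print(unrelated)
print(siblings)
```

Lower per-locus mismatch probabilities, as expected for close relatives, shift the distribution toward fewer mismatching loci.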