KCOSS 2021 – A K-mer Frequency Statistics Software

KCOSS 2021

:: DESCRIPTION

KCOSS counts k-mers in sequence files in FASTA format and saves the resulting statistics as binary files to save space.
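As a rough illustration of why binary output saves space, the sketch below counts k-mers from FASTA-formatted lines and packs each result into a fixed 12-byte record. It is not KCOSS's actual algorithm or file format: the function names (encode_kmer, count_kmers_fasta, to_binary) and the record layout are assumptions made for this example, and the 64-bit key limits it to k ≤ 32.

```python
import struct
from collections import Counter

# 2-bit code per base; this mapping is an assumption for the sketch.
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_kmer(kmer):
    """Pack a k-mer into an integer using 2 bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | _CODE[base]
    return value


def count_kmers_fasta(lines, k):
    """Count k-mers in FASTA-formatted lines (headers start with '>')."""
    counts = Counter()
    seq = []

    def flush():
        # Count all k-mers of the record collected so far, then reset it.
        s = "".join(seq).upper()
        for i in range(len(s) - k + 1):
            kmer = s[i:i + k]
            if all(b in _CODE for b in kmer):  # skip k-mers containing N etc.
                counts[encode_kmer(kmer)] += 1
        seq.clear()

    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            flush()
        elif line:
            seq.append(line)
    flush()
    return counts


def to_binary(counts):
    """Serialize (k-mer, count) pairs as little-endian uint64 + uint32 records."""
    return b"".join(struct.pack("<QI", kmer, n)
                    for kmer, n in sorted(counts.items()))
```

For example, count_kmers_fasta([">s1", "ACGT", "ACG"], 3) treats the two sequence lines as one record, "ACGTACG", and each distinct k-mer then costs 12 bytes in the output of to_binary regardless of k.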

:: DEVELOPER

KCOSS team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

KCOSS

:: MORE INFORMATION

Citation:

Tang D, Li Y, Tan D, Fu J, Tang Y, Lin J, Zhao R, Du H, Zhao Z.
KCOSS: an ultra-fast k-mer counter for assembled genome analysis.
Bioinformatics. 2021 Nov 26:btab797. doi: 10.1093/bioinformatics/btab797. Epub ahead of print. PMID: 34849595.

GECKO – GEnetic Classification using k-mer Optimization

GECKO

:: DESCRIPTION

GECKO is a genetic algorithm to classify and extract meaningful sequences from multiple types of sequencing approaches, including mRNA, microRNA, and DNA methylome data.

:: DEVELOPER

GECKO team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux
  • Java

:: DOWNLOAD

GECKO

:: MORE INFORMATION

Citation

Thomas A, Barriere S, Broseus L, Brooke J, Lorenzi C, Villemin JP, Beurier G, Sabatier R, Reynes C, Mancheron A, Ritchie W.
GECKO is a genetic algorithm to classify and explore high throughput sequencing data.
Commun Biol. 2019 Jun 20;2:222. doi: 10.1038/s42003-019-0456-9. PMID: 31240260; PMCID: PMC6586863.

Turtle 0.3.1 – Identifying Frequent k-mers with Cache-efficient Algorithms

Turtle 0.3.1

:: DESCRIPTION

Turtle is a novel method that balances time, space and accuracy requirements to efficiently extract frequent k-mers even for high coverage libraries and large genomes such as human.

:: DEVELOPER

Schliep lab

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux
  • GCC

:: DOWNLOAD

 Turtle

:: MORE INFORMATION

Citation

Roy, Rajat S. and Bhattacharya, Debashish and Schliep, Alexander.
Turtle: Identifying frequent k-mers with cache-efficient algorithms.
Bioinformatics. 2014 Apr 2.

SKESA 2.4.0 – Strategic K-mer Extension for Scrupulous Assemblies

SKESA 2.4.0

:: DESCRIPTION

SKESA is a de novo sequence read assembler for microbial genomes. It uses conservative heuristics and is designed to create breaks at repeat regions in the genome.

:: DEVELOPER

NCBI – National Center for Biotechnology Information

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

SKESA

:: MORE INFORMATION

Citation

Souvorov A, Agarwala R.
SAUTE: sequence assembly using target enrichment.
BMC Bioinformatics. 2021 Jul 21;22(1):375. doi: 10.1186/s12859-021-04174-9. PMID: 34289805; PMCID: PMC8293564.

Souvorov A, Agarwala R, Lipman DJ.
SKESA: strategic k-mer extension for scrupulous assemblies.
Genome Biol. 2018 Oct 4;19(1):153. doi: 10.1186/s13059-018-1540-z. PMID: 30286803; PMCID: PMC6172800.

Phy-Mer V1.0 – Mitochondrial Genome Haplogroup Defining Algorithm Using a K-mer Approach

Phy-Mer V1.0

:: DESCRIPTION

Phy-Mer is a novel alignment-free and reference-independent mitochondrial haplogroup classifier.

:: DEVELOPER

MEEI Bioinformatics Center (MBC)

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux
  • Python

:: DOWNLOAD

  Phy-Mer

:: MORE INFORMATION

Citation

Phy-Mer: a novel alignment-free and reference-independent mitochondrial haplogroup classifier.
Navarro-Gomez D, Leipzig J, Shen L, Lott M, Stassen AP, Wallace DC, Wiggs JL, Falk MJ, van Oven M, Gai X.
Bioinformatics. 2014 Dec 12. pii: btu825

BFCounter 0.2 – K-mer Counting Software

BFCounter 0.2

:: DESCRIPTION

BFCounter is a program for counting k-mers in DNA sequencing data. It uses a Bloom filter to filter out k-mers that occur only once, which are likely generated by sequencing errors. Counting k-mers (substrings of length k) is an essential component of many methods in bioinformatics, including genome and transcriptome assembly, metagenomic sequencing, and error correction of sequence reads. Although simple in principle, counting k-mers in large modern sequence data sets can easily overwhelm the memory capacity of standard computers. In current data sets, a large fraction of the storage capacity, often more than 50%, may be spent on k-mers that contain sequencing errors and are typically observed only once in the data. Without some kind of error correction, these singleton k-mers are uninformative for many algorithms.
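The idea of keeping singletons out of the count table can be sketched as follows. This is a minimal illustration under stated assumptions, not BFCounter's code: the BloomFilter class, the count_non_singletons function, and the default sizes are all hypothetical. A k-mer is admitted to the candidate table only when the Bloom filter reports it as already seen, and a second pass computes exact counts so that Bloom filter false positives are discarded.

```python
import hashlib
from collections import Counter


class BloomFilter:
    """A small Bloom filter over strings (illustrative, not tuned)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive independent hash values by salting BLAKE2b with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(item.encode(), digest_size=8,
                                     salt=seed.to_bytes(8, "little")).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        """Insert item; return True if it was (probably) already present."""
        present = True
        for pos in self._positions(item):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                present = False
                self.bits[byte] |= 1 << bit
        return present


def _kmers(read, k):
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def count_non_singletons(reads, k, num_bits=1 << 20, num_hashes=3):
    """Count k-mers seen at least twice; reads must be a re-iterable list."""
    bloom = BloomFilter(num_bits, num_hashes)
    candidates = set()
    # Pass 1: a k-mer becomes a candidate only on its second sighting.
    for read in reads:
        for kmer in _kmers(read, k):
            if bloom.add(kmer):
                candidates.add(kmer)
    # Pass 2: exact counts for candidates only.
    counts = Counter()
    for read in reads:
        for kmer in _kmers(read, k):
            if kmer in candidates:
                counts[kmer] += 1
    # Drop any singleton admitted through a Bloom filter false positive.
    return {kmer: n for kmer, n in counts.items() if n >= 2}
```

In this sketch the hash table holds only k-mers that appear at least twice (plus rare false positives), so singletons cost a few bits each in the filter instead of a full table entry.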

:: DEVELOPER

Pritchard Lab

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

 BFCounter

:: MORE INFORMATION

Citation

Melsted, P. and Pritchard, J.K.:
Efficient counting of k-mers in DNA sequences using a bloom filter
BMC Bioinformatics 2011 12:333.

PLEK 1.2 – Predictor of Long Non-coding RNAs and mRNAs based on K-mer Scheme

PLEK 1.2

:: DESCRIPTION

PLEK uses an improved computational pipeline based on a k-mer scheme and a support vector machine (SVM) to distinguish long non-coding RNAs (lncRNAs) from messenger RNAs (mRNAs).
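To give a sense of what a k-mer based feature vector might look like before it is passed to a classifier such as an SVM, here is a minimal sketch. The function name kmer_features, the default max_k, and the plain per-k normalization are assumptions for illustration; PLEK's own k range and weighting are described in the paper cited below and are not reproduced here.

```python
from itertools import product


def kmer_features(seq, max_k=3):
    """Return normalized k-mer frequencies for k = 1..max_k as one flat list."""
    seq = seq.upper().replace("U", "T")  # treat RNA input as DNA letters
    features = []
    for k in range(1, max_k + 1):
        # Fixed lexicographic order so every sequence maps to the same layout.
        kmers = ["".join(p) for p in product("ACGT", repeat=k)]
        counts = dict.fromkeys(kmers, 0)
        for i in range(max(len(seq) - k + 1, 0)):
            kmer = seq[i:i + k]
            if kmer in counts:  # ignore windows with ambiguous bases
                counts[kmer] += 1
        total = sum(counts.values())
        features.extend(counts[m] / total if total else 0.0 for m in kmers)
    return features
```

With max_k=3 every transcript maps to a vector of 4 + 16 + 64 = 84 numbers, independent of its length, which is the property that makes such vectors usable as classifier input.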

:: DEVELOPER

PLEK team

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux
  • C/C++ compiler
  • Python

:: DOWNLOAD

 PLEK

:: MORE INFORMATION

Citation:

BMC Bioinformatics. 2014 Sep 19;15:311. doi: 10.1186/1471-2105-15-311.
PLEK: a tool for predicting long non-coding RNAs and messenger RNAs based on an improved k-mer scheme.
Li A, Zhang J, Zhou Z.

KMC 3.0 – K-mer Counter

KMC 3.0

:: DESCRIPTION

KMC is a utility designed for counting k-mers (sequences of k consecutive symbols) in a set of reads from genome sequencing projects.
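The definition above can be made concrete with a small in-memory counter. This is only a sketch of the task, not of KMC's disk-based implementation; the function names are hypothetical, and merging each k-mer with its reverse complement (canonical counting) is a common convention assumed here for the example.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_BASES = set("ACGT")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers over all reads, skipping windows with non-ACGT symbols."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= _BASES:
                counts[canonical(kmer)] += 1
    return counts
```

For the single read "ACGTT" and k = 3, the windows ACG and CGT are reverse complements of each other and are counted together under ACG, while GTT is counted under AAC.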

:: DEVELOPER

REFRESH Bioinformatics Group

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux / Windows / Mac OS X

:: DOWNLOAD

 KMC

:: MORE INFORMATION

Citation

S. Deorowicz, M. Kokot, Sz. Grabowski, A. Debudaj-Grabysz,
KMC 2: Fast and resource-frugal k-mer counting.
Bioinformatics. 2015 Jan 20. pii: btv022.

S. Deorowicz, A. Debudaj-Grabysz, Sz. Grabowski,
Disk-based k-mer counting on a PC.
BMC Bioinformatics 2013, 14:160. doi:10.1186/1471-2105-14-160

DSK 2.3.3 – K-mer Counting software

DSK 2.3.3

:: DESCRIPTION

DSK is k-mer counting software similar to Jellyfish. Jellyfish is very fast but limited to large-memory servers and k ≤ 32. In contrast, DSK supports large values of k and runs with (almost) arbitrarily low memory usage and reasonably low temporary disk usage. DSK can count k-mers of large Illumina datasets on laptops and desktop computers.
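One way to trade memory for temporary disk space is to split the k-mers into partitions on disk and count one partition at a time. The sketch below illustrates that general idea only; it is not DSK's implementation, and the function name, the CRC32 hash, the text file format, and the default number of partitions are assumptions for the example.

```python
import os
import tempfile
import zlib
from collections import Counter


def count_kmers_partitioned(reads, k, num_partitions=4):
    """Count k-mers by spilling them to per-partition files, then counting each file."""
    result = {}
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"part{i}.txt") for i in range(num_partitions)]
        files = [open(path, "w") for path in paths]
        try:
            # Phase 1: route every k-mer to a partition chosen by its hash,
            # so identical k-mers always land in the same file.
            for read in reads:
                for i in range(len(read) - k + 1):
                    kmer = read[i:i + k]
                    part = zlib.crc32(kmer.encode()) % num_partitions
                    files[part].write(kmer + "\n")
        finally:
            for f in files:
                f.close()
        # Phase 2: count one partition at a time; only that partition's
        # table needs to be in memory while it is being counted.
        for path in paths:
            with open(path) as f:
                counts = Counter(line.strip() for line in f)
            # A real tool would stream counts to an output file here;
            # they are merged into one dict only to keep the sketch short.
            result.update(counts)
    return result
```

Increasing num_partitions shrinks the per-partition table at the cost of more temporary files, which is the kind of memory versus disk trade-off the description refers to.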

:: DEVELOPER

Rayan CHIKHI

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux

:: DOWNLOAD

 DSK

:: MORE INFORMATION

Citation

DSK: k-mer counting with very low memory usage.
Rizk G, Lavenier D, Chikhi R.
Bioinformatics. 2013 Mar 1;29(5):652-3. doi: 10.1093/bioinformatics/btt020. Epub 2013 Jan 16.

KmerGenie 1.7051 – K-mer size Selection for Genome Assembly

KmerGenie 1.7051

:: DESCRIPTION

KmerGenie estimates the best k-mer length for de novo genome assembly. Given a set of reads, KmerGenie first computes the k-mer abundance histogram for many values of k. Then, for each value of k, it predicts the number of distinct genomic k-mers in the dataset and returns the k-mer length that maximizes this number. Experiments show that KmerGenie’s choices lead to assemblies that are close to the best possible over all k-mer lengths.
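The two steps described above, building an abundance histogram per k and choosing the k with the most distinct genomic k-mers, can be sketched as follows. Note the substitution: in place of KmerGenie's prediction of genomic k-mers, this example uses a crude stand-in that treats any k-mer seen at least min_abundance times as genomic. The function names and that threshold are assumptions for illustration.

```python
from collections import Counter


def abundance_histogram(reads, k):
    """Map each abundance value to the number of distinct k-mers with that abundance."""
    kmer_counts = Counter(read[i:i + k]
                          for read in reads
                          for i in range(len(read) - k + 1))
    return Counter(kmer_counts.values())


def pick_k(reads, k_values, min_abundance=2):
    """Return the k with the most distinct k-mers at or above min_abundance.

    The threshold is a simplistic proxy for 'genomic' k-mers; ties keep
    the first k encountered.
    """
    best_k, best_score = None, -1
    for k in k_values:
        hist = abundance_histogram(reads, k)
        score = sum(n for abundance, n in hist.items()
                    if abundance >= min_abundance)
        if score > best_score:
            best_k, best_score = k, score
    return best_k
```

On real data a fixed threshold misjudges low-coverage regions and repeated errors, which is why a proper estimate of genomic k-mers from the histogram matters; the sketch only shows the shape of the search over k.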

:: DEVELOPER

Rayan Chikhi, Medvedev Group

:: SCREENSHOTS

N/A

:: REQUIREMENTS

  • Linux / Windows / Mac OS X
  • R package
  • Python

:: DOWNLOAD

 KmerGenie

:: MORE INFORMATION

Citation:

Bioinformatics. 2014 Jan 1;30(1):31-7. doi: 10.1093/bioinformatics/btt310. Epub 2013 Jun 3.
Informed and automated k-mer size selection for genome assembly.
Chikhi R, Medvedev P.