UNWORDS is efficient software for computing the shortest strings that do not occur in a given set of DNA sequences. It is more efficient than previous algorithms and easier to use. Because it encodes strings as bit vectors, the program is called unwords-bits. It computes unwords directly, without requiring the user to specify a length estimate, and it avoids the space requirements of index structures such as suffix trees and suffix arrays.
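
To illustrate the idea of computing unwords with a bit vector and without a length estimate, here is a minimal Python sketch. It is not the UNWORDS implementation: the function names (shortest_absent_words, decode), the max_k cutoff, the forward-strand-only scan, and the treatment of non-ACGT characters as word breaks are all assumptions made for this example. Each word of length k is packed into an integer using 2 bits per base, one bit per possible word records whether it was seen, and k grows from 1 until some word is missing.

```python
from typing import Iterable, List

# 2-bit code per nucleotide (assumed encoding, chosen for this sketch).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
ALPHABET = "ACGT"


def decode(value: int, k: int) -> str:
    """Turn a 2k-bit integer back into the DNA word of length k."""
    chars = []
    for shift in range(2 * (k - 1), -1, -2):
        chars.append(ALPHABET[(value >> shift) & 3])
    return "".join(chars)


def shortest_absent_words(sequences: Iterable[str], max_k: int = 12) -> List[str]:
    """Return all words of the smallest length k that occur in no sequence.

    max_k is a hypothetical safety limit; an empty list means every word
    up to that length occurs.
    """
    seqs = [s.upper() for s in sequences]
    for k in range(1, max_k + 1):
        mask = (1 << (2 * k)) - 1            # keeps the last k bases
        seen = bytearray((4 ** k + 7) // 8)  # one bit per possible word
        for seq in seqs:
            value = 0   # rolling 2-bit encoding of the current window
            valid = 0   # number of consecutive ACGT characters read
            for ch in seq:
                code = CODE.get(ch)
                if code is None:
                    # Assumption: any other character (e.g. N) breaks words.
                    value = 0
                    valid = 0
                    continue
                value = ((value << 2) | code) & mask
                valid += 1
                if valid >= k:
                    seen[value >> 3] |= 1 << (value & 7)
        absent = [
            decode(v, k)
            for v in range(4 ** k)
            if not seen[v >> 3] & (1 << (v & 7))
        ]
        if absent:
            return absent
    return []
```

For example, shortest_absent_words(["ACG"]) reports the single word T, because T is the only word of length 1 that does not occur. The bit vector for length k needs 4^k bits, which is why this approach needs no suffix tree or suffix array as long as the unword length stays small.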
:: MORE INFORMATION
Herold, J., Kurtz, S. and Giegerich, R.
Efficient Computation of Absent Words in Genomic Sequences,
BMC Bioinformatics, 2008, 9:167.