Kssd: sequence dimensionality reduction by k-mer substring space sampling enables real-time large-scale datasets analysis
Abstract: Here, we develop k-mer substring space decomposition (Kssd), a sketching technique which is significantly faster and more accurate than current sketching methods. We show that it is the only method that can be used for large-scale dataset comparisons at population resolution on simulated a...
Main Authors: | |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2021-03-01 |
Series: | Genome Biology |
Subjects: | |
Online Access: | https://doi.org/10.1186/s13059-021-02303-4 |