RabbitTClust: enabling fast clustering analysis of millions of bacteria genomes with MinHash sketches
Abstract: We present RabbitTClust, a fast and memory-efficient genome clustering tool based on sketch-based distance estimation. Our approach enables efficient processing of large-scale datasets by combining dimensionality reduction techniques with streaming and parallelization on modern multi-core p...
Main Authors: | |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2023-05-01 |
Series: | Genome Biology |
Subjects: | |
Online Access: | https://doi.org/10.1186/s13059-023-02961-6 |