RabbitTClust: enabling fast clustering analysis of millions of bacteria genomes with MinHash sketches

Abstract: We present RabbitTClust, a fast and memory-efficient genome clustering tool based on sketch-based distance estimation. Our approach enables efficient processing of large-scale datasets by combining dimensionality reduction techniques with streaming and parallelization on modern multi-core p...
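
To illustrate what sketch-based distance estimation can look like, the following is a minimal Python sketch of bottom-s MinHash over k-mers. It is not RabbitTClust's implementation: the function names, the BLAKE2b hash, the default k-mer length and sketch size, and the Mash-style conversion from Jaccard estimate to distance are all assumptions chosen for illustration, and reverse-complement (canonical) k-mers are ignored for brevity.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer):
    """Deterministic 64-bit hash of a k-mer (BLAKE2b is an illustrative choice)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=21, size=1000):
    """Bottom-s MinHash sketch: the `size` smallest distinct k-mer hashes, sorted."""
    return sorted({hash64(km) for km in kmers(seq, k)})[:size]


def jaccard_estimate(sketch_a, sketch_b, size):
    """Estimate Jaccard similarity from two bottom-s sketches.

    Take the `size` smallest hashes of the union and count how many
    of them occur in both sketches.
    """
    set_a, set_b = set(sketch_a), set(sketch_b)
    merged = sorted(set_a | set_b)[:size]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


def mash_distance(jaccard, k):
    """Mash-style distance D = -(1/k) * ln(2j / (1 + j)), clamped to [0, 1].

    Assumption: this conversion is one common choice, not confirmed by the record.
    """
    if jaccard <= 0.0:
        return 1.0
    return min(1.0, max(0.0, -math.log(2.0 * jaccard / (1.0 + jaccard)) / k))
```

Each genome is reduced to a fixed-size list of hashes, so comparing two genomes costs time proportional to the sketch size rather than the genome length. For example, `mash_distance(jaccard_estimate(sketch(a), sketch(b), 1000), 21)` gives an approximate distance between sequences `a` and `b` under the assumptions above.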


Bibliographic Details
Main Authors: Xiaoming Xu, Zekun Yin, Lifeng Yan, Hao Zhang, Borui Xu, Yanjie Wei, Beifang Niu, Bertil Schmidt, Weiguo Liu
Format: Article
Language: English
Published: BMC 2023-05-01
Series: Genome Biology
Online Access: https://doi.org/10.1186/s13059-023-02961-6