A dockerized framework for hierarchical frequency-based document clustering on cloud computing infrastructures
Abstract: Scalable big data analysis frameworks are of paramount importance in the modern web society, which is characterized by a huge number of resources, including electronic text documents. Document clustering is an important field in text mining and is commonly used for document organization, br...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2020-01-01 |
Series: | Journal of Cloud Computing: Advances, Systems and Applications |
Subjects: | |
Online Access: | https://doi.org/10.1186/s13677-019-0150-y |