Determining number of clusters using firefly algorithm with cluster merging for text clustering

Bibliographic Details
Main Authors: Mohammed, Athraa Jasim, Yusof, Yuhanis, Husni, Husniza
Other Authors: Zaman, Halimah Badioze
Format: Book Section
Published: Springer International Publishing 2015
Description
Summary: Text mining, and clustering in particular, is mostly used by search engines to increase the recall and precision of a search query. The content of online websites (text, blogs, chats, news, etc.) is dynamically updated, yet relevant information about the changes made is not available. Such a scenario requires a dynamic text clustering method that operates without prior knowledge of the data collection. In this paper, a dynamic text clustering method that utilizes the Firefly algorithm is introduced. The proposed clustering algorithm, aFAmerge, automatically groups text documents into an appropriate number of clusters based on firefly behavior and a cluster merging process. Experiments using aFAmerge were conducted on two datasets: 20Newsgroups and the Reuters news collection. Results indicate that aFAmerge generates more robust and compact clusters than those produced by Bisect K-means and the practical General Stochastic Clustering Method (pGSCM).