Clustering topic groups of documents using K-Means algorithm: Australian Embassy Jakarta media releases 2006-2016

Introduction. The Australian Embassy in Jakarta stores a wide array of media release documents. Analyzing particular and vital patterns in the document collection is imperative, as it will yield new insights into, and knowledge of, the significant topic groups of the documents. Methodology. K-Means was...

Bibliographic Details
Main Authors:	Wishnu Hardi, Wisnu Ananta Kusuma, Sulistyo Basuki
Format:	Article
Language:	English
Published:	Universitas Gadjah Mada 2019-11-01
Series:	Berkala Ilmu Perpustakaan dan Informasi
Subjects:	text mining, document clustering, k-means algorithm, cosine similarity
Online Access:	https://jurnal.ugm.ac.id/bip/article/view/36451

_version_	1828443340768542720
author	Wishnu Hardi, Wisnu Ananta Kusuma, Sulistyo Basuki
author_facet	Wishnu Hardi, Wisnu Ananta Kusuma, Sulistyo Basuki
author_sort	Wishnu Hardi
collection	DOAJ
description	Introduction. The Australian Embassy in Jakarta stores a wide array of media release documents. Analyzing particular and vital patterns in the document collection is imperative, as it will yield new insights into, and knowledge of, the significant topic groups of the documents. Methodology. The K-Means algorithm was used as a non-hierarchical clustering method that partitions data objects into clusters. The method works by minimizing data variation within each cluster and maximizing data variation between clusters. Data Analysis. Of the documents issued between 2006 and 2016, 839 were examined to determine term frequencies and to generate clusters. Evaluation was conducted by nominating an expert to validate the clustering result. Results and discussions. The results showed that there were 57 meaningful terms grouped into 3 clusters. “People to people links”, “economic cooperation”, and “human development” were chosen to represent the topics of the Australian Embassy Jakarta media releases from 2006 to 2016. Conclusions. Text mining can be used to cluster topic groups of documents. It provides a more systematic clustering process, as the text analysis is conducted through a number of stages with specifically set parameters.
first_indexed	2024-12-10T21:27:54Z
format	Article
id	doaj.art-bdfe60eb5c9e469a82dbfc41a7bd9d71
institution	Directory of Open Access Journals
issn	1693-7740 2477-0361
language	English
last_indexed	2024-12-10T21:27:54Z
publishDate	2019-11-01
publisher	Universitas Gadjah Mada
record_format	Article
series	Berkala Ilmu Perpustakaan dan Informasi
spelling	doaj.art-bdfe60eb5c9e469a82dbfc41a7bd9d71; 2022-12-22T01:32:56Z; eng; Universitas Gadjah Mada; Berkala Ilmu Perpustakaan dan Informasi; ISSN 1693-7740, 2477-0361; 2019-11-01; vol. 15, no. 2, pp. 226-238; doi:10.22146/bip.36451; 24666; Clustering topic groups of documents using K-Means algorithm: Australian Embassy Jakarta media releases 2006-2016; Wishnu Hardi (National Library of Australia), Wisnu Ananta Kusuma (Institut Pertanian Bogor), Sulistyo Basuki (Universitas Indonesia); Introduction. The Australian Embassy in Jakarta stores a wide array of media release documents. Analyzing particular and vital patterns in the document collection is imperative, as it will yield new insights into, and knowledge of, the significant topic groups of the documents. Methodology. The K-Means algorithm was used as a non-hierarchical clustering method that partitions data objects into clusters. The method works by minimizing data variation within each cluster and maximizing data variation between clusters. Data Analysis. Of the documents issued between 2006 and 2016, 839 were examined to determine term frequencies and to generate clusters. Evaluation was conducted by nominating an expert to validate the clustering result. Results and discussions. The results showed that there were 57 meaningful terms grouped into 3 clusters. “People to people links”, “economic cooperation”, and “human development” were chosen to represent the topics of the Australian Embassy Jakarta media releases from 2006 to 2016. Conclusions. Text mining can be used to cluster topic groups of documents. It provides a more systematic clustering process, as the text analysis is conducted through a number of stages with specifically set parameters.; https://jurnal.ugm.ac.id/bip/article/view/36451; text mining, document clustering, k-means algorithm, cosine similarity
title	Clustering topic groups of documents using K-Means algorithm: Australian Embassy Jakarta media releases 2006-2016
topic	text mining, document clustering, k-means algorithm, cosine similarity
url	https://jurnal.ugm.ac.id/bip/article/view/36451

Similar Items