Indonesian Online News Topics Classification using Word2Vec and K-Nearest Neighbor

Bibliographic Details
Main Author: Nur Ghaniaviyanto Ramadhan
Format: Article
Language: English
Published: Ikatan Ahli Informatika Indonesia 2021-12-01
Series: Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
Online Access: http://jurnal.iaii.or.id/index.php/RESTI/article/view/3547
Description
Summary: News is information disseminated through newspapers, radio, television, the internet, and other media. According to survey results, a large number of news titles on various topics circulate on the internet, which makes it difficult for readers to find the topic they want to read. This problem can be addressed by grouping the news, that is, by classification, carried out automatically by computer. This study classifies several news topics in the Indonesian language using the K-nearest neighbor (KNN) classification model together with word2vec, a word-embedding model that converts words into vectors to facilitate classification. The study also determines the optimal value of K for KNN. The combination of word2vec and KNN achieves an accuracy of 89.2% at K=7 and outperforms the support vector machine, logistic regression, and random forest classification models.
ISSN: 2580-0760
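
The pipeline described in the summary (words converted to vectors, then KNN with a tuned K) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the two-dimensional `TOY_VECTORS` table stands in for trained word2vec embeddings, the titles and labels are invented, averaging word vectors into a title vector is one common pooling choice that the summary does not confirm, and leave-one-out accuracy is just one possible way to compare candidate K values.

```python
import math
from collections import Counter

# Hypothetical 2-D vectors standing in for trained word2vec embeddings.
# Real embeddings would be learned from a corpus and have many more dimensions.
TOY_VECTORS = {
    "gol": (0.9, 0.1), "liga": (0.8, 0.2),
    "pemain": (0.85, 0.15), "pelatih": (0.95, 0.05),
    "saham": (0.1, 0.9), "rupiah": (0.2, 0.8),
    "bank": (0.15, 0.85), "inflasi": (0.05, 0.95),
}


def title_vector(title):
    """Average the vectors of the known words in a title (assumed pooling step)."""
    vecs = [TOY_VECTORS[w] for w in title.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def knn_predict(train, query, k):
    """Majority vote among the k training vectors closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy: classify each item using all the other items."""
    hits = 0
    for i, (vec, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, vec, k) == label
    return hits / len(data)


# Invented example titles with invented topic labels.
TITLES = [
    ("gol pemain liga", "sport"),
    ("pelatih liga", "sport"),
    ("pemain pelatih", "sport"),
    ("saham bank", "economy"),
    ("rupiah inflasi", "economy"),
    ("bank rupiah", "economy"),
]
DATA = [(title_vector(text), label) for text, label in TITLES]

# Compare a few candidate K values and keep the one with the best accuracy.
best_k = max((1, 3, 5), key=lambda k: loo_accuracy(DATA, k))
prediction = knn_predict(DATA, title_vector("gol liga"), k=3)
print(best_k, prediction)
```

On a toy set this small the choice of K is dominated by class size: with only three titles per topic, a K larger than the number of same-topic neighbors lets the other topic outvote them, which is why the candidate K values are compared rather than fixed in advance.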