Classification of Radical Web Content in Indonesia using Web Content Mining and k-Nearest Neighbor Algorithm

In a procedural sense, radical content is content that provokes violence and spreads hatred and anti-nationalism. The definition of radical differs from country to country; in Indonesia, radical content is most closely associated with provocation and with ethnic and religious hatred, which is called SA...

Bibliographic Details
Main Authors: Muh Subhan, Amang Sudarsono, Ali Ridho Barakbah
Format: Article
Language: English
Published: Politeknik Elektronika Negeri Surabaya 2018-01-01
Series: Emitter: International Journal of Engineering Technology
Subjects: K-NN, Nearest Neighbour, Radical Content, Indonesia
Online Access: https://emitter.pens.ac.id/index.php/emitter/article/view/214
author Muh Subhan
Amang Sudarsono
Ali Ridho Barakbah
collection DOAJ
description In a procedural sense, radical content is content that provokes violence and spreads hatred and anti-nationalism. The definition of radical differs from country to country; in Indonesia, radical content is most closely associated with provocation and with ethnic and religious hatred, known in Indonesian as SARA. SARA content is very difficult to detect because of its large volume, its unstructured form, and the amount of noise it contains, all of which can lead to multiple interpretations. This problem can threaten unity and religious harmony. A system is therefore required that can distinguish radical content from non-radical content. We propose a text mining approach that uses a DF (document frequency) threshold and Human Brain for feature extraction. The system is divided into several steps: data collection, which is part of preprocessing; text mining; feature selection; classification, which groups the data by class label; similarity calculation on the training data; and visualization of radical and non-radical content. The experimental results show that combining 10-fold cross-validation with k-Nearest Neighbor (kNN) as the classification method achieves 66.37% accuracy with k = 7 [1].
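The steps named in the description (DF-threshold feature selection, similarity calculation, kNN classification, and cross-validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer, the use of cosine similarity, the round-robin fold assignment, and every function name are hypothetical, and the Human Brain feature extraction step is not reproduced because the abstract does not describe it.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization.
    return text.lower().split()


def select_by_df(docs, min_df):
    """Keep terms that occur in at least min_df documents (DF threshold)."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return sorted(t for t, n in df.items() if n >= min_df)


def vectorize(tokens, vocab):
    """Term-count vector over the selected vocabulary."""
    counts = Counter(tokens)
    return [counts[t] for t in vocab]


def cosine(a, b):
    # Assumption: cosine similarity; the abstract does not name the measure.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def knn_predict(train_vecs, train_labels, vec, k):
    """Majority vote among the k most similar training vectors."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(pair[0], vec),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_validate(texts, labels, k, folds, min_df):
    """Accuracy over `folds` folds; document i goes to fold i % folds."""
    docs = [tokenize(t) for t in texts]
    correct = 0
    for f in range(folds):
        train_idx = [i for i in range(len(docs)) if i % folds != f]
        test_idx = [i for i in range(len(docs)) if i % folds == f]
        # Select features on the training folds only.
        vocab = select_by_df([docs[i] for i in train_idx], min_df)
        train_vecs = [vectorize(docs[i], vocab) for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        for i in test_idx:
            pred = knn_predict(train_vecs, train_labels, vectorize(docs[i], vocab), k)
            correct += pred == labels[i]
    return correct / len(docs)


if __name__ == "__main__":
    # Invented toy corpus, for illustration only.
    texts = ["alpha beta"] * 5 + ["gamma delta"] * 5
    labels = ["radical"] * 5 + ["non-radical"] * 5
    print(cross_validate(texts, labels, k=3, folds=5, min_df=1))
```

Under these assumptions, the configuration reported in the description would correspond to a call with k=7 and folds=10; the DF threshold value is not stated in the abstract, so min_df would have to be chosen by the user.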
first_indexed 2024-12-19T16:19:08Z
format Article
id doaj.art-3cc1d4fb71ac49a48bbf532613775393
institution Directory Open Access Journal
issn 2355-391X
2443-1168
language English
last_indexed 2024-12-19T16:19:08Z
publishDate 2018-01-01
publisher Politeknik Elektronika Negeri Surabaya
record_format Article
series Emitter: International Journal of Engineering Technology
spelling doaj.art-3cc1d4fb71ac49a48bbf532613775393
2022-12-21T20:14:31Z
eng
Politeknik Elektronika Negeri Surabaya
Emitter: International Journal of Engineering Technology
2355-391X
2443-1168
2018-01-01
5
2
10.24003/emitter.v5i2.214
93
Classification of Radical Web Content in Indonesia using Web Content Mining and k-Nearest Neighbor Algorithm
Muh Subhan (Electronics Engineering Polytechnic Institute of Surabaya)
Amang Sudarsono (Electronics Engineering Polytechnic Institute of Surabaya)
Ali Ridho Barakbah (Electronics Engineering Polytechnic Institute of Surabaya)
https://emitter.pens.ac.id/index.php/emitter/article/view/214
K-NN
Nearest Neighbour
Radical Content
Indonesia
title Classification of Radical Web Content in Indonesia using Web Content Mining and k-Nearest Neighbor Algorithm
topic K-NN
Nearest Neighbour
Radical Content
Indonesia
url https://emitter.pens.ac.id/index.php/emitter/article/view/214