A Method for Anomaly Detection in Big Data based on Support Vector Machine

In recent years, data mining has played an essential role in computer system performance, helping to improve system functionality. One of the most critical and influential data mining techniques is anomaly detection. Anomaly detection is the process of detecting abnormal system behavior, which helps with find...


Bibliographic Details
Main Authors:	Masoud Harimi, Mohammad Javad Shayegan
Format:	Article
Language:	English
Published:	Iran Telecom Research Center 2019-09-01
Series:	International Journal of Information and Communication Technology Research
Subjects:	detection; support vector machine; big data; improvement of anomaly detection; one-class support vector machine; mutual information
Online Access:	http://ijict.itrc.ac.ir/article-1-442-en.html

author	Masoud Harimi; Mohammad Javad Shayegan
collection	DOAJ
description	In recent years, data mining has played an essential role in computer system performance, helping to improve system functionality. One of the most critical and influential data mining techniques is anomaly detection. Anomaly detection is the process of detecting abnormal system behavior, which helps with finding system problems and troubleshooting. Intrusion detection and the fraud detection services used by credit card companies are real-world examples of anomaly detection. As growing dataset volumes give rise to big data, traditional data mining approaches no longer produce sufficiently efficient results. Various platforms, frameworks, and algorithms for big data mining have been presented to address this deficiency; Hadoop and Spark, for instance, are among the most widely used frameworks in this field. Support Vector Machine (SVM) is one of the most popular approaches to anomaly detection and, owing to its distributed and parallel extensions, is widely used in big data mining. In this research, Mutual Information is used for feature selection. In addition, the kernel function of the one-class support vector machine is improved, which improves anomaly detection performance. The approach is implemented using Spark. The NSL-KDD dataset is used, and an accuracy of more than 80 percent is achieved. Compared with other similar approaches to anomaly detection, the results are improved.
first_indexed	2024-04-10T16:40:10Z
format	Article
id	doaj.art-90066862c65b4c3dbe7820aea81f37af
institution	Directory Open Access Journal
issn	2251-6107 2783-4425
language	English
last_indexed	2024-04-10T16:40:10Z
publishDate	2019-09-01
publisher	Iran Telecom Research Center
record_format	Article
series	International Journal of Information and Communication Technology Research
spelling	doaj.art-90066862c65b4c3dbe7820aea81f37af; 2023-02-08T07:57:52Z; eng; Iran Telecom Research Center; International Journal of Information and Communication Technology Research; 2251-6107; 2783-4425; 2019-09-01; 1134248; A Method for Anomaly Detection in Big Data based on Support Vector Machine; Masoud Harimi (Department of Computer Engineering, University of Science and Culture); Mohammad Javad Shayegan (Department of Computer Engineering, University of Science and Culture); http://ijict.itrc.ac.ir/article-1-442-en.html; detection; support vector machine; big data; improvement of anomaly detection; one-class support vector machine; mutual information
title	A Method for Anomaly Detection in Big Data based on Support Vector Machine
topic	detection; support vector machine; big data; improvement of anomaly detection; one-class support vector machine; mutual information
url	http://ijict.itrc.ac.ir/article-1-442-en.html


Similar Items