Investigation on effective solutions against insider attacks

A common flaw of current insider threat detection is its high demand for data storage. This report investigates how effectively dimensionality reduction techniques reduce the storage required by the machine learning methods used for insider threat detection. The dimensiona...


Bibliographic Details
Main Author: Ang, Jun Hao
Other Authors: Felicity Chan
Format: Final Year Project (FYP)
Language: English
Published: 2018
Subjects: DRNTU::Engineering
Online Access: http://hdl.handle.net/10356/74243
_version_ 1826130697938534400
author Ang, Jun Hao
author2 Felicity Chan
author_facet Felicity Chan
Ang, Jun Hao
author_sort Ang, Jun Hao
collection NTU
description A common flaw of current insider threat detection is its high demand for data storage. This report investigates how effectively dimensionality reduction techniques reduce the storage required by the machine learning methods used for insider threat detection. The dimensionality reduction techniques discussed are the feature selection methods Recursive Feature Elimination (RFE) and the Chi-Square Test, and the feature extraction methods Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). The machine learning algorithms discussed are a supervised method, K-Nearest Neighbour (KNN), and an unsupervised method, K-Means Clustering (KMC). The dataset used is a labelled phishing website dataset with 10,000 rows and 30 features. In practice, the accuracy of insider threat detection matters more than its data storage demand, but improving accuracy while also reducing storage is a bonus. Therefore, in the experiments conducted for this report, the effectiveness of a dimensionality reduction technique is evaluated by the maximum reduction in data storage it achieves, regardless of how much accuracy improves. Under this evaluation, the experimental results show that both feature selection methods, RFE and the Chi-Square Test, generally performed well on both KNN and KMC, whereas among the feature extraction methods PCA did well only on KNN and LDA did exceptionally well only on KMC. From the results, it can be concluded that the performance of feature selection methods is more stable than that of feature extraction methods, but the improvements in accuracy and data storage reduction achieved by feature extraction methods are far greater than those achieved by feature selection methods.
One recommendation for future projects is to evaluate the effectiveness of the previously mentioned dimensionality reduction techniques, together with embedded feature selection methods and other feature extraction methods, on supervised, unsupervised and reinforcement learning.
first_indexed 2024-10-01T08:00:11Z
format Final Year Project (FYP)
id ntu-10356/74243
institution Nanyang Technological University
language English
last_indexed 2024-10-01T08:00:11Z
publishDate 2018
record_format dspace
spelling ntu-10356/74243 2023-03-03T20:57:57Z Investigation on effective solutions against insider attacks Ang, Jun Hao Felicity Chan School of Computer Science and Engineering Li Fang DRNTU::Engineering Bachelor of Engineering (Computer Science) 2018-05-14T04:50:53Z 2018-05-14T04:50:53Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/74243 en Nanyang Technological University 54 p. application/pdf
title Investigation on effective solutions against insider attacks
topic DRNTU::Engineering
url http://hdl.handle.net/10356/74243