Scenario-based insider threat detection from cyber activities

An insider threat scenario refers to the outcome of a set of malicious activities caused by intentional or unintentional misuse of the organization's systems, networks, data, and resources. Prevention of insider threat is difficult, since trusted partners of the organization are involved in it,...

Bibliographic Details
Main Authors:	Chattopadhyay, Pratik; Wang, Lipo; Tan, Yap-Peng
Other Authors:	School of Electrical and Electronic Engineering
Format:	Journal Article
Language:	English
Published:	2020
Subjects:	Engineering::Electrical and electronic engineering; Cost-sensitive Learning; Imbalanced Data
Online Access:	https://hdl.handle.net/10356/140631

_version_	1826128908946243584
author	Chattopadhyay, Pratik; Wang, Lipo; Tan, Yap-Peng
author2	School of Electrical and Electronic Engineering
author_sort	Chattopadhyay, Pratik
collection	NTU
description	An insider threat scenario refers to the outcome of a set of malicious activities caused by intentional or unintentional misuse of the organization's systems, networks, data, and resources. Prevention of insider threat is difficult because it involves trusted partners of the organization who have authorized access to these confidential/sensitive resources. The state-of-the-art research on insider threat detection mostly focuses on developing unsupervised behavioral anomaly detection techniques with the objective of finding anomalies or abnormal changes in user behavior over time. However, an anomalous activity is not necessarily a malicious one that can lead to an insider threat scenario. As an improvement to the existing approaches, we propose a technique for insider threat detection from time-series classification of user activities. Initially, a set of single-day features is computed from the user activity logs. A time-series feature vector is next constructed from the statistics of each single-day feature over a period of time. The label of each time-series feature vector (whether malicious or nonmalicious) is extracted from the ground truth. To classify the imbalanced ground-truth insider threat data consisting of only a small number of malicious instances, we employ a cost-sensitive data adjustment technique that undersamples the nonmalicious class instances randomly. As a classifier, we employ a two-layered deep autoencoder neural network and compare its performance with other popularly used classifiers: random forest and multilayer perceptron. Encouraging results are obtained by evaluating our approach using the CMU Insider Threat Data, which is the only publicly available insider threat data set consisting of about 14 GB of web-browsing logs, along with logon, device connection, file transfer, and e-mail log files. We observe that both deep autoencoder and random forest classifiers classify the data-adjusted time-series feature set with high precision, recall, and F-score. Although multilayer perceptron has a high recall, it suffers from a lower precision and F-score compared to the other two classifiers.
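The data-preparation steps named in the description (statistics of single-day features over a period of time, random undersampling of the nonmalicious class, and precision/recall/F-score evaluation) can be illustrated with a minimal Python sketch. This record does not give the paper's feature set, window length, choice of statistics, or undersampling ratio, so the window size, the four statistics, the ratio parameter, and every function name below are assumptions made for illustration only. The classifiers themselves are omitted.

```python
import random
from statistics import mean, pstdev


def window_features(daily, window):
    """Summarize single-day features over sliding windows of days.

    daily: list of per-day feature lists for one user (one inner list per day).
    The four statistics used here (mean, std, min, max) are an assumption;
    the record only says "statistics of each single-day feature".
    """
    vectors = []
    for start in range(len(daily) - window + 1):
        chunk = daily[start:start + window]
        vec = []
        for column in zip(*chunk):  # one column per single-day feature
            vec += [mean(column), pstdev(column), min(column), max(column)]
        vectors.append(vec)
    return vectors


def undersample(X, y, ratio=1.0, seed=0):
    """Keep every malicious (1) instance and a random subset of nonmalicious (0).

    ratio is a hypothetical knob: nonmalicious kept per malicious instance.
    """
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    keep_neg = min(len(neg), int(round(ratio * len(pos))))
    rng = random.Random(seed)
    chosen = sorted(pos + rng.sample(neg, keep_neg))
    return [X[i] for i in chosen], [y[i] for i in chosen]


def precision_recall_f(y_true, y_pred):
    """Precision, recall, and F-score for the malicious class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score
```

Undersampling is applied to the training instances only in most setups, so that the reported precision reflects the original class imbalance; whether the paper does this is not stated in this record.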
first_indexed	2024-10-01T07:32:34Z
format	Journal Article
id	ntu-10356/140631
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T07:32:34Z
publishDate	2020
record_format	dspace
spelling	ntu-10356/140631 2020-06-01T02:51:37Z 2018 Journal Article Chattopadhyay, P., Wang, L., & Tan, Y.-P. (2018). Scenario-based insider threat detection from cyber activities. IEEE Transactions on Computational Social Systems, 5(3), 660-675. doi:10.1109/tcss.2018.2857473 2329-924X https://hdl.handle.net/10356/140631 10.1109/TCSS.2018.2857473 2-s2.0-85052700825 3 5 660 675 en IEEE Transactions on Computational Social Systems © 2018 IEEE. All rights reserved.
title	Scenario-based insider threat detection from cyber activities
topic	Engineering::Electrical and electronic engineering; Cost-sensitive Learning; Imbalanced Data
url	https://hdl.handle.net/10356/140631

Similar Items