Scenario-based insider threat detection from cyber activities

An insider threat scenario refers to the outcome of a set of malicious activities caused by intentional or unintentional misuse of the organization's systems, networks, data, and resources. Preventing insider threats is difficult because trusted partners of the organization are involved,...


Bibliographic Details
Main Authors: Chattopadhyay, Pratik, Wang, Lipo, Tan, Yap-Peng
Other Authors: School of Electrical and Electronic Engineering
Format: Journal Article
Language: English
Published: 2020
Subjects: Engineering::Electrical and electronic engineering; Cost-sensitive Learning; Imbalanced Data
Online Access: https://hdl.handle.net/10356/140631
author Chattopadhyay, Pratik
Wang, Lipo
Tan, Yap-Peng
author2 School of Electrical and Electronic Engineering
author_sort Chattopadhyay, Pratik
collection NTU
description An insider threat scenario refers to the outcome of a set of malicious activities caused by intentional or unintentional misuse of the organization's systems, networks, data, and resources. Preventing insider threats is difficult because they involve trusted partners of the organization who have authorized access to these confidential or sensitive resources. State-of-the-art research on insider threat detection mostly focuses on developing unsupervised behavioral anomaly detection techniques that aim to identify anomalies or abnormal changes in user behavior over time. However, an anomalous activity is not necessarily a malicious one that can lead to an insider threat scenario. As an improvement on the existing approaches, we propose a technique for insider threat detection based on time-series classification of user activities. Initially, a set of single-day features is computed from the user activity logs. A time-series feature vector is then constructed from the statistics of each single-day feature over a period of time. The label of each time-series feature vector (malicious or nonmalicious) is extracted from the ground truth. To classify the imbalanced ground-truth insider threat data, which contains only a small number of malicious instances, we employ a cost-sensitive data adjustment technique that randomly undersamples the nonmalicious class instances. As a classifier, we employ a two-layered deep autoencoder neural network and compare its performance with two other widely used classifiers: random forest and multilayer perceptron. Encouraging results are obtained by evaluating our approach on the CMU Insider Threat Data, which is the only publicly available insider threat data set and consists of about 14 GB of web-browsing logs, along with logon, device connection, file transfer, and e-mail log files. We observe that both the deep autoencoder and random forest classifiers classify the data-adjusted time-series feature set with high precision, recall, and F-score. Although the multilayer perceptron has a high recall, it suffers from lower precision and F-score than the other two classifiers.
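The feature-construction and data-adjustment steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sliding-window layout, the choice of summary statistics (mean, standard deviation, minimum, maximum), the 0/1 label encoding, and the names window_features, undersample, window, and ratio are all assumptions made for illustration, since the abstract does not specify them.

```python
import random
from statistics import mean, pstdev


def window_features(daily, window):
    """Build one time-series feature vector per sliding window of days.

    `daily` is a list of per-day feature rows (one number per single-day
    feature). Each feature is summarized over the window by its mean,
    population standard deviation, minimum, and maximum (assumed statistics).
    """
    vectors = []
    for start in range(len(daily) - window + 1):
        chunk = daily[start:start + window]
        vec = []
        for column in zip(*chunk):  # one column per single-day feature
            vec.extend([mean(column), pstdev(column), min(column), max(column)])
        vectors.append(vec)
    return vectors


def undersample(vectors, labels, ratio=1.0, seed=0):
    """Randomly undersample the nonmalicious class (label 0).

    All malicious rows (label 1) are kept; at most ratio * (malicious count)
    nonmalicious rows are drawn at random without replacement.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), int(ratio * len(pos))))
    keep = sorted(pos + keep_neg)
    return [vectors[i] for i in keep], [labels[i] for i in keep]


if __name__ == "__main__":
    # Hypothetical toy data: four days, two single-day features per day.
    daily = [[1, 0], [3, 2], [5, 4], [7, 6]]
    vecs = window_features(daily, window=2)
    X, y = undersample(vecs, [1, 0, 0], ratio=1.0)
    print(len(vecs), len(X), y)
```

The balanced output of undersample would then be passed to whichever classifier is being compared. The ratio parameter is a hypothetical knob for how many nonmalicious rows to keep per malicious row.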
first_indexed 2024-10-01T07:32:34Z
format Journal Article
id ntu-10356/140631
institution Nanyang Technological University
language English
last_indexed 2024-10-01T07:32:34Z
publishDate 2020
record_format dspace
spelling ntu-10356/140631 2020-06-01T02:51:37Z Scenario-based insider threat detection from cyber activities Chattopadhyay, Pratik Wang, Lipo Tan, Yap-Peng School of Electrical and Electronic Engineering Engineering::Electrical and electronic engineering Cost-sensitive Learning Imbalanced Data 2020-06-01T02:51:37Z 2020-06-01T02:51:37Z 2018 Journal Article Chattopadhyay, P., Wang, L., & Tan, Y.-P. (2018). Scenario-based insider threat detection from cyber activities. IEEE Transactions on Computational Social Systems, 5(3), 660-675. doi:10.1109/tcss.2018.2857473 2329-924X https://hdl.handle.net/10356/140631 10.1109/TCSS.2018.2857473 2-s2.0-85052700825 3 5 660 675 en IEEE Transactions on Computational Social Systems © 2018 IEEE. All rights reserved.
title Scenario-based insider threat detection from cyber activities
topic Engineering::Electrical and electronic engineering
Cost-sensitive Learning
Imbalanced Data
url https://hdl.handle.net/10356/140631