Scenario-based insider threat detection from cyber activities
An insider threat scenario refers to the outcome of a set of malicious activities caused by intentional or unintentional misuse of the organization's systems, networks, data, and resources. Prevention of insider threat is difficult, since trusted partners of the organization are involved in it,...
Main Authors: | Chattopadhyay, Pratik; Wang, Lipo; Tan, Yap-Peng |
---|---|
Other Authors: | School of Electrical and Electronic Engineering |
Format: | Journal Article |
Language: | English |
Published: | 2020 |
Subjects: | Engineering::Electrical and electronic engineering; Cost-sensitive Learning; Imbalanced Data |
Online Access: | https://hdl.handle.net/10356/140631 |
_version_ | 1826128908946243584 |
---|---|
author | Chattopadhyay, Pratik; Wang, Lipo; Tan, Yap-Peng |
author2 | School of Electrical and Electronic Engineering |
author_facet | School of Electrical and Electronic Engineering; Chattopadhyay, Pratik; Wang, Lipo; Tan, Yap-Peng |
author_sort | Chattopadhyay, Pratik |
collection | NTU |
description | An insider threat scenario refers to the outcome of a set of malicious activities caused by intentional or unintentional misuse of the organization's systems, networks, data, and resources. Prevention of insider threat is difficult, since trusted partners of the organization are involved in it, who have authorized access to these confidential/sensitive resources. The state-of-the-art research on insider threat detection mostly focuses on developing unsupervised behavioral anomaly detection techniques with the objective of detecting anomalies or abnormal changes in user behavior over time. However, an anomalous activity is not necessarily a malicious one that can lead to an insider threat scenario. As an improvement to the existing approaches, we propose a technique for insider threat detection from time-series classification of user activities. Initially, a set of single-day features is computed from the user activity logs. A time-series feature vector is next constructed from the statistics of each single-day feature over a period of time. The label of each time-series feature vector (whether malicious or nonmalicious) is extracted from the ground truth. To classify the imbalanced ground-truth insider threat data consisting of only a small number of malicious instances, we employ a cost-sensitive data adjustment technique that undersamples the nonmalicious class instances randomly. As a classifier, we employ a two-layered deep autoencoder neural network and compare its performance with other popularly used classifiers: random forest and multilayer perceptron. Encouraging results are obtained by evaluating our approach using the CMU Insider Threat Data, which is the only publicly available insider threat data set consisting of about 14-GB web-browsing logs, along with logon, device connection, file transfer, and e-mail log files. We observe that both deep autoencoder and random forest classifiers classify the data-adjusted time-series feature set with high precision, recall, and F-score. Although multilayer perceptron has a high recall, it suffers from a lower precision and F-score compared to the other two classifiers. |
first_indexed | 2024-10-01T07:32:34Z |
format | Journal Article |
id | ntu-10356/140631 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T07:32:34Z |
publishDate | 2020 |
record_format | dspace |
spelling | ntu-10356/140631 2020-06-01T02:51:37Z Chattopadhyay, P., Wang, L., & Tan, Y.-P. (2018). Scenario-based insider threat detection from cyber activities. IEEE Transactions on Computational Social Systems, 5(3), 660-675. doi:10.1109/tcss.2018.2857473 ISSN 2329-924X https://hdl.handle.net/10356/140631 Scopus 2-s2.0-85052700825 en © 2018 IEEE. All rights reserved. |
title | Scenario-based insider threat detection from cyber activities |
title_sort | scenario based insider threat detection from cyber activities |
topic | Engineering::Electrical and electronic engineering; Cost-sensitive Learning; Imbalanced Data |
url | https://hdl.handle.net/10356/140631 |
work_keys_str_mv | AT chattopadhyaypratik scenariobasedinsiderthreatdetectionfromcyberactivities AT wanglipo scenariobasedinsiderthreatdetectionfromcyberactivities AT tanyappeng scenariobasedinsiderthreatdetectionfromcyberactivities |
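
The data-preparation steps named in the description field (single-day features, per-window statistics, random undersampling of the nonmalicious class) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the four statistics (mean, standard deviation, minimum, maximum), the window length, the `ratio` parameter, and the toy data are all assumptions, and plain random undersampling stands in for the paper's cost-sensitive data adjustment. The classifier stage is not shown.

```python
import random
from statistics import mean, pstdev


def window_vector(daily_rows):
    """Collapse a window of single-day feature rows into one vector.

    Each feature column contributes (mean, population std, min, max).
    The choice of these four statistics is an assumption for illustration.
    """
    columns = list(zip(*daily_rows))  # transpose: one tuple per feature
    vector = []
    for col in columns:
        vector.extend([mean(col), pstdev(col), min(col), max(col)])
    return vector


def sliding_windows(daily_rows, window):
    """Yield one time-series feature vector per run of `window` consecutive days."""
    for start in range(len(daily_rows) - window + 1):
        yield window_vector(daily_rows[start:start + window])


def undersample(vectors, labels, ratio=1.0, seed=0):
    """Keep every malicious (label 1) vector and a random subset of the
    nonmalicious (label 0) ones, at most `ratio` times the malicious count."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), int(ratio * len(pos))))
    keep = sorted(pos + keep_neg)
    return [vectors[i] for i in keep], [labels[i] for i in keep]


# Toy data: 6 days x 2 hypothetical features (logons, files copied to a device).
days = [[3, 0], [4, 0], [2, 1], [5, 0], [3, 7], [4, 9]]
vectors = list(sliding_windows(days, window=3))
labels = [0, 0, 1, 1]  # made-up ground truth, one label per window
balanced_x, balanced_y = undersample(vectors, labels, ratio=0.5)
```

The balanced vectors and labels would then be passed to whichever classifier is being compared; fixing `seed` keeps the undersampled subset reproducible across runs.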