DTITD: An Intelligent Insider Threat Detection Framework Based on Digital Twin and Self-Attention Based Deep Learning Models


Bibliographic Details
Main Authors: Zhi Qiang Wang, Abdulmotaleb El Saddik
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Access
Subjects: Digital twin; cybersecurity; insider threat; deep learning; transformer; BERT
Online Access: https://ieeexplore.ieee.org/document/10285086/
collection DOAJ
description Recent statistics and studies show that the losses caused by insider threats are much higher than those caused by external attacks. More and more organizations are investing in or purchasing insider threat detection systems to prevent insider risks. However, accurate and timely detection of insider threats remains a significant challenge. In this study, we propose an intelligent insider threat detection framework based on Digital Twin (DT) technology and self-attention-based deep learning models. This paper first introduces insider threats and the challenges in detecting them, and then reviews recent related work on insider threat detection and its limitations. Next, we propose our solutions to these challenges: building an innovative intelligent insider threat detection framework based on DT and self-attention-based deep learning models; performing insight analysis of user behaviors and entities; adopting contextual word embedding with the Bidirectional Encoder Representations from Transformers (BERT) model and sentence embedding with the Generative Pre-trained Transformer 2 (GPT-2) model to perform data augmentation and overcome significant data imbalance; and adopting a temporal semantic representation of user behaviors to build user behavior time sequences. Subsequently, this study builds self-attention-based deep learning models to detect insider threats quickly. We propose a simplified transformer model named DistilledTrans and apply the original transformer model, DistilledTrans, BERT + final layer, Robustly Optimized BERT Approach (RoBERTa) + final layer, and hybrid methods combining a pre-trained model (BERT, RoBERTa) with a Convolutional Neural Network (CNN) or Long Short-Term Memory (LSTM) network to detect insider threats.
Finally, this paper presents experimental results on the dense dataset CERT r4.2 and the augmented sporadic dataset CERT r6.2, evaluates model performance, and performs a comparative analysis against state-of-the-art models. Promising experimental results show that: 1) contextual word insertion and substitution predicted by the BERT model, and context-embedding sentences predicted by the GPT-2 model, are effective data augmentation approaches for addressing high data imbalance; 2) DistilledTrans, trained on the sporadic dataset CERT r6.2 augmented with GPT-2-predicted context-embedding sentences, outperforms state-of-the-art models on all evaluation metrics, including accuracy, precision, recall, F1-score, and Area Under the ROC Curve (AUC); additionally, its structure is much simpler, so its training time and computing cost are much lower than those of recent models; 3) when trained on the dense dataset CERT r4.2, the pre-trained models BERT plus a final layer or RoBERTa plus a final layer achieve significantly higher performance than current models with very little sacrifice of precision, so complex hybrid methods may not be required.
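The contextual word-substitution augmentation described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `toy_fill` is a hypothetical stand-in for a real BERT fill-mask model, and the event tokens are invented examples of CERT-style user actions.

```python
import random

def augment_by_substitution(tokens, predict_fill, n_copies=2, seed=0):
    """Create augmented copies of a user-behavior token sequence by masking
    one token at a time and substituting a contextually predicted
    replacement (the word-substitution augmentation idea)."""
    rng = random.Random(seed)
    augmented = []
    for _ in range(n_copies):
        copy = list(tokens)
        pos = rng.randrange(len(copy))
        # Ask the (pluggable) masked-language model for a replacement in context.
        copy[pos] = predict_fill(copy, pos)
        augmented.append(copy)
    return augmented

# Hypothetical stand-in for a BERT fill-mask model: returns a synonym
# from a small lookup table, or the original token when none is known.
SYNONYMS = {"logon": "login", "email": "mail", "usb": "device"}

def toy_fill(tokens, pos):
    return SYNONYMS.get(tokens[pos], tokens[pos])

seq = ["logon", "email", "http", "usb", "logoff"]
for variant in augment_by_substitution(seq, toy_fill):
    print(" ".join(variant))
```

In practice the stand-in would be replaced by an actual masked-language model, so that each substituted word is predicted from the surrounding behavior context rather than from a fixed table.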
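The evaluation metrics named above can be computed from scratch; a minimal sketch, assuming binary labels with the threat class encoded as 1 and using the rank-based (Mann-Whitney) formulation of AUC:

```python
def precision_recall_f1(y_true, y_pred):
    """Binary classification metrics with the threat class as positive (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def auc(y_true, scores):
    """AUC via the rank formulation: the probability that a random positive
    example is scored higher than a random negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, labels `[1, 0, 1, 0, 1]` with predictions `[1, 1, 1, 0, 0]` give precision, recall, and F1 of 2/3 each; with scores `[0.9, 0.8, 0.6, 0.2, 0.4]` the AUC is 4/6.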
first_indexed 2024-03-11T17:18:02Z
id doaj.art-dc8a34afcbb141f4ab3cd95e47a0dea4
institution Directory Open Access Journal
issn 2169-3536
last_indexed 2024-03-11T17:18:02Z
doi 10.1109/ACCESS.2023.3324371
published_in IEEE Access, vol. 11, pp. 114013-114030, 2023 (IEEE document 10285086)
author_orcid Zhi Qiang Wang: https://orcid.org/0009-0007-6547-6076
author_orcid Abdulmotaleb El Saddik: https://orcid.org/0000-0002-7690-8547
affiliation Multimedia Communications Research Laboratory (MCRLab), School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada (both authors)
topic Digital twin
cybersecurity
insider threat
deep learning
transformer
BERT
url https://ieeexplore.ieee.org/document/10285086/