An Automatic Generation Approach of the Cyber Threat Intelligence Records Based on Multi-Source Information Fusion

Bibliographic Details
Main Authors: Tianfang Sun, Pin Yang, Mengming Li, Shan Liao
Format: Article
Language: English
Published: MDPI AG 2021-02-01
Series: Future Internet
Subjects:
Online Access: https://www.mdpi.com/1999-5903/13/2/40
author Tianfang Sun
Pin Yang
Mengming Li
Shan Liao
author_sort Tianfang Sun
collection DOAJ
description As cyber threats grow more severe, collecting cyber threat intelligence (CTI) from open-source threat intelligence publishing platforms (OSTIPs) can help information security personnel grasp relevant public opinion, handle emergency events, and even confront advanced persistent threats. However, because of the explosive growth of information shared on multi-type OSTIPs, manual CTI collection is inefficient. Articles published on OSTIPs are unstructured, which makes it challenging to gather CTI records automatically through natural language processing (NLP) methods alone. To address these limitations, this paper proposes an automatic approach to generate CTI records based on multi-type OSTIPs (GCO), combining NLP methods, machine learning methods, and cybersecurity threat intelligence knowledge. The experimental results show that the proposed GCO outperformed some state-of-the-art approaches on article classification and cybersecurity intelligence details (CSIs) extraction, with accuracy, precision, and recall all over 93%. The generated records, stored in a Neo4j-based CTI database, can help reveal malicious threat groups.
first_indexed 2024-03-09T06:02:24Z
format Article
id doaj.art-7fd9295be70f4979972e3912542cdf69
institution Directory Open Access Journal
issn 1999-5903
language English
last_indexed 2024-03-09T06:02:24Z
publishDate 2021-02-01
publisher MDPI AG
record_format Article
series Future Internet
spelling doaj.art-7fd9295be70f4979972e3912542cdf69
2023-12-03T12:07:38Z
eng
MDPI AG
Future Internet, ISSN 1999-5903
2021-02-01, 13(2), 40
10.3390/fi13020040
An Automatic Generation Approach of the Cyber Threat Intelligence Records Based on Multi-Source Information Fusion
Tianfang Sun, Pin Yang, Mengming Li, Shan Liao
College of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China (all four authors)
https://www.mdpi.com/1999-5903/13/2/40
cyber threat intelligence; open-source threat intelligence platform; natural language processing; machine learning; information extraction; text analytics
title An Automatic Generation Approach of the Cyber Threat Intelligence Records Based on Multi-Source Information Fusion
title_sort automatic generation approach of the cyber threat intelligence records based on multi source information fusion
topic cyber threat intelligence
open-source threat intelligence platform
natural language processing
machine learning
information extraction
text analytics
url https://www.mdpi.com/1999-5903/13/2/40