An Automatic Generation Approach of the Cyber Threat Intelligence Records Based on Multi-Source Information Fusion
With the progressive deterioration of cyber threats, collecting cyber threat intelligence (CTI) from open-source threat intelligence publishing platforms (OSTIPs) can help information security personnel grasp public opinions with specific pertinence, handle emergency events, and even confront the ad...
Main Authors: | Tianfang Sun, Pin Yang, Mengming Li, Shan Liao |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-02-01 |
Series: | Future Internet |
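Subjects: | cyber threat intelligence; open-source threat intelligence platform; natural language processing; machine learning; information extraction; text analytics |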
Online Access: | https://www.mdpi.com/1999-5903/13/2/40 |
_version_ | 1797416381046587392 |
---|---|
author | Tianfang Sun, Pin Yang, Mengming Li, Shan Liao |
author_sort | Tianfang Sun |
collection | DOAJ |
description | As cyber threats grow more severe, collecting cyber threat intelligence (CTI) from open-source threat intelligence publishing platforms (OSTIPs) can help information security personnel track relevant public opinion, handle emergency events, and even confront advanced persistent threats. However, because of the explosive growth of information shared on multi-type OSTIPs, collecting CTI manually has become inefficient. Articles published on OSTIPs are unstructured, which makes it challenging to gather CTI records automatically using natural language processing (NLP) methods alone. To address these limitations, this paper proposes an automatic approach to generating CTI records based on multi-type OSTIPs (GCO), combining NLP methods, machine learning methods, and cybersecurity threat intelligence knowledge. The experimental results demonstrate that the proposed GCO outperformed some state-of-the-art approaches on article classification and cybersecurity intelligence details (CSIs) extraction, with accuracy, precision, and recall all over 93%. The generated records in the Neo4j-based CTI database can also help reveal malicious threat groups. |
first_indexed | 2024-03-09T06:02:24Z |
format | Article |
id | doaj.art-7fd9295be70f4979972e3912542cdf69 |
institution | Directory Open Access Journal |
issn | 1999-5903 |
language | English |
last_indexed | 2024-03-09T06:02:24Z |
publishDate | 2021-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Future Internet |
spelling | doaj.art-7fd9295be70f4979972e3912542cdf69; 2023-12-03T12:07:38Z; eng; MDPI AG; Future Internet; ISSN 1999-5903; 2021-02-01; vol. 13, no. 2, art. 40; doi:10.3390/fi13020040; An Automatic Generation Approach of the Cyber Threat Intelligence Records Based on Multi-Source Information Fusion; Tianfang Sun, Pin Yang, Mengming Li, Shan Liao (all: College of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China); https://www.mdpi.com/1999-5903/13/2/40; cyber threat intelligence; open-source threat intelligence platform; natural language processing; machine learning; information extraction; text analytics |
title | An Automatic Generation Approach of the Cyber Threat Intelligence Records Based on Multi-Source Information Fusion |
topic | cyber threat intelligence; open-source threat intelligence platform; natural language processing; machine learning; information extraction; text analytics |
url | https://www.mdpi.com/1999-5903/13/2/40 |