Extraction and classification of risk-related sentences from securities reports

With the drastically changing business environment, it is difficult even for experts to properly extract and classify risk statements from securities reports, which contain large volumes of unstructured information. Several methods have been proposed, but the existing methods face difficulty in dea...

Bibliographic Details
Main Authors: Motomasa Fujii, Hiroki Sakaji, Shigeru Masuyama, Hajime Sasaki
Format: Article
Language: English
Published: Elsevier 2022-11-01
Series: International Journal of Information Management Data Insights
Subjects: Risk analysis; Financial text mining; Risk extraction; Risk classification
Online Access: http://www.sciencedirect.com/science/article/pii/S2667096822000398
collection DOAJ
description With the drastically changing business environment, it is difficult even for experts to properly extract and classify risk statements from securities reports, which contain large volumes of unstructured information. Several methods have been proposed, but the existing methods face difficulty in dealing with flexible risk expressions. This study presents an open-domain risk-analysis framework that combines the strengths of both humans and machines. The framework includes defining appropriate business risks and constructing supervised data based on those definitions. Risks were then extracted and classified from the securities reports of a representative group of Japanese companies. We confirmed the limitations of pattern matching and the usefulness of contextual analysis methods. We also confirmed the importance of constructing supervised data based on appropriate guidelines for data classification. This study presents a framework that quickly and effectively derives the risk structure of a given industry or company from vast and unstructured information.
first_indexed 2024-04-13T13:16:15Z
format Article
id doaj.art-7e65fbbeba804fe1a83a9bebbfe3c4ad
institution Directory Open Access Journal
issn 2667-0968
language English
last_indexed 2024-04-13T13:16:15Z
publishDate 2022-11-01
publisher Elsevier
record_format Article
series International Journal of Information Management Data Insights
spelling doaj.art-7e65fbbeba804fe1a83a9bebbfe3c4ad (2022-12-22T02:45:28Z). International Journal of Information Management Data Insights (Elsevier, ISSN 2667-0968), vol. 2, no. 2, article 100096, 2022-11-01.
Motomasa Fujii: Graduate School of Management, Tokyo University of Science, Tokyo, Japan
Hiroki Sakaji: Graduate School of Engineering, The University of Tokyo, Tokyo, Japan
Shigeru Masuyama: Graduate School of Management, Tokyo University of Science, Tokyo, Japan
Hajime Sasaki (corresponding author): Institute for Future Initiatives, The University of Tokyo, Tokyo, Japan
title Extraction and classification of risk-related sentences from securities reports
topic Risk analysis
Financial text mining
Risk extraction
Risk classification
url http://www.sciencedirect.com/science/article/pii/S2667096822000398