Construction of cardiovascular information extraction corpus based on electronic medical records
Cardiovascular disease has a significant impact on both society and patients, making it necessary to conduct knowledge-based research such as research that utilizes knowledge graphs and automated question answering. However, the existing research on corpus construction for cardiovascular disease is...
Main Authors: | Hongyang Chang, Hongying Zan, Shuai Zhang, Bingfei Zhao, Kunli Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | AIMS Press, 2023-06-01 |
Series: | Mathematical Biosciences and Engineering |
Subjects: | cardiovascular disease; corpus construction; electronic medical record |
Online Access: | https://www.aimspress.com/article/doi/10.3934/mbe.2023596?viewType=HTML |
_version_ | 1827915200245792768 |
---|---|
author | Hongyang Chang Hongying Zan Shuai Zhang Bingfei Zhao Kunli Zhang |
collection | DOAJ |
description | Cardiovascular disease has a significant impact on both society and patients, making it necessary to conduct knowledge-based research such as research that utilizes knowledge graphs and automated question answering. However, the existing research on corpus construction for cardiovascular disease is relatively limited, which has hindered further knowledge-based research on this disease. Electronic medical records contain patient data that span the entire diagnosis and treatment process and include a large amount of reliable medical information. Therefore, we collected electronic medical record data related to cardiovascular disease, combined the data with relevant work experience, and developed a standard for labeling cardiovascular electronic medical record entities and entity relations. By building a sentence-level labeling result dictionary with a rule-based semi-automatic method, a cardiovascular electronic medical record entity and entity relationship labeling corpus (CVDEMRC) was constructed. The CVDEMRC contains 7691 entities and 11,185 entity relation triples. The consistency examination yielded 93.51% for entity annotations and 84.02% for entity-relation annotations, demonstrating good annotation consistency. The CVDEMRC constructed in this study is expected to provide a database for information extraction research related to cardiovascular diseases. |
first_indexed | 2024-03-13T02:55:07Z |
format | Article |
id | doaj.art-f9a7e1928e6644da938206ca9561d6d2 |
institution | Directory of Open Access Journals |
issn | 1551-0018 |
language | English |
last_indexed | 2024-03-13T02:55:07Z |
publishDate | 2023-06-01 |
publisher | AIMS Press |
record_format | Article |
series | Mathematical Biosciences and Engineering |
spelling | doaj.art-f9a7e1928e6644da938206ca9561d6d2; 2023-06-28T06:41:36Z; eng; AIMS Press; Mathematical Biosciences and Engineering; ISSN 1551-0018; 2023-06-01; vol. 20, no. 7, pp. 13379-13397; DOI 10.3934/mbe.2023596; Construction of cardiovascular information extraction corpus based on electronic medical records; Hongyang Chang (1), Hongying Zan (1, 2), Shuai Zhang (1), Bingfei Zhao (1), Kunli Zhang (1, 2); Affiliations: 1. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China; 2. Peng Cheng Laboratory, Shenzhen, China; https://www.aimspress.com/article/doi/10.3934/mbe.2023596?viewType=HTML; Keywords: cardiovascular disease, corpus construction, electronic medical record |
title | Construction of cardiovascular information extraction corpus based on electronic medical records |
topic | cardiovascular disease corpus construction electronic medical record |
url | https://www.aimspress.com/article/doi/10.3934/mbe.2023596?viewType=HTML |