Automatic ICD-10 coding: Deep semantic matching based on analogical reasoning

Background: ICD-10 has been widely used in statistical analysis of mortality rates and medical reimbursement. Automatic ICD-10 coding is desperately needed because manually assigning codes is expensive, time-consuming, and labor-intensive. Diagnoses described in medical records differ significantly...

Bibliographic Details
Main Authors:	Yani Chen, Han Chen, Xudong Lu, Huilong Duan, Shilin He, Jiye An
Format:	Article
Language:	English
Published:	Elsevier 2023-04-01
Series:	Heliyon
Subjects:	Automatic coding ICD-10 Semantic matching Analogical reasoning
Online Access:	http://www.sciencedirect.com/science/article/pii/S2405844023027779

_version_	1827957253502664704
author	Yani Chen Han Chen Xudong Lu Huilong Duan Shilin He Jiye An
author_facet	Yani Chen Han Chen Xudong Lu Huilong Duan Shilin He Jiye An
author_sort	Yani Chen
collection	DOAJ
description	Background: ICD-10 is widely used in the statistical analysis of mortality rates and in medical reimbursement. Automatic ICD-10 coding is badly needed because manual code assignment is expensive, time-consuming, and labor-intensive. Diagnoses as written in medical records differ significantly from the terms used in the ICD-10 classification, so existing automatic coding techniques do not perform well enough to support medical billing, resource allocation, and research requirements. Moreover, most current automatic coding approaches are oriented toward English ICD-10. This paper presents a method for automatically assigning ICD-10 codes to diagnoses extracted from Chinese discharge records. Method: First, BERT creates word representations of the two texts. Second, a context representation layer uses a bidirectional Long Short-Term Memory (BiLSTM) network to incorporate contextual information into the representation at each time step. Third, the matching layer compares each contextual embedding of the uncoded diagnosis record against a weighted version of all contextual character embeddings of the manually coded diagnosis record. The matching strategy applies element-wise subtraction and element-wise multiplication, followed by a neural network layer. Fourth, the matching vectors are combined using a one-layer convolutional neural network, and a sigmoid then outputs the matching result. Results: To evaluate the proposed method, 1,003,558 manually coded primary diagnoses were gathered from the homepage of the discharge medical records. The experimental results showed that the proposed method outperformed popular deep semantic matching algorithms, such as DSSM, ConvNet, ESIM, and ABCNN, and achieved state-of-the-art results on single-text matching, with an accuracy of 0.986, a precision of 0.979, a recall of 0.983, and an F1-score of 0.981.
Conclusion: The proposed deep semantic matching approach based on analogical reasoning is effective for the automatic ICD-10 coding of Chinese diagnoses.
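The matching step described in the Method section can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes dot-product attention for the "weighted version" of the coded record, a ReLU dense layer for the "neural network layer", and max pooling over time as a simplified stand-in for the one-layer convolutional aggregation. The input vectors stand in for BiLSTM outputs, and all function names and weights are hypothetical.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Attention-weighted average of `keys` (dot-product attention is an assumption)."""
    weights = softmax([dot(query, k) for k in keys])
    dim = len(keys[0])
    return [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)]


def match_features(uncoded, coded):
    """Per time step of the uncoded record: element-wise difference and product
    against the attended summary of the coded record, concatenated."""
    features = []
    for h in uncoded:
        a = attend(h, coded)
        diff = [x - y for x, y in zip(h, a)]
        prod = [x * y for x, y in zip(h, a)]
        features.append(diff + prod)
    return features


def dense_relu(vec, weights, bias):
    """One fully connected layer with ReLU (activation choice is an assumption)."""
    return [max(0.0, dot(row, vec) + b) for row, b in zip(weights, bias)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def match_score(uncoded, coded, w_dense, b_dense, w_out, b_out):
    """Probability-like score that the two diagnosis texts match."""
    hidden = [dense_relu(f, w_dense, b_dense) for f in match_features(uncoded, coded)]
    # Max pooling over time replaces the paper's one-layer CNN in this sketch.
    pooled = [max(column) for column in zip(*hidden)]
    return sigmoid(dot(w_out, pooled) + b_out)


# Toy demo with random stand-ins for BiLSTM outputs and untrained weights.
rng = random.Random(0)
dim, hidden_dim = 4, 3
uncoded = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
coded = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(6)]
w_dense = [[rng.uniform(-1, 1) for _ in range(2 * dim)] for _ in range(hidden_dim)]
b_dense = [0.0] * hidden_dim
w_out = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
score = match_score(uncoded, coded, w_dense, b_dense, w_out, 0.0)
print(f"match score: {score:.3f}")
```

With untrained random weights the score carries no meaning. In a trained system the uncoded diagnosis would be scored against previously coded diagnoses, and the ICD-10 code of the best-matching one reused.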
first_indexed	2024-04-09T15:17:26Z
format	Article
id	doaj.art-a8d06c268af543cc84b19e2df44fd66d
institution	Directory Open Access Journal
issn	2405-8440
language	English
last_indexed	2024-04-09T15:17:26Z
publishDate	2023-04-01
publisher	Elsevier
record_format	Article
series	Heliyon
spelling	doaj.art-a8d06c268af543cc84b19e2df44fd66d | 2023-04-29T14:57:08Z | eng | Elsevier | Heliyon | 2405-8440 | 2023-04-01 | 9 | 4 | e15570 | Automatic ICD-10 coding: Deep semantic matching based on analogical reasoning | Yani Chen (0), Han Chen (1), Xudong Lu (2), Huilong Duan (3), Shilin He (4), Jiye An (5) | (0) College of Biomedical Engineering and Instrument Science, Zhejiang University, Zheda Road, 310027 Hangzhou, Zhejiang Province, China | (1) Department of Information, Hainan Hospital of Chinese PLA General Hospital, Haitang Bay, 572013 Sanya, Hainan Province, China | (2) College of Biomedical Engineering and Instrument Science, Zhejiang University, Zheda Road, 310027 Hangzhou, Zhejiang Province, China | (3) College of Biomedical Engineering and Instrument Science, Zhejiang University, Zheda Road, 310027 Hangzhou, Zhejiang Province, China | (4) Department of Information, Hainan Hospital of Chinese PLA General Hospital, Haitang Bay, 572013 Sanya, Hainan Province, China; Corresponding author. Hainan Hospital of Chinese PLA General Hospital, Haitang Bay, 572013 Sanya, Hainan Province, China. | (5) College of Biomedical Engineering and Instrument Science, Zhejiang University, Zheda Road, 310027 Hangzhou, Zhejiang Province, China; Corresponding author. Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang Province, 310058, China. | http://www.sciencedirect.com/science/article/pii/S2405844023027779 | Automatic coding; ICD-10; Semantic matching; Analogical reasoning
spellingShingle	Yani Chen Han Chen Xudong Lu Huilong Duan Shilin He Jiye An Automatic ICD-10 coding: Deep semantic matching based on analogical reasoning Heliyon Automatic coding ICD-10 Semantic matching Analogical reasoning
title	Automatic ICD-10 coding: Deep semantic matching based on analogical reasoning
title_full	Automatic ICD-10 coding: Deep semantic matching based on analogical reasoning
title_fullStr	Automatic ICD-10 coding: Deep semantic matching based on analogical reasoning
title_full_unstemmed	Automatic ICD-10 coding: Deep semantic matching based on analogical reasoning
title_short	Automatic ICD-10 coding: Deep semantic matching based on analogical reasoning
title_sort	automatic icd 10 coding deep semantic matching based on analogical reasoning
topic	Automatic coding ICD-10 Semantic matching Analogical reasoning
url	http://www.sciencedirect.com/science/article/pii/S2405844023027779
work_keys_str_mv	AT yanichen automaticicd10codingdeepsemanticmatchingbasedonanalogicalreasoning AT hanchen automaticicd10codingdeepsemanticmatchingbasedonanalogicalreasoning AT xudonglu automaticicd10codingdeepsemanticmatchingbasedonanalogicalreasoning AT huilongduan automaticicd10codingdeepsemanticmatchingbasedonanalogicalreasoning AT shilinhe automaticicd10codingdeepsemanticmatchingbasedonanalogicalreasoning AT jiyean automaticicd10codingdeepsemanticmatchingbasedonanalogicalreasoning

Similar Items