Cancer Registry Coding via Hybrid Neural Symbolic Systems in the Cross-Hospital Setting
Cancer registries are critical databases for cancer research whose maintenance requires various types of domain knowledge with labor-intensive data curation. In order to facilitate the curation process with high quality in a timely manner, we developed a hybrid neural symbolic system for cancer regi...
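Main Authors: | Hong-Jie Dai, Yi-Hsin Yang, Ti-Hao Wang, Yan-Jie Lin, Pin-Jou Lu, Chung-Yang Wu, Yu-Cheng Chang, You-Qian Lee, You-Chen Zhang, Yuan-Chi Hsu, Han-Hsiang Wu, Cheng-Rong Ke, Chih-Jen Huang, Yu-Tsang Wang, Sheau-Fang Yang, Kuan-Chung Hsiao, Ko-Jiunn Liu, Li-Tzong Chen, I-Shou Chang, K. S. Clifford Chao, Tsang-Wu Liu |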
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
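Subjects: | Electronic medical records, medical expert systems, medical information systems, natural language processing |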
Online Access: | https://ieeexplore.ieee.org/document/9493200/ |
_version_ | 1818915887134015488 |
---|---|
author | Hong-Jie Dai, Yi-Hsin Yang, Ti-Hao Wang, Yan-Jie Lin, Pin-Jou Lu, Chung-Yang Wu, Yu-Cheng Chang, You-Qian Lee, You-Chen Zhang, Yuan-Chi Hsu, Han-Hsiang Wu, Cheng-Rong Ke, Chih-Jen Huang, Yu-Tsang Wang, Sheau-Fang Yang, Kuan-Chung Hsiao, Ko-Jiunn Liu, Li-Tzong Chen, I-Shou Chang, K. S. Clifford Chao, Tsang-Wu Liu |
author_sort | Hong-Jie Dai |
collection | DOAJ |
description | Cancer registries are critical databases for cancer research whose maintenance requires various types of domain knowledge with labor-intensive data curation. In order to facilitate the curation process with high quality in a timely manner, we developed a hybrid neural symbolic system for cancer registry coding. Unlike previous works, which mainly used datasets collected from a single hospital or formulated the task as a text classification problem, we collaborated with two medical centers in Taiwan to compile a cross-hospital corpus and applied neural networks to extract cancer registry variables described in unstructured pathology reports, along with expert systems for generating registry coding. We conducted experiments to study the feasibility of the proposed hybrid system for the task of cancer registry coding and compared its performance with state-of-the-art non-hybrid approaches. Furthermore, cross-hospital experiments were performed to study the advantages and limitations of transfer learning for processing reports from different sources. The experimental results demonstrated that the proposed hybrid neural symbolic system is a robust approach that works well across hospitals and outperformed classification-based baselines by 0.13 to 0.27 in F-score. Compared to the baseline models, the F-scores of the proposed approaches were clearly higher when fewer training instances were used. All methods benefited from the transferred parameters learned from the source dataset, but the results suggest that it is a better strategy to transfer the learned knowledge through the concept recognition task followed by the symbolic expert system to address the task of cancer registry coding. |
first_indexed | 2024-12-20T00:09:25Z |
format | Article |
id | doaj.art-eced4a6f45294384a9d0954e5b72153c |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-20T00:09:25Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-eced4a6f45294384a9d0954e5b72153c; 2022-12-21T20:00:33Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; vol. 9, pp. 112081-112096; 10.1109/ACCESS.2021.3099175; 9493200; Cancer Registry Coding via Hybrid Neural Symbolic Systems in the Cross-Hospital Setting; Authors: Hong-Jie Dai [0] (https://orcid.org/0000-0002-1516-7255), Yi-Hsin Yang [1], Ti-Hao Wang [2] (https://orcid.org/0000-0002-2105-0133), Yan-Jie Lin [3], Pin-Jou Lu [4], Chung-Yang Wu [5], Yu-Cheng Chang [6], You-Qian Lee [7], You-Chen Zhang [8], Yuan-Chi Hsu [9], Han-Hsiang Wu [10], Cheng-Rong Ke [11], Chih-Jen Huang [12], Yu-Tsang Wang [13], Sheau-Fang Yang [14], Kuan-Chung Hsiao [15], Ko-Jiunn Liu [16] (https://orcid.org/0000-0001-6695-9159), Li-Tzong Chen [17], I-Shou Chang [18], K. S. Clifford Chao [19], Tsang-Wu Liu [20]; Affiliations, in listed order: Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; National Institute of Cancer Research, National Health Research Institutes, Tainan, Taiwan; Department of Radiation Oncology, China Medical University Hospital, China Medical University, Taichung, Taiwan; Institute of Population Health Sciences, National Health Research Institutes, Miaoli, Taiwan; Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Cancer Center, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan; Department of Medical Research, Division of Medical Statistics and Bioinformatics, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan; Department of Pathology, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan; National Institute of Cancer Research, National Health Research Institutes, Tainan, Taiwan; National Institute of Cancer Research, National Health Research Institutes, Tainan, Taiwan; National Institute of Cancer Research, National Health Research Institutes, Tainan, Taiwan; Institute of Population Health Sciences, National Health Research Institutes, Miaoli, Taiwan; Cancer Center, China Medical University Hospital, China Medical University, Taichung, Taiwan; National Institute of Cancer Research, National Health Research Institutes, Miaoli, Taiwan; https://ieeexplore.ieee.org/document/9493200/; Electronic medical records, medical expert systems, medical information systems, natural language processing |
title | Cancer Registry Coding via Hybrid Neural Symbolic Systems in the Cross-Hospital Setting |
topic | Electronic medical records, medical expert systems, medical information systems, natural language processing |
url | https://ieeexplore.ieee.org/document/9493200/ |