Generating Synthetic Training Data for Supervised De-Identification of Electronic Health Records
A major hurdle in the development of natural language processing (NLP) methods for Electronic Health Records (EHRs) is the lack of large, annotated datasets. Privacy concerns prevent the distribution of EHRs, and the annotation of data is known to be costly and cumbersome. Synthetic data presents a...
Main Authors: | Claudia Alessandra Libbi, Jan Trienes, Dolf Trieschnigg, Christin Seifert |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-05-01 |
Series: | Future Internet |
Subjects: | |
Online Access: | https://www.mdpi.com/1999-5903/13/5/136 |
Similar Items
- NERSkill.Id: Annotated dataset of Indonesian's skill entity recognition
  by: Meilany Nonsi Tentua, et al.
  Published: (2024-04-01)
- Named Entity Recognition Utilized to Enhance Text Classification While Preserving Privacy
  by: Mohammed Kutbi
  Published: (2023-01-01)
- Deep learning with language models improves named entity recognition for PharmaCoNER
  by: Cong Sun, et al.
  Published: (2021-12-01)
- Natural Language Processing to Extract Information from Portuguese-Language Medical Records
  by: Naila Camila da Rocha, et al.
  Published: (2022-12-01)
- Thai Named Entity Recognition Using BiLSTM-CNN-CRF Enhanced by TCC
  by: Virach Sornlertlamvanich, et al.
  Published: (2022-01-01)