Summary: | Narrative reports in medical records contain abundant clinical information that can be converted into structured data for managing patient information and predicting disease trends. Although various rule-based and machine-learning methods are available for extracting information from electronic medical records (EMRs), few studies have explored hybrid methods for extracting information from Chinese EMRs. In this paper, we develop a novel hybrid approach that integrates rules with a bidirectional long short-term memory model with a conditional random field layer (BiLSTM-CRF) to extract clinical entities and attributes. A corpus of 1509 electronic notes (discharge summaries and operation notes) was annotated. Annotations from three clinicians were reconciled to form a gold-standard dataset. The performance of our method was assessed by calculating precision, recall, and F-measure under two boundary matching strategies. The experimental results demonstrate the effectiveness of our method for clinical information extraction from Chinese EMRs.
|