Summary: | Judicial named entity recognition (JNER) is a fundamental task in judicial intelligence and the informatization of judicial services, and it has attracted extensive research attention. However, existing JNER methods can usually assign only a single label to each token in the input sequence, so they cannot handle nested entities, in which a token may carry two or more labels at the same time. This paper therefore introduces the machine reading comprehension (MRC) framework into JNER and proposes an MRC-based method for judicial nested NER. First, we design question templates according to the characteristics of judicial nested named entities and construct a legal-text named entity dataset in MRC format. Next, we use a span-extraction MRC model built on a pre-trained language model to encode the question together with the text, so that the model learns the contextual knowledge about the entity that the question carries. Finally, two classifiers predict the start and end positions of the matching spans, respectively, from which the corresponding entities are extracted. Experimental results on the information extraction dataset of “CAIL2021” show that, compared with existing baseline models, the proposed method improves recognition of the nested entities that are common in the judicial domain.
|
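The MRC formulation summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the entity types (`PER`, `LOC`, `ORG`), the question wording, the example sentence, and the nearest-end pairing rule in `decode_spans` are all assumptions made for illustration, and the gold labels stand in for the outputs of the two classifiers. The point it shows is that asking one question per entity type lets the same token belong to several entities, which a single-label tagging scheme cannot express.

```python
# Hypothetical question templates, one per entity type (not taken from the paper).
QUESTION_TEMPLATES = {
    "PER": "Which persons are mentioned in the text?",
    "LOC": "Which locations are mentioned in the text?",
    "ORG": "Which organizations are mentioned in the text?",
}


def to_mrc_examples(tokens, entities):
    """Build one MRC example per entity type.

    entities: list of (type, start, end) with inclusive token indices.
    Each example carries binary start/end label sequences, which is what
    the two classifiers described in the summary would be trained to predict.
    """
    examples = []
    for etype, question in QUESTION_TEMPLATES.items():
        start_labels = [0] * len(tokens)
        end_labels = [0] * len(tokens)
        for t, start, end in entities:
            if t == etype:
                start_labels[start] = 1
                end_labels[end] = 1
        examples.append({
            "type": etype,
            "question": question,
            "context": tokens,
            "start_labels": start_labels,
            "end_labels": end_labels,
        })
    return examples


def decode_spans(start_labels, end_labels):
    """Pair each start with the nearest end at or after it.

    This pairing heuristic is an assumption of this sketch; the summary
    does not say how start and end positions are matched.
    """
    spans = []
    for i, is_start in enumerate(start_labels):
        if not is_start:
            continue
        for j in range(i, len(end_labels)):
            if end_labels[j]:
                spans.append((i, j))
                break
    return spans


# Illustrative sentence: "Beijing" (LOC) is nested inside "Beijing First Court" (ORG).
tokens = ["Zhang", "San", "sued", "Beijing", "First", "Court"]
entities = [("PER", 0, 1), ("LOC", 3, 3), ("ORG", 3, 5)]

examples = to_mrc_examples(tokens, entities)
by_type = {
    ex["type"]: decode_spans(ex["start_labels"], ex["end_labels"])
    for ex in examples
}
for etype, spans in by_type.items():
    print(etype, [" ".join(tokens[s:e + 1]) for s, e in spans])
```

Because each entity type is queried separately, the token "Beijing" is recovered both as a location and as part of an organization without any label conflict.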