Semantic annotation of a Japanese speech corpus
This paper describes the semantic annotations we are performing on the CallHome Japanese corpus of spontaneous, unscripted telephone conversations (LDC, 1996). Our annotations include (i) semantic classes for all nouns and verbs; (ii) verb senses for all main verbs; and (iii) relations between main...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Conference Paper |
Language: | English |
Published: |
2010
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/83172 http://hdl.handle.net/10220/6434 |
Summary: | This paper describes the semantic annotations we are performing on the CallHome Japanese corpus of spontaneous, unscripted telephone conversations (LDC, 1996). Our annotations include (i) semantic classes for all nouns and verbs; (ii) verb senses for all main verbs; and (iii) relations between main verbs and their complements in the same utterance. Our semantic tagset is taken from NTT's Goi-Taikei semantic lexicon and ontology (Ikehara et al., 1997). A pilot study demonstrates that the verb sense tagging can be efficiently performed by native Japanese speakers using computergenerated HTML forms, and that good interannotator reliability can be obtained in the right conditions. |
---|