Learning explicit and implicit Arabic discourse relations

In this paper, we propose a supervised learning approach to identify discourse relations in Arabic texts. To our knowledge, this work represents the first attempt to focus on both explicit and implicit relations that link adjacent as well as non-adjacent Elementary Discourse Units (EDUs) within the S...

Bibliographic Details
Main Authors:	Iskandar Keskes, Farah Benamara Zitoune, Lamia Hadrich Belguith
Format:	Article
Language:	English
Published:	Elsevier 2014-12-01
Series:	Journal of King Saud University: Computer and Information Sciences
Subjects:	Discourse relations; Segmented Discourse Representation Theory; Arabic language
Online Access:	http://www.sciencedirect.com/science/article/pii/S1319157814000251

author	Iskandar Keskes, Farah Benamara Zitoune, Lamia Hadrich Belguith
author_sort	Iskandar Keskes
collection	DOAJ
description	In this paper, we propose a supervised learning approach to identify discourse relations in Arabic texts. To our knowledge, this work represents the first attempt to focus on both explicit and implicit relations that link adjacent as well as non-adjacent Elementary Discourse Units (EDUs) within Segmented Discourse Representation Theory (SDRT). We use the Discourse Arabic Treebank corpus (D-ATB), which is composed of newspaper documents extracted from the syntactically annotated Arabic Treebank v3.2 part3, where each document is associated with a complete discourse graph according to the cognitive principles of SDRT. Our list of discourse relations is composed of a three-level hierarchy of 24 relations grouped into 4 top-level classes. To learn them automatically, we use state-of-the-art features whose efficiency has been empirically demonstrated. We investigate how each feature contributes to the learning process. We report our experiments on identifying fine-grained discourse relations, mid-level classes, and top-level classes. We compare our approach with three baselines based on the most frequent relation, discourse connectives, and the features used by Al-Saif and Markert (2011). Our results are very encouraging and outperform all the baselines, with an F-score of 78.1% and an accuracy of 80.6%.
first_indexed	2024-12-14T01:58:38Z
format	Article
id	doaj.art-9025bcbbc5024d44a8088814b57a462c
institution	Directory of Open Access Journals
issn	1319-1578
language	English
last_indexed	2024-12-14T01:58:38Z
publishDate	2014-12-01
publisher	Elsevier
record_format	Article
series	Journal of King Saud University: Computer and Information Sciences
spelling	doaj.art-9025bcbbc5024d44a8088814b57a462c | 2022-12-21T23:21:06Z | eng | Elsevier | Journal of King Saud University: Computer and Information Sciences | ISSN 1319-1578 | 2014-12-01 | vol. 26, no. 4, pp. 398-416 | doi:10.1016/j.jksuci.2014.06.001 | Learning explicit and implicit Arabic discourse relations | Iskandar Keskes (ANLP Research Group, MIRACL Lab-Sfax University, Tunisia & IRIT-Toulouse University, France); Farah Benamara Zitoune (IRIT-Toulouse University, France); Lamia Hadrich Belguith (ANLP Research Group, MIRACL Lab-Sfax University, Tunisia) | http://www.sciencedirect.com/science/article/pii/S1319157814000251 | Discourse relations; Segmented Discourse Representation Theory; Arabic language
title	Learning explicit and implicit Arabic discourse relations
topic	Discourse relations; Segmented Discourse Representation Theory; Arabic language
url	http://www.sciencedirect.com/science/article/pii/S1319157814000251

Similar Items