A Data-Driven Exploration of a New Islamic Fatwas Dataset for Arabic NLP Tasks

Bibliographic Details
Main Authors: Ohoud Alyemny, Hend Al-Khalifa, Abdulrahman Mirza
Format: Article
Language: English
Published: MDPI AG 2023-10-01
Series: Data
Subjects: fatwas; exploratory data analysis; Islamic content; natural language processing
Online Access: https://www.mdpi.com/2306-5729/8/10/155
author Ohoud Alyemny
Hend Al-Khalifa
Abdulrahman Mirza
collection DOAJ
description Islamic content is a broad and diverse domain that encompasses various sources, topics, and perspectives. However, comprehensive and reliable datasets that can support studies of Islamic content are lacking. In this paper, we present fatwaset, the first public Arabic dataset of Islamic fatwas. It contains fatwas that we collected from various trusted and authenticated sources in the Islamic fatwa domain, such as agencies, religious scholars, and websites. Fatwaset is a rich resource because it contains not only fatwas but also a considerable set of their surrounding metadata. It can be used for many natural language processing (NLP) tasks, such as language modeling, question answering, author attribution, topic identification, text classification, and text summarization. It can also support other domains related to Islamic culture, such as philosophy and language art. We describe the methodology and criteria we used to select the content, as well as the challenges and limitations we faced. Additionally, we perform an Exploratory Data Analysis (EDA) that investigates the dataset from different perspectives. The results of the EDA reveal important information that benefits researchers in this area.
first_indexed 2024-03-10T21:19:14Z
format Article
id doaj.art-143b07e95ab64a2eb9d0f10dc7108b50
institution Directory Open Access Journal
issn 2306-5729
language English
last_indexed 2024-03-10T21:19:14Z
publishDate 2023-10-01
publisher MDPI AG
record_format Article
series Data
spelling doaj.art-143b07e95ab64a2eb9d0f10dc7108b50
2023-11-19T16:11:31Z
eng
MDPI AG
Data
2306-5729
2023-10-01
8
10
155
10.3390/data8100155
A Data-Driven Exploration of a New Islamic Fatwas Dataset for Arabic NLP Tasks
Ohoud Alyemny (0)
Hend Al-Khalifa (1)
Abdulrahman Mirza (2)
Department of Information Systems, College of Computer and Information Sciences, King Saud University, Riyadh 12372, Saudi Arabia
Department of Information Technology, College of Computer and Information Sciences, King Saud University, Riyadh 12372, Saudi Arabia
Department of Information Systems, College of Computer and Information Sciences, King Saud University, Riyadh 12372, Saudi Arabia
https://www.mdpi.com/2306-5729/8/10/155
fatwas
exploratory data analysis
Islamic content
natural language processing
topic fatwas
exploratory data analysis
Islamic content
natural language processing
url https://www.mdpi.com/2306-5729/8/10/155