ANAD: Arabic news article dataset
In this paper, we present a modern standard Arabic dataset based on Arabic news articles collected over a one-year period from 01/01/2021 to 12/31/2021. In total, from 12 Arabic news websites, over 500,000 articles were collected, the selection of which was driven by a variety of topics, including s...
Main Authors: | Mohammed Altamimi, Abdulaziz M. Alayba |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2023-10-01 |
Series: | Data in Brief |
Subjects: | Arabic news articles; Data analysis; Classification; Natural language processing (NLP) |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2352340923005607 |
_version_ | 1797660464332668928 |
---|---|
author | Mohammed Altamimi; Abdulaziz M. Alayba |
author_sort | Mohammed Altamimi |
collection | DOAJ |
description | In this paper, we present a modern standard Arabic dataset based on Arabic news articles collected over a one-year period from 01/01/2021 to 12/31/2021. In total, from 12 Arabic news websites, over 500,000 articles were collected, the selection of which was driven by a variety of topics, including sports, economies, local news, politics, tech, tourism, entertainment, cars, health, and art. The development of this dataset will enable data scientists to explore and experiment effectively in the field of natural language processing, and the dataset can also be used to develop machine learning and deep learning models to classify articles according to topic. The dataset is available for download at https://github.com/alaybaa/ArabicArticlesDataset/tree/main. |
first_indexed | 2024-03-11T18:31:11Z |
format | Article |
id | doaj.art-ab153bffb8964010a788149ca0dec8f0 |
institution | Directory of Open Access Journals |
issn | 2352-3409 |
language | English |
last_indexed | 2024-03-11T18:31:11Z |
publishDate | 2023-10-01 |
publisher | Elsevier |
record_format | Article |
series | Data in Brief |
spelling | doaj.art-ab153bffb8964010a788149ca0dec8f0; 2023-10-13T11:04:42Z; eng; Elsevier; Data in Brief; 2352-3409; 2023-10-01; 50; 109460; ANAD: Arabic news article dataset; Mohammed Altamimi (0); Abdulaziz M. Alayba (1); (0) Corresponding authors.; Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha'il, Ha'il, 81481, Saudi Arabia; (1) Corresponding authors.; Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha'il, Ha'il, 81481, Saudi Arabia; In this paper, we present a modern standard Arabic dataset based on Arabic news articles collected over a one-year period from 01/01/2021 to 12/31/2021. In total, from 12 Arabic news websites, over 500,000 articles were collected, the selection of which was driven by a variety of topics, including sports, economies, local news, politics, tech, tourism, entertainment, cars, health, and art. The development of this dataset will enable data scientists to explore and experiment effectively in the field of natural language processing, and the dataset can also be used to develop machine learning and deep learning models to classify articles according to topic. The dataset is available for download at https://github.com/alaybaa/ArabicArticlesDataset/tree/main.; http://www.sciencedirect.com/science/article/pii/S2352340923005607; Arabic news articles; Data analysis; Classification; Natural language processing (NLP) |
title | ANAD: Arabic news article dataset |
topic | Arabic news articles; Data analysis; Classification; Natural language processing (NLP) |
url | http://www.sciencedirect.com/science/article/pii/S2352340923005607 |