Large Dataset Classification Using Parallel Processing Concept

Much attention has been paid to large data technologies in the past few years, mainly because of their capacity to affect business analytics and data mining practices and their potential to enable a range of highly effective decision-making tools. With the current increase in the number o...

Bibliographic Details
Main Authors:	Aljanabi, Mohammad; Ebraheem, Hind Ra'ad; Hussain, Zahraa Faiz; Mohd Farhan, Md Fudzee; Shahreen, Kasim; Mohd Arfian, Ismail; Meidelfie, Dwiny; Eriandae, Aldo
Format:	Article
Language:	English
Published:	Department of Information Technology - Politeknik Negeri Padang 2020
Subjects:	QA Mathematics; QA75 Electronic computers. Computer science
Online Access:	http://umpir.ump.edu.my/id/eprint/30480/1/Large%20Dataset%20Classification.pdf

_version_	1825813691400978432
author	Aljanabi, Mohammad; Ebraheem, Hind Ra'ad; Hussain, Zahraa Faiz; Mohd Farhan, Md Fudzee; Shahreen, Kasim; Mohd Arfian, Ismail; Meidelfie, Dwiny; Eriandae, Aldo
author_facet	Aljanabi, Mohammad; Ebraheem, Hind Ra'ad; Hussain, Zahraa Faiz; Mohd Farhan, Md Fudzee; Shahreen, Kasim; Mohd Arfian, Ismail; Meidelfie, Dwiny; Eriandae, Aldo
author_sort	Aljanabi, Mohammad
collection	UMP
description	Much attention has been paid to large data technologies in the past few years, mainly because of their capacity to affect business analytics and data mining practices and their potential to enable a range of highly effective decision-making tools. With the current increase in the number of modern applications (including social media and other web-based and healthcare applications) that generate large amounts of data in different forms and volumes, processing such data with conventional data processing tools is becoming a challenge. This has led to the emergence of big data analytics, which comes with many challenges of its own. This paper introduces the use of principal component analysis (PCA) for data size reduction, followed by SVM parallelization. The proposed scheme was executed on the Spark platform, and the experimental findings show that it reduces the classifiers' classification time without much effect on their classification accuracy.
first_indexed	2024-03-06T12:47:46Z
format	Article
id	UMPir30480
institution	Universiti Malaysia Pahang
language	English
last_indexed	2024-03-06T12:47:46Z
publishDate	2020
publisher	Department of Information Technology - Politeknik Negeri Padang
record_format	dspace
spelling	UMPir30480 2021-01-12T07:36:30Z http://umpir.ump.edu.my/id/eprint/30480/ Large Dataset Classification Using Parallel Processing Concept. Aljanabi, Mohammad; Ebraheem, Hind Ra'ad; Hussain, Zahraa Faiz; Mohd Farhan, Md Fudzee; Shahreen, Kasim; Mohd Arfian, Ismail; Meidelfie, Dwiny; Eriandae, Aldo. QA Mathematics; QA75 Electronic computers. Computer science. Department of Information Technology - Politeknik Negeri Padang 2020. Article PeerReviewed pdf en cc_by_sa_4 http://umpir.ump.edu.my/id/eprint/30480/1/Large%20Dataset%20Classification.pdf Aljanabi, Mohammad and Ebraheem, Hind Ra'ad and Hussain, Zahraa Faiz and Mohd Farhan, Md Fudzee and Shahreen, Kasim and Mohd Arfian, Ismail and Meidelfie, Dwiny and Eriandae, Aldo (2020) Large Dataset Classification Using Parallel Processing Concept. JOIV: International Journal on Informatics Visualization, 4 (4). pp. 191-194. ISSN 2549-9904. (Published) http://dx.doi.org/10.30630/joiv.4.4.361
title	Large Dataset Classification Using Parallel Processing Concept
title_sort	large dataset classification using parallel processing concept
topic	QA Mathematics; QA75 Electronic computers. Computer science
url	http://umpir.ump.edu.my/id/eprint/30480/1/Large%20Dataset%20Classification.pdf

Similar Items