A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning

Big data technology has gained attention in all fields, particularly with regard to research and financial institutions. This technology has changed the world tremendously. Researchers and data scientists are currently working on its applicability in different domains such as health care, medicine,...

Bibliographic Details
Main Authors:	Shafqat Ul Ahsaan, Harleen Kaur, Ashish Kumar Mourya, Sameena Naaz
Format:	Article
Language:	English
Published:	MDPI AG 2022-11-01
Series:	Symmetry
Subjects:	big data; Euclidean distance; heterogeneity; heterogeneous Euclidean overlap metric (HEOM); hybrid support vector machine (H-SVM); k-nearest neighbor (kNN)
Online Access:	https://www.mdpi.com/2073-8994/14/11/2344

_version_	1797466406972817408
author	Shafqat Ul Ahsaan, Harleen Kaur, Ashish Kumar Mourya, Sameena Naaz
collection	DOAJ
description	Big data technology has gained attention in all fields, particularly in research and financial institutions. This technology has changed the world tremendously. Researchers and data scientists are currently working on its applicability in domains such as health care, medicine, and the stock market, among others. Data generated at an unexpected pace from multiple sources, such as social media, health care contexts, and the Internet of Things, have given rise to big data. Managing and processing big data is a challenge for researchers and data scientists because the data are heterogeneous and ambiguous. Heterogeneity is considered an important characteristic of big data. Analyzing heterogeneous data is a very complex task, as it involves the compilation, storage, and processing of varied data based on diverse patterns and rules. This research focuses on the heterogeneity problem in big data. It introduces the hybrid support vector machine (H-SVM) classifier, which uses the support vector machine as a base. In the proposed algorithm, the heterogeneous Euclidean overlap metric (HEOM) and Euclidean distance are used to form clusters and classify the data on the basis of ordinal and nominal values. The performance of the proposed classifier is compared with linear SVM, random forest, and k-nearest neighbor. The proposed algorithm attained the highest accuracy among the classifiers compared.
first_indexed	2024-03-09T18:36:29Z
format	Article
id	doaj.art-b4274f1291904764b0ff86dcfa3d779e
institution	Directory of Open Access Journals
issn	2073-8994
language	English
last_indexed	2024-03-09T18:36:29Z
publishDate	2022-11-01
publisher	MDPI AG
record_format	Article
series	Symmetry
spelling	doaj.art-b4274f1291904764b0ff86dcfa3d779e | 2023-11-24T07:08:49Z | eng | MDPI AG | Symmetry | 2073-8994 | 2022-11-01 | 14 | 11 | 2344 | 10.3390/sym14112344 | A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning | Shafqat Ul Ahsaan (0), Harleen Kaur (1), Ashish Kumar Mourya (2), Sameena Naaz (3) | Department of Computer Science, Jamia Hamdard University, New Delhi 110062, India (listed for each of the four authors) | https://www.mdpi.com/2073-8994/14/11/2344 | big data; Euclidean distance; heterogeneity; heterogeneous Euclidean overlap metric (HEOM); hybrid support vector machine (H-SVM); k-nearest neighbor (kNN)
title	A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning
topic	big data; Euclidean distance; heterogeneity; heterogeneous Euclidean overlap metric (HEOM); hybrid support vector machine (H-SVM); k-nearest neighbor (kNN)
url	https://www.mdpi.com/2073-8994/14/11/2344

A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning
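
The description names the heterogeneous Euclidean overlap metric (HEOM) as the distance used alongside Euclidean distance for records that mix numeric and nominal values. As a rough illustration only, the sketch below implements the standard textbook form of HEOM: an overlap test for nominal attributes, a range-normalized difference for numeric ones, and a maximal distance of 1 for missing values. It is not the authors' H-SVM algorithm, whose details this record does not give. The function name `heom`, its parameters, and the sample records are assumptions made for the example.

```python
import math
from typing import Optional, Sequence, Union

# A record value is numeric, nominal (a string), or missing (None).
Value = Optional[Union[float, str]]


def heom(a: Sequence[Value], b: Sequence[Value],
         nominal: Sequence[bool], ranges: Sequence[float]) -> float:
    """Standard HEOM distance between two mixed-type records.

    nominal[i] is True when attribute i is categorical.
    ranges[i] is (max - min) of numeric attribute i over the data set;
    it is ignored for nominal attributes.
    All names here are hypothetical, chosen for this sketch.
    """
    total = 0.0
    for x, y, is_nominal, rng in zip(a, b, nominal, ranges):
        if x is None or y is None:
            d = 1.0                       # missing value: maximal distance
        elif is_nominal:
            d = 0.0 if x == y else 1.0    # overlap metric
        elif rng == 0:
            d = 0.0                       # constant attribute carries no information
        else:
            d = abs(x - y) / rng          # range-normalized difference
        total += d * d
    return math.sqrt(total)


# Hypothetical records: (age, colour, income)
r1 = (30.0, "red", 50000.0)
r2 = (40.0, "blue", 50000.0)
is_nominal = (False, True, False)
attr_ranges = (50.0, 0.0, 100000.0)

distance = heom(r1, r2, is_nominal, attr_ranges)  # sqrt(0.2**2 + 1**2 + 0**2)
```

Because every per-attribute term lies in [0, 1], no single attribute dominates the sum. A distance of this kind could, for instance, be passed to a nearest-neighbor or clustering step, but how the paper combines it with the SVM is not stated in this record.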

Similar Items