A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning

Big data technology has gained attention in all fields, particularly with regard to research and financial institutions. This technology has changed the world tremendously. Researchers and data scientists are currently working on its applicability in different domains such as health care, medicine,...

Bibliographic Details
Main Authors: Shafqat Ul Ahsaan, Harleen Kaur, Ashish Kumar Mourya, Sameena Naaz
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: Symmetry
Subjects:
Online Access: https://www.mdpi.com/2073-8994/14/11/2344
collection DOAJ
description Big data technology has gained attention in all fields, particularly in research and financial institutions. This technology has changed the world tremendously. Researchers and data scientists are currently working on its applicability in domains such as health care, medicine, and the stock market. Data generated at an unexpected pace from multiple sources, such as social media, health care, and the Internet of Things, has given rise to big data. Managing and processing big data is a challenge for researchers and data scientists because the data are heterogeneous and ambiguous. Heterogeneity is considered an important characteristic of big data. Analyzing heterogeneous data is a complex task, as it involves compiling, storing, and processing varied data that follow diverse patterns and rules. This research focuses on the heterogeneity problem in big data. It introduces the hybrid support vector machine (H-SVM) classifier, which uses the support vector machine as its base. The proposed algorithm uses the heterogeneous Euclidean overlap metric (HEOM) and Euclidean distance to form clusters and classify the data on the basis of ordinal and nominal values. The performance of the proposed classifier is compared with linear SVM, random forest, and k-nearest neighbor. The proposed algorithm attained the highest accuracy among the classifiers compared.
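The description names HEOM but this record does not give its formula. As a rough illustration only, the sketch below follows the commonly cited definition of HEOM: nominal attributes contribute 0 when equal and 1 otherwise, numeric attributes contribute the absolute difference divided by the attribute's range, missing values contribute 1, and the per-attribute terms are combined as in Euclidean distance. The function name `heom`, its parameters, and the sample records are assumptions made for this example and are not taken from the article; the article's H-SVM may apply the metric differently.

```python
import math


def heom(x, y, nominal, ranges):
    """Heterogeneous Euclidean overlap metric between two records.

    x, y     -- sequences of equal length; None marks a missing value
    nominal  -- set of attribute indices treated as nominal (categorical)
    ranges   -- dict mapping each numeric attribute index to (max - min)
                observed in the data set (hypothetical precomputed input)
    """
    total = 0.0
    for i, (a, b) in enumerate(zip(x, y)):
        if a is None or b is None:
            d = 1.0                          # missing value: maximal distance
        elif i in nominal:
            d = 0.0 if a == b else 1.0       # overlap metric for nominal values
        else:
            r = ranges[i]
            d = abs(a - b) / r if r else 0.0  # range-normalised difference
        total += d * d
    return math.sqrt(total)


# Illustrative records: (age, colour, score); attribute 1 is nominal.
record_a = (30, "red", 1.0)
record_b = (40, "blue", 1.0)
attribute_ranges = {0: 20, 2: 4}

distance = heom(record_a, record_b, nominal={1}, ranges=attribute_ranges)
```

Here the age term is 10/20 = 0.5, the colour term is 1, and the score term is 0, so `distance` equals the square root of 1.25. When every attribute is numeric and unnormalised, the same combination step reduces to plain Euclidean distance.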
first_indexed 2024-03-09T18:36:29Z
id doaj.art-b4274f1291904764b0ff86dcfa3d779e
institution Directory Open Access Journal
issn 2073-8994
last_indexed 2024-03-09T18:36:29Z
spelling doaj.art-b4274f1291904764b0ff86dcfa3d779e
Timestamp: 2023-11-24T07:08:49Z
Language: eng
Publisher: MDPI AG
Series: Symmetry
ISSN: 2073-8994
Published: 2022-11-01
Volume: 14
Issue: 11
Article number: 2344
DOI: 10.3390/sym14112344
Title: A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning
Authors: Shafqat Ul Ahsaan (0), Harleen Kaur (1), Ashish Kumar Mourya (2), Sameena Naaz (3)
Affiliations:
(0) Department of Computer Science, Jamia Hamdard University, New Delhi 110062, India
(1) Department of Computer Science, Jamia Hamdard University, New Delhi 110062, India
(2) Department of Computer Science, Jamia Hamdard University, New Delhi 110062, India
(3) Department of Computer Science, Jamia Hamdard University, New Delhi 110062, India
title A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning
topic big data
Euclidean distance
heterogeneity
heterogeneous Euclidean overlap metric (HEOM)
hybrid support vector machine (H-SVM)
k-nearest neighbor (kNN)