A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning
Big data technology has gained attention in all fields, particularly with regard to research and financial institutions. This technology has changed the world tremendously. Researchers and data scientists are currently working on its applicability in different domains such as health care, medicine,...
Main Authors: | Shafqat Ul Ahsaan, Harleen Kaur, Ashish Kumar Mourya, Sameena Naaz |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-11-01 |
Series: | Symmetry |
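Subjects: | big data; Euclidean distance; heterogeneity; heterogeneous Euclidean overlap metric (HEOM); hybrid support vector machine (H-SVM); k-nearest neighbor (kNN) |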
Online Access: | https://www.mdpi.com/2073-8994/14/11/2344 |
_version_ | 1797466406972817408 |
---|---|
author | Shafqat Ul Ahsaan; Harleen Kaur; Ashish Kumar Mourya; Sameena Naaz |
author_facet | Shafqat Ul Ahsaan; Harleen Kaur; Ashish Kumar Mourya; Sameena Naaz |
author_sort | Shafqat Ul Ahsaan |
collection | DOAJ |
description | Big data technology has gained attention in all fields, particularly with regard to research and financial institutions. This technology has changed the world tremendously. Researchers and data scientists are currently working on its applicability in different domains such as health care, medicine, and the stock market, among others. The data being generated at an unexpected pace from multiple sources like social media, health care contexts, and Internet of things have given rise to big data. Management and processing of big data represent a challenge for researchers and data scientists, as there is heterogeneity and ambiguity. Heterogeneity is considered to be an important characteristic of big data. The analysis of heterogeneous data is a very complex task as it involves the compilation, storage, and processing of varied data based on diverse patterns and rules. The proposed research has focused on the heterogeneity problem in big data. This research introduces the hybrid support vector machine (H-SVM) classifier, which uses the support vector machine as a base. In the proposed algorithm, the heterogeneous Euclidean overlap metric (HEOM) and Euclidean distance are introduced to form clusters and classify the data on the basis of ordinal and nominal values. The performance of the proposed learning classifier is compared with linear SVM, random forest, and k-nearest neighbor. The proposed algorithm attained the highest accuracy as compared to other classifiers. |
first_indexed | 2024-03-09T18:36:29Z |
format | Article |
id | doaj.art-b4274f1291904764b0ff86dcfa3d779e |
institution | Directory of Open Access Journals |
issn | 2073-8994 |
language | English |
last_indexed | 2024-03-09T18:36:29Z |
publishDate | 2022-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Symmetry |
spelling | doaj.art-b4274f1291904764b0ff86dcfa3d779e; 2023-11-24T07:08:49Z; eng; MDPI AG; Symmetry; ISSN 2073-8994; 2022-11-01; volume 14, issue 11, article 2344; DOI 10.3390/sym14112344; A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning; Shafqat Ul Ahsaan, Harleen Kaur, Ashish Kumar Mourya, Sameena Naaz (all: Department of Computer Science, Jamia Hamdard University, New Delhi 110062, India); https://www.mdpi.com/2073-8994/14/11/2344; big data; Euclidean distance; heterogeneity; heterogeneous Euclidean overlap metric (HEOM); hybrid support vector machine (H-SVM); k-nearest neighbor (kNN) |
title | A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning |
title_sort | hybrid support vector machine algorithm for big data heterogeneity using machine learning |
topic | big data; Euclidean distance; heterogeneity; heterogeneous Euclidean overlap metric (HEOM); hybrid support vector machine (H-SVM); k-nearest neighbor (kNN) |
url | https://www.mdpi.com/2073-8994/14/11/2344 |
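
Note on HEOM: the description field in this record names the heterogeneous Euclidean overlap metric but does not give its formula. The sketch below follows the commonly cited definition of HEOM (overlap distance for nominal attributes, range-normalized difference for numeric ones, distance 1 when a value is missing). It is an illustration only, not the authors' H-SVM. The function name `heom` and the parameters `nominal` and `ranges` are assumptions introduced here.

```python
import math

def heom(x, y, nominal, ranges):
    """HEOM distance between two records of equal length.

    nominal: set of attribute indices compared with the overlap metric.
    ranges:  dict mapping each numeric attribute index to (max - min)
             of that attribute over the data set.
    None marks a missing value.
    (Hypothetical helper; names are not taken from the article.)
    """
    total = 0.0
    for i, (a, b) in enumerate(zip(x, y)):
        if a is None or b is None:
            d = 1.0                        # missing value: maximal distance
        elif i in nominal:
            d = 0.0 if a == b else 1.0     # overlap metric
        else:
            r = ranges[i]
            d = abs(a - b) / r if r else 0.0   # range-normalized difference
        total += d * d
    return math.sqrt(total)
```

For example, `heom(("red", 10.0), ("blue", 20.0), nominal={0}, ranges={1: 40.0})` combines a nominal mismatch (1) with a numeric difference of 10/40, so mixed-type records can be compared with a single distance.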