Distributed Machine Learning using HDFS and Apache Spark for Big Data Challenges

Hadoop and Apache Spark have become popular frameworks for distributed big data processing. This research aims to configure Hadoop and Spark to train and test on big data using distributed machine learning methods from MLlib, including linear regression and multi-linear regression....

Bibliographic Details
Main Authors: Cahya Indirman M. Didik, Wahyu Wiriasto Giri, Irfan Akbar L. Ahmad S.
Format: Article
Language: English
Published: EDP Sciences 2023-01-01
Series: E3S Web of Conferences
Online Access: https://www.e3s-conferences.org/articles/e3sconf/pdf/2023/102/e3sconf_icimece2023_02058.pdf