Distributed Machine Learning using HDFS and Apache Spark for Big Data Challenges
Hadoop and Apache Spark have become popular frameworks for distributed big data processing. This research aims to configure Hadoop and Spark to train and test models on big data using the distributed machine learning methods in MLlib, including linear regression and multi-linear regression....
| Main Authors: | Cahya Indirman M. Didik, Wahyu Wiriasto Giri, Irfan Akbar L. Ahmad S. |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | EDP Sciences, 2023-01-01 |
| Series: | E3S Web of Conferences |
| Online Access: | https://www.e3s-conferences.org/articles/e3sconf/pdf/2023/102/e3sconf_icimece2023_02058.pdf |
Similar Items
- Big Data Analysis Using Apache Spark MLlib and Hadoop HDFS with Scala and Java
  by: Hoger Khayrolla Omar, et al.
  Published: (2019-05-01)
- A hierarchical indexing strategy for optimizing Apache Spark with HDFS to efficiently query big geospatial raster data
  by: Fei Hu, et al.
  Published: (2020-03-01)
- Big Data in metagenomics: Apache Spark vs MPI.
  by: José M Abuín, et al.
  Published: (2020-01-01)
- Big Data Analytics for the ATLAS EventIndex Project with Apache Spark
  by: Álvaro Fernández Casaní, et al.
  Published: (2023-01-01)
- Mobile big data analytics using deep learning and Apache Spark
  by: Niyato, Dusit, et al.
  Published: (2016)