Performance Optimization System for Hadoop and Spark Frameworks
Optimizing the processing of large-scale data sets depends on the technologies and methods used. The MapReduce model, implemented on Apache Hadoop or Spark, splits a large data set into blocks distributed across several machines. Data compression reduces data size and the transfer time between disk...
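As a minimal illustration of the two ideas in the abstract, the sketch below uses Spark's Scala API to run a MapReduce-style word count over a partitioned input file and to write the result compressed. The settings `spark.io.compression.codec` and `org.apache.hadoop.io.compress.GzipCodec` are real Spark/Hadoop options; the input/output paths and the local master are hypothetical placeholders, not taken from the article.

```scala
import org.apache.spark.sql.SparkSession

object MapReduceCompressionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("MapReduceCompressionSketch")
      .master("local[*]") // assumption: local run for illustration only
      // Compress shuffled/spilled data; "snappy" is one of Spark's built-in codecs
      .config("spark.io.compression.codec", "snappy")
      .getOrCreate()
    val sc = spark.sparkContext

    // MapReduce-style word count: the input is read as a set of partitions
    // (blocks) that Spark distributes across the machines of the cluster
    val counts = sc.textFile("hdfs:///data/input.txt") // hypothetical path
      .flatMap(_.split("\\s+"))  // map: emit individual words
      .map(word => (word, 1))    // map: key-value pairs
      .reduceByKey(_ + _)        // reduce: sum counts per word

    // Write the output Gzip-compressed to cut disk size and transfer time
    counts.saveAsTextFile(
      "hdfs:///data/output", // hypothetical path
      classOf[org.apache.hadoop.io.compress.GzipCodec])

    spark.stop()
  }
}
```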
| Main Authors: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Sciendo, 2020-12-01 |
| Series: | Cybernetics and Information Technologies |
| Subjects: | |
| Online Access: | https://doi.org/10.2478/cait-2020-0056 |