Data transmission performance analysis in cloud and grid

The Hadoop Distributed File System (HDFS) and the MapReduce programming model are designed for the storage and retrieval of big data. Terabyte-sized files can easily be stored on HDFS and analyzed with MapReduce. In recent years HDFS has become increasingly popular as a key building block of integrated grid s...

Bibliographic Details
Main Authors: Abdulkarem, Mohammed, Latip, Rohaya
Format: Article
Published: Asian Research Publishing Network 2015
collection UPM
description The Hadoop Distributed File System (HDFS) and the MapReduce programming model are designed for the storage and retrieval of big data. Terabyte-sized files can easily be stored on HDFS and analyzed with MapReduce. In recent years HDFS has become increasingly popular as a key building block of integrated grid storage solutions in scientific computing. However, because HDFS cannot support asynchronous writes, it is widely accepted that a single stream per GridFTP transfer is the best solution for sustained high throughput in WAN transfers. GridFTP, developed as part of Globus, is one of the most popular protocols for data transfer in Grid environments. In this paper, we take on the challenge of integrating Hadoop with the grid by proposing a new framework, Grid-over-Hadoop, which retains the features of Hadoop and uses GridFTP for data transfer.
id upm.eprints-44237
institution Universiti Putra Malaysia
Citation: Abdulkarem, Mohammed and Latip, Rohaya (2015) Data transmission performance analysis in cloud and grid. ARPN Journal of Engineering and Applied Sciences, 10 (18). pp. 8451-8457. ISSN 1819-6608 (peer reviewed)
Record URL: http://psasir.upm.edu.my/id/eprint/44237/
Journal URL: https://www.arpnjournals.com/jeas/volume_18_2015.htm