Data transmission performance analysis in cloud and grid
The Hadoop Distributed File System (HDFS) and the MapReduce programming model are used for the storage and retrieval of big data. Terabyte-sized files can be easily stored on HDFS and analyzed with MapReduce. HDFS has become more popular in recent years as a key building block of integrated grid s...
Main Authors: | Abdulkarem, Mohammed; Latip, Rohaya |
---|---|
Format: | Article |
Published: | Asian Research Publishing Network, 2015 |
_version_ | 1796974488638717952 |
---|---|
author | Abdulkarem, Mohammed; Latip, Rohaya |
author_sort | Abdulkarem, Mohammed |
collection | UPM |
description | The Hadoop Distributed File System (HDFS) and the MapReduce programming model are used for the storage and retrieval of big data. Terabyte-sized files can be easily stored on HDFS and analyzed with MapReduce. In recent years HDFS has become more popular as a key building block of integrated grid storage solutions in scientific computing. However, because HDFS does not support asynchronous writes, a single stream per GridFTP transfer is widely regarded as the best solution for sustained high-throughput WAN transfers. GridFTP, designed using Globus, is one of the most popular protocols for data transfer in the Grid environment. In this paper, we take on the challenge of integrating Hadoop with the grid by proposing a new framework called Grid-over-Hadoop, which retains the features of Hadoop and uses GridFTP for data transfer. |
first_indexed | 2024-03-06T08:57:48Z |
format | Article |
id | upm.eprints-44237 |
institution | Universiti Putra Malaysia |
last_indexed | 2024-03-06T08:57:48Z |
publishDate | 2015 |
publisher | Asian Research Publishing Network |
record_format | dspace |
spelling | upm.eprints-44237 2023-11-15T08:54:53Z http://psasir.upm.edu.my/id/eprint/44237/ Abdulkarem, Mohammed and Latip, Rohaya (2015) Data transmission performance analysis in cloud and grid. ARPN Journal of Engineering and Applied Sciences, 10 (18). pp. 8451-8457. ISSN 1819-6608. Asian Research Publishing Network. Article, PeerReviewed. https://www.arpnjournals.com/jeas/volume_18_2015.htm |
title | Data transmission performance analysis in cloud and grid |