Monetary efficiency in the dynamic public cloud environments

Bibliographic Details
Main Author: Chen, Changbing
Other Authors: Lee Bu Sung
Format: Thesis
Language: English
Published: 2015
Subjects:
Online Access: http://hdl.handle.net/10356/63270
Description
Summary: As the popularity of cloud computing grows, public cloud providers (e.g., Amazon AWS) offer many cloud services to users. Infrastructure-as-a-Service (IaaS) is one such service, offering users considerable elasticity and flexibility in running their systems in the cloud. As a result, more and more users are willing to deploy their systems to the cloud. These systems run at Internet scale (i.e., across a set of networked servers). Users pay for what they consume according to pricing schemes predefined by the cloud providers. As the systems evolve, the monetary cost of running them has become too high to ignore. Much recent research has focused on cloud pricing and on cloud resource management and allocation, while less work has addressed the monetary efficiency (i.e., the number of jobs completed per dollar) of running large-scale systems in dynamic cloud environments, especially when system failures may occur with higher probability and in an unpredictable manner. In this thesis, we seek to address the problem of how to improve the monetary efficiency of running large-scale systems in the cloud. MapReduce is a scalable, fault-tolerant, parallel and distributed programming framework that has dominated the area of big data analytics and processing. Hadoop, one of the most prevalent open-source MapReduce implementations, has been adapted to run in cloud environments (e.g., Amazon EC2). Thus, in this thesis, we first conduct experiments on the dynamics of public cloud environments and on the deployment of a MapReduce system in the cloud. Second, we take Hadoop running on Amazon EC2 as a case study for improving the monetary efficiency of the Hadoop system in the cloud. In particular, we conduct a detailed study of improving monetary efficiency by leveraging spot instances.
We take a cloud broker’s perspective and propose MaxME, a price-aware virtual machine auto-scaling algorithm with migration, to improve Hadoop’s monetary efficiency on Amazon EC2. We evaluate the proposed algorithm through simulation using Amazon EC2 spot price traces and real workload traces. Compared with the baseline algorithms, our approach improves monetary efficiency by up to 9.3x with at most 20% performance degradation.
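
To make the metric concrete, the following is a minimal Python sketch of how monetary efficiency (jobs completed per dollar, as defined in the summary) and the two reported comparison figures could be computed. It is not the MaxME algorithm. The `RunSummary` type, the function names, the use of makespan as the performance measure, and all numbers are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical summary of one cluster run (not taken from the thesis)."""
    jobs_completed: int
    total_cost_usd: float
    makespan_hours: float


def monetary_efficiency(run: RunSummary) -> float:
    """Jobs completed per dollar spent."""
    if run.total_cost_usd <= 0:
        raise ValueError("total_cost_usd must be positive")
    return run.jobs_completed / run.total_cost_usd


def improvement_factor(candidate: RunSummary, baseline: RunSummary) -> float:
    """How many times more jobs per dollar the candidate achieves."""
    return monetary_efficiency(candidate) / monetary_efficiency(baseline)


def performance_degradation(candidate: RunSummary, baseline: RunSummary) -> float:
    """Relative increase in makespan (assumed here as the performance measure)."""
    if baseline.makespan_hours <= 0:
        raise ValueError("baseline makespan_hours must be positive")
    return (candidate.makespan_hours - baseline.makespan_hours) / baseline.makespan_hours


# Illustrative, made-up numbers: the same workload on two pricing options.
on_demand = RunSummary(jobs_completed=100, total_cost_usd=50.0, makespan_hours=10.0)
spot = RunSummary(jobs_completed=100, total_cost_usd=10.0, makespan_hours=12.0)

factor = improvement_factor(spot, on_demand)
slowdown = performance_degradation(spot, on_demand)
print(f"improvement: {factor:.1f}x, degradation: {slowdown:.0%}")
```

Under these assumptions, a cheaper run that takes somewhat longer shows up as a higher improvement factor alongside a nonzero degradation, which is the kind of trade-off the reported results describe.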