Efficient replication of large data objects
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003.
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Language: | eng |
Published: |
Massachusetts Institute of Technology
2006
|
Subjects: | |
Online Access: | http://hdl.handle.net/1721.1/29581 |
_version_ | 1826210079690457088 |
---|---|
author | Fan, Rui, 1977- |
author2 | Nancy A. Lynch. |
author_facet | Nancy A. Lynch. Fan, Rui, 1977- |
author_sort | Fan, Rui, 1977- |
collection | MIT |
description | Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. |
first_indexed | 2024-09-23T14:42:07Z |
format | Thesis |
id | mit-1721.1/29581 |
institution | Massachusetts Institute of Technology |
language | eng |
last_indexed | 2024-09-23T14:42:07Z |
publishDate | 2006 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/295812019-04-12T15:56:09Z Efficient replication of large data objects Fan, Rui, 1977- Nancy A. Lynch. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. Includes bibliographical references (p. 77-78). Replication is an important technique for improving the reliability and scalability of data services. The primary problem encountered in replication is the trade-off between amount of replication, performance, and consistency. A rule of thumb states that any replication algorithm must sacrifice at least one of these criteria. In this thesis, we investigate replicating large data objects, such as files, whose size is large compared to metadata used by the replication algorithm. With this assumption, we present a distributed replication algorithm which simultaneously achieves a high replication factor, nearly optimal performance, and strong data consistency. Furthermore, our algorithm makes only basic assumptions about its environment. Our algorithm works in any asynchronous, reliable message-passing network, without relying on higher level functions such as distributed locking or group communication. Our algorithm is suitable for implementation in both LAN and WAN settings. This thesis is divided into two parts. In the first part, we formally state the assumptions and guarantees of our replication algorithm in terms of its trace properties. We then formally implement our algorithm in the IOA modeling language. We also give rigorous proofs of the algorithm's correctness and its performance analysis. The main idea of our algorithm is to separately maintain copies of the data, and information about the locations of the up-to-date copies. Our algorithm then mostly performs cheap operations on the location information, and avoids expensive operations on the actual data. The second part of this thesis presents two lower bounds on the costs of data replication. The first lower bound gives the minimum number of writes that must occur during a read operation. The second lower bound states that for a certain class of efficient replication algorithms, the replicas must use storage proportional to the maximum number of concurrent writers. The motivation for these lower bounds was certain algorithmic techniques we used in our replication algorithm. The lower bounds suggest that these techniques are necessary. The lower bounds are also of independent interest. by Rui Fan. S.M. 2006-03-24T16:04:29Z 2006-03-24T16:04:29Z 2003 2003 Thesis http://hdl.handle.net/1721.1/29581 52758605 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 78 p. 3363563 bytes 3363369 bytes application/pdf application/pdf application/pdf Massachusetts Institute of Technology |
spellingShingle | Electrical Engineering and Computer Science. Fan, Rui, 1977- Efficient replication of large data objects |
title | Efficient replication of large data objects |
title_full | Efficient replication of large data objects |
title_fullStr | Efficient replication of large data objects |
title_full_unstemmed | Efficient replication of large data objects |
title_short | Efficient replication of large data objects |
title_sort | efficient replication of large data objects |
topic | Electrical Engineering and Computer Science. |
url | http://hdl.handle.net/1721.1/29581 |
work_keys_str_mv | AT fanrui1977 efficientreplicationoflargedataobjects |