On the weakest failure detector ever

Many problems in distributed computing are impossible to solve when no information about process failures is available. It is common to ask what information about failures is necessary and sufficient to circumvent some specific impossibility, e.g., consensus, atomic commit, mutual exclusion, etc. Th...

Full description

Bibliographic Details
Main Authors:	Kuznetsov, Petr, Herlihy, Maurice, Newport, Calvin Charles, Lynch, Nancy Ann, Guerraoui, Rachid
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format:	Article
Language:	en_US
Published:	Springer Berlin Heidelberg 2010
Online Access:	http://hdl.handle.net/1721.1/51039 https://orcid.org/0000-0003-3045-265X

_version_	1811080887042834432
author	Kuznetsov, Petr Herlihy, Maurice Newport, Calvin Charles Lynch, Nancy Ann Guerraoui, Rachid
author2	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Kuznetsov, Petr Herlihy, Maurice Newport, Calvin Charles Lynch, Nancy Ann Guerraoui, Rachid
author_sort	Kuznetsov, Petr
collection	MIT
description	Many problems in distributed computing are impossible to solve when no information about process failures is available. It is common to ask what information about failures is necessary and sufficient to circumvent some specific impossibility, e.g., consensus, atomic commit, mutual exclusion, etc. This paper asks what information about failures is necessary to circumvent any impossibility and sufficient to circumvent some impossibility. In other words, what is the minimal yet non-trivial failure information. We present an abstraction, denoted $${\Upsilon}$$ , that provides very little information about failures. In every run of the distributed system, $${\Upsilon}$$ eventually informs the processes that some set of processes in the system cannot be the set of correct processes in that run. Although seemingly weak, for it might provide random information for an arbitrarily long period of time, and it eventually excludes only one set of processes (among many) that is not the set of correct processes in the current run, $${\Upsilon}$$ still captures non-trivial failure information. We show that $${\Upsilon}$$ is sufficient to circumvent the fundamental wait-free set-agreement impossibility. While doing so, (a) we disprove previous conjectures about the weakest failure detector to solve set-agreement and (b) we prove that solving set-agreement with registers is strictly weaker than solving n + 1-process consensus using n-process consensus. We show that $${\Upsilon}$$ is the weakest stable non-trivial failure detector: any stable failure detector that circumvents some wait-free impossibility provides at least as much information about failures as $${\Upsilon}$$ does. Our results are generalized, from the wait-free to the f-resilient case, through an abstraction $${\Upsilon^f}$$ that we introduce and prove minimal to solve any problem that cannot be solved in an f-resilient manner, and yet sufficient to solve f-resilient f-set-agreement.
first_indexed	2024-09-23T11:38:27Z
format	Article
id	mit-1721.1/51039
institution	Massachusetts Institute of Technology
language	en_US
last_indexed	2024-09-23T11:38:27Z
publishDate	2010
publisher	Springer Berlin Heidelberg
record_format	dspace
spelling	mit-1721.1/510392022-10-01T04:59:12Z On the weakest failure detector ever Kuznetsov, Petr Herlihy, Maurice Newport, Calvin Charles Lynch, Nancy Ann Guerraoui, Rachid Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Lynch, Nancy Ann Lynch, Nancy Ann Guerraoui, Rachid Newport, Calvin Charles Many problems in distributed computing are impossible to solve when no information about process failures is available. It is common to ask what information about failures is necessary and sufficient to circumvent some specific impossibility, e.g., consensus, atomic commit, mutual exclusion, etc. This paper asks what information about failures is necessary to circumvent any impossibility and sufficient to circumvent some impossibility. In other words, what is the minimal yet non-trivial failure information. We present an abstraction, denoted $${\Upsilon}$$ , that provides very little information about failures. In every run of the distributed system, $${\Upsilon}$$ eventually informs the processes that some set of processes in the system cannot be the set of correct processes in that run. Although seemingly weak, for it might provide random information for an arbitrarily long period of time, and it eventually excludes only one set of processes (among many) that is not the set of correct processes in the current run, $${\Upsilon}$$ still captures non-trivial failure information. We show that $${\Upsilon}$$ is sufficient to circumvent the fundamental wait-free set-agreement impossibility. While doing so, (a) we disprove previous conjectures about the weakest failure detector to solve set-agreement and (b) we prove that solving set-agreement with registers is strictly weaker than solving n + 1-process consensus using n-process consensus. We show that $${\Upsilon}$$ is the weakest stable non-trivial failure detector: any stable failure detector that circumvents some wait-free impossibility provides at least as much information about failures as $${\Upsilon}$$ does. Our results are generalized, from the wait-free to the f-resilient case, through an abstraction $${\Upsilon^f}$$ that we introduce and prove minimal to solve any problem that cannot be solved in an f-resilient manner, and yet sufficient to solve f-resilient f-set-agreement. 2010-01-29T18:51:35Z 2010-01-29T18:51:35Z 2009-01 2008-11 Article http://purl.org/eprint/type/SubmittedJournalArticle 1432-0452 0178-2770 http://hdl.handle.net/1721.1/51039 Guerraoui, Rachid et al. “On the weakest failure detector ever.” Distributed Computing 21.5 (2009): 353-366. https://orcid.org/0000-0003-3045-265X en_US http://dx.doi.org/10.1007/s00446-009-0079-3 Distributed Computing Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf Springer Berlin Heidelberg Joanne Hanley
spellingShingle	Kuznetsov, Petr Herlihy, Maurice Newport, Calvin Charles Lynch, Nancy Ann Guerraoui, Rachid On the weakest failure detector ever
title	On the weakest failure detector ever
title_full	On the weakest failure detector ever
title_fullStr	On the weakest failure detector ever
title_full_unstemmed	On the weakest failure detector ever
title_short	On the weakest failure detector ever
title_sort	on the weakest failure detector ever
url	http://hdl.handle.net/1721.1/51039 https://orcid.org/0000-0003-3045-265X
work_keys_str_mv	AT kuznetsovpetr ontheweakestfailuredetectorever AT herlihymaurice ontheweakestfailuredetectorever AT newportcalvincharles ontheweakestfailuredetectorever AT lynchnancyann ontheweakestfailuredetectorever AT guerraouirachid ontheweakestfailuredetectorever

On the weakest failure detector ever

Similar Items