DARQ Matter Binds Everything: Performant and Composable Cloud Programming via Resilient Steps

Providing strong fault-tolerant guarantees for the modern cloud is difficult, as application developers must coordinate between independent stateful services and ephemeral compute, and handle various failure-induced anomalies. We propose Composable Resilient Steps (CReSt), a new abstraction for re...


Bibliographic Details
Main Authors: Li, Tianyu; Chandramouli, Badrish; Burckhardt, Sebastian; Madden, Samuel
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: ACM 2023
Online Access: https://hdl.handle.net/1721.1/151085
collection MIT
description Providing strong fault-tolerant guarantees for the modern cloud is difficult, as application developers must coordinate between independent stateful services and ephemeral compute, and handle various failure-induced anomalies. We propose Composable Resilient Steps (CReSt), a new abstraction for resilient cloud applications. CReSt uses fault-tolerant steps as its core building block, which allows participants to receive, process, and send messages as a single uninterruptible atomic unit. Composability and reliability are orthogonally achieved by reusable CReSt implementations, for example, leveraging reliable message queues. Thus, CReSt application builders focus solely on translating application logic into steps, and infrastructure builders focus on efficient CReSt implementations. We propose one such implementation, called DARQ (for Deduplicated Asynchronously Recoverable Queues). At its core, DARQ is a storage service that encapsulates CReSt participant state and enforces CReSt semantics; developers attach ephemeral compute nodes to DARQ instances to implement stateful distributed components. Services built with DARQ are resilient by construction, and CReSt-compatible services naturally compose without loss of resilience. For performance, we propose a novel speculative execution scheme to execute CReSt steps without waiting for message persistence in DARQ, effectively eliding cloud persistence overheads; our scheme maintains CReSt’s fault-tolerance guarantees and automatically restores consistent system state upon failure. We showcase the generality of CReSt and DARQ using two applications: cloud streaming and workflow processing. Experiments show that DARQ is able to achieve extremely low latency and high throughput across these use cases, often beating state-of-the-art customized solutions.
first_indexed 2024-09-23T08:51:56Z
format Article
id mit-1721.1/151085
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T08:51:56Z
publishDate 2023
publisher ACM
record_format dspace
spelling mit-1721.1/151085 2024-01-23T18:48:29Z
2023-07-11T17:36:59Z 2023-07-11T17:36:59Z 2023-06-20 2023-07-01T08:00:03Z
Article http://purl.org/eprint/type/JournalArticle
2836-6573
https://hdl.handle.net/1721.1/151085
Li, Tianyu, Chandramouli, Badrish, Burckhardt, Sebastian and Madden, Samuel. 2023. "DARQ Matter Binds Everything: Performant and Composable Cloud Programming via Resilient Steps." Proceedings of the ACM on Management of Data, 1 (2).
PUBLISHER_POLICY en
https://doi.org/10.1145/3589262
Proceedings of the ACM on Management of Data
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
The author(s) application/pdf ACM Association for Computing Machinery