Tuplex: robust, efficient analytics when Python rules

© 2019 VLDB Endowment. Spark became the de facto industry standard as an execution engine for data preparation, cleaning, distributed machine learning, streaming, and warehousing over raw data. However, with the success of Python the landscape is shifting again; there is a strong demand for tools whi...

Bibliographic Details
Main Authors:	Spiegelberg, Leonhard F; Kraska, Tim
Format:	Article
Language:	English
Published:	VLDB Endowment 2021
Online Access:	https://hdl.handle.net/1721.1/132284

author	Spiegelberg, Leonhard F; Kraska, Tim
collection	MIT
description	© 2019 VLDB Endowment. Spark became the de facto industry standard as an execution engine for data preparation, cleaning, distributed machine learning, streaming, and warehousing over raw data. However, with the success of Python the landscape is shifting again; there is a strong demand for tools that integrate better with the Python landscape and do not suffer from Spark's impedance mismatch. In this paper, we demonstrate Tuplex (short for tuples and exceptions), a Python-native data preparation framework that allows users to develop and deploy pipelines faster and more robustly while providing bare-metal execution times through code compilation whenever possible.
first_indexed	2024-09-23T13:12:01Z
format	Article
id	mit-1721.1/132284
institution	Massachusetts Institute of Technology
language	English
last_indexed	2024-09-23T13:12:01Z
publishDate	2021
publisher	VLDB Endowment
record_format	dspace
spelling	mit-1721.1/1322842021-09-21T03:31:07Z Tuplex: robust, efficient analytics when Python rules Spiegelberg, Leonhard F Kraska, Tim © 2019 VLDB Endowment. Spark became the de facto industry standard as an execution engine for data preparation, cleaning, distributed machine learning, streaming, and warehousing over raw data. However, with the success of Python the landscape is shifting again; there is a strong demand for tools which better integrate with the Python landscape and do not have the impedance mismatch like Spark. In this paper, we demonstrate Tuplex (short for tuples and exceptions), a Python-native data preparation framework that allows users to develop and deploy pipelines faster and more robustly while providing bare-metal execution times through code compilation whenever possible. 2021-09-20T18:21:39Z 2021-09-20T18:21:39Z 2021-01-11T16:52:56Z Article http://purl.org/eprint/type/ConferencePaper https://hdl.handle.net/1721.1/132284 en 10.14778/3352063.3352109 Proceedings of the VLDB Endowment Creative Commons Attribution-NonCommercial-NoDerivs License http://creativecommons.org/licenses/by-nc-nd/4.0/ application/pdf VLDB Endowment VLDB Endowment
title	Tuplex: robust, efficient analytics when Python rules
url	https://hdl.handle.net/1721.1/132284

Similar Items