In-database learning with sparse tensors

Bibliographic Details
Main Authors: Abo Khamis, M, Ngo, H, Nguyen, X, Olteanu, D, Schleich, M
Format: Conference item
Published: Association for Computing Machinery 2018
collection OXFORD
description In-database analytics is of great practical importance as it avoids the costly repeated loop data scientists have to deal with on a daily basis: select features, export the data, convert data format, train models using an external tool, reimport the parameters. It is also a fertile ground of theoretically fundamental and challenging problems at the intersection of relational and statistical data models. This paper introduces a unified framework for training and evaluating a class of statistical learning models inside a relational database. This class includes ridge linear regression, polynomial regression, factorization machines, and principal component analysis. We show that, by synergizing key tools from relational database theory such as schema information, query structure, recent advances in query evaluation algorithms, and from linear algebra such as various tensor and matrix operations, one can formulate in-database learning problems and design efficient algorithms to solve them. The algorithms and models proposed in the paper have already been implemented inside the LogicBlox database engine and used in retail-planning and forecasting applications, with significant performance benefits over out-of-database solutions that require the costly data-export loop.
first_indexed 2024-03-06T20:20:08Z
id oxford-uuid:2d852e0d-889d-46fe-890e-b1ac5687c798
institution University of Oxford
last_indexed 2024-03-06T20:20:08Z
record_format dspace
title In-database learning with sparse tensors