Learning Preferences with Side Information

Bibliographic Details
Main Authors: Farias, Vivek F; Li, Andrew A
Other Authors: Sloan School of Management
Format: Article
Language: English
Published: Institute for Operations Research and the Management Sciences (INFORMS) 2021
Online Access: https://hdl.handle.net/1721.1/136395
Description
Summary: Product and content personalization is now ubiquitous in e-commerce. There are typically not enough available transactional data for this task. As such, companies today seek to use a variety of information on the interactions between a product and a customer to drive personalization decisions. We formalize this problem as one of recovering a large-scale matrix with side information in the form of additional matrices of conforming dimension. Viewing the matrix we seek to recover and the side information we have as slices of a tensor, we consider the problem of slice recovery, which is to recover specific slices of “simple” tensors from noisy observations of the entire tensor. We propose a definition of simplicity that on the one hand elegantly generalizes a standard generative model for our motivating problem and on the other hand subsumes low-rank tensors for a variety of existing definitions of tensor rank. We provide an efficient algorithm for slice recovery that is practical for massive data sets and provides a significant performance improvement over state-of-the-art incumbent approaches to tensor recovery. Furthermore, we establish near-optimal recovery guarantees that, in an important regime, represent an order improvement over the best available results for this problem. Experiments on data from a music streaming service demonstrate the performance and scalability of our algorithm.
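
Illustration: the slice-recovery setup described in the summary can be sketched in a few lines of NumPy. This is not the authors' algorithm, which the record does not describe. It is a minimal stand-in that assumes one possible notion of "simple": every slice shares the same low-dimensional row and column factors. The function names (`make_tensor`, `recover_slice`), the dimensions, the rank `r`, and the noise level are all hypothetical choices made for the example.

```python
import numpy as np


def make_tensor(m, n, k, r, noise, rng):
    """Build k slices of size m x n that share row/column factors (an assumed model)."""
    U = rng.standard_normal((m, r))   # shared customer-side factors
    V = rng.standard_normal((n, r))   # shared product-side factors
    W = rng.standard_normal((k, r))   # per-slice weights on each factor
    truth = np.einsum("ir,jr,kr->kij", U, V, W)            # shape (k, m, n)
    noisy = truth + noise * rng.standard_normal(truth.shape)
    return truth, noisy


def recover_slice(noisy, target, r):
    """Denoise one slice using subspaces estimated from ALL slices.

    Stand-in technique (not the paper's method): estimate the shared column
    and row spaces from the two unfoldings of the tensor, then project the
    target slice onto them.
    """
    # Unfoldings: slices side by side (m x k*n) and transposed slices (n x k*m).
    col_unfold = np.concatenate(list(noisy), axis=1)
    row_unfold = np.concatenate([s.T for s in noisy], axis=1)
    U_hat = np.linalg.svd(col_unfold, full_matrices=False)[0][:, :r]
    V_hat = np.linalg.svd(row_unfold, full_matrices=False)[0][:, :r]
    # Two-sided projection of the noisy target slice.
    return U_hat @ (U_hat.T @ noisy[target] @ V_hat) @ V_hat.T


rng = np.random.default_rng(0)
truth, noisy = make_tensor(m=60, n=60, k=8, r=3, noise=1.0, rng=rng)
estimate = recover_slice(noisy, target=0, r=3)

err_noisy = np.linalg.norm(noisy[0] - truth[0])
err_estimate = np.linalg.norm(estimate - truth[0])
print(f"error of raw noisy slice:   {err_noisy:.2f}")
print(f"error of recovered slice:   {err_estimate:.2f}")
```

Here slice 0 plays the role of the transaction matrix to be recovered, and the remaining slices play the role of side information: they contribute only through the subspace estimates, which is one simple way that additional interaction matrices can help when a single slice is too noisy on its own.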