Learning from Commerce Data: from Theory to Practice

Bibliographic Details
Main Author: Peng, Tianyi
Other Authors: Farias, Vivek F.
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access: https://hdl.handle.net/1721.1/153093
Description
Summary: Enterprises can anticipate substantial benefits from the vast potential of commerce data. However, deploying analytics platforms to extract value from such data poses a significant challenge for many organizations. One major obstacle lies in the ability to effectively learn from commerce data within an environment characterized by noise, non-stationarity, intricate intervention patterns, and the occurrence of operational issues and anomalies at any given moment. Motivated by these challenges, we address the problem of causal learning in panels with general intervention patterns that may depend on historical data. In this thesis, we present a novel and nearly complete solution to this problem that allows for the rate-optimal recovery of treatment effects. Our work generalizes the outcome model of the difference-in-differences paradigm and expands the applicability of the synthetic-control paradigm. In doing so, we provide a novel de-biasing analysis that addresses low-rank matrix regression with non-random intervention patterns and noise, a non-trivial feature of independent interest. This thesis also addresses the challenges of anomaly detection and uncertainty quantification for low-rank matrices with missing entries and general noise, thus enabling robust learning from corrupted data. On the practical side, our algorithm forms the core of TestOps, a pioneering testing platform co-developed with a USD 100 billion beverage company. TestOps solves a long-standing problem of learning from physical retail experiments, leading to a substantial increase in revenue of millions of dollars per month in Mexico alone, and is being rolled out globally. Our framework has also sparked ongoing collaborations in healthcare, finance, and sustainability, extending its applicability beyond retailing. The outcomes have been consolidated and documented in an open-source Python package available at: https://github.com/TianyiPeng/causaltensor.
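For orientation, the classical difference-in-differences outcome model that the summary says the thesis generalizes can be sketched as a two-way fixed-effects regression on a panel. This is a minimal illustration under stated assumptions, not the thesis's algorithm and not the API of the causaltensor package: the function name twfe_did, the constant treatment effect, and the synthetic data are all hypothetical and chosen only for this example.

```python
import numpy as np

def twfe_did(O, Z):
    """Estimate a constant effect tau assuming O[i, t] = a_i + b_t + tau * Z[i, t] + noise.

    O: (n_units, n_periods) outcome panel.
    Z: (n_units, n_periods) binary treatment pattern.
    Hypothetical helper for illustration only.
    """
    n, t = O.shape
    unit = np.kron(np.eye(n), np.ones((t, 1)))   # unit fixed-effect dummies
    time = np.kron(np.ones((n, 1)), np.eye(t))   # time fixed-effect dummies
    X = np.hstack([unit, time, Z.reshape(-1, 1)])
    # The dummies are collinear, so lstsq returns a minimum-norm solution;
    # the coefficient on Z is still uniquely determined.
    coef, *_ = np.linalg.lstsq(X, O.reshape(-1), rcond=None)
    return coef[-1]

# Synthetic panel (assumed data): 50 units, 40 periods, true effect 2.0.
rng = np.random.default_rng(0)
n_units, n_periods, tau = 50, 40, 2.0
a = rng.normal(size=(n_units, 1))      # unit effects
b = rng.normal(size=(1, n_periods))    # time effects
Z = np.zeros((n_units, n_periods))
Z[:25, 20:] = 1.0                      # first 25 units treated from period 20 on
O = a + b + tau * Z + 0.1 * rng.normal(size=(n_units, n_periods))

tau_hat = twfe_did(O, Z)
print(f"estimated effect: {tau_hat:.3f}")
```

The additive model a_i + b_t is a rank-two special case of a low-rank outcome matrix, which is one way to read the summary's claim of generalizing this paradigm. The sketch also assumes a treatment pattern fixed in advance, whereas the thesis targets patterns that may depend on historical data.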