Essays on Martech: Learning to Design, Deliver, and Diffuse Interventions

Bibliographic Details
Main Author: Yang, Jeremy (Zhen)
Other Authors: Zhang, Juanjuan
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/138931
https://orcid.org/0000-0001-8639-5493
Description
Summary: This dissertation consists of three chapters on leveraging machine learning to better design, deliver, and diffuse interventions, with a focus on advertising and targeting.

Chapter one develops an algorithm to predict the causal effect of influencer video advertising on product sales. A summary statistic, motion-score, or m-score, is proposed to capture the extent to which a product is advertised in the most engaging parts of a video. Pixel-level product placement is located with an object detection algorithm, and pixel-level engagement is estimated as a saliency map by fine-tuning a deep 3D convolutional neural network on video-level engagement data. M-score is then defined as the pixel-level engagement-weighted advertising intensity of a video. The algorithm is constructed and evaluated with influencer video ads on TikTok. Causal effects of video ads on product sales are identified by exploiting variation in video posting time. Videos with a higher m-score indeed lift sales more. This effect is sizable, robust, and more pronounced among impulsive, hedonic, or inexpensive products. The mechanism can be traced to influencers’ incentives to promote themselves rather than the product. The chapter also discusses how various stakeholders in entertainment commerce can use m-score in a scalable way to optimize content, align incentives, and improve efficiency.

Chapter two proposes a method to optimize a targeting policy that maximizes an outcome observed only in the long term. Traditionally, this requires either delaying decisions until the outcome is observed or relying on simple short-term proxies for the long-term outcome. The method builds on the statistical surrogacy and off-policy learning literature to first impute the missing long-term outcomes and then approximate the optimal targeting policy on the imputed outcomes via a doubly robust approach. It is applied in large-scale proactive churn management experiments at The Boston Globe, targeting optimal discounts to its digital subscribers to maximize their long-term revenue. It is shown that the conditions for valid average treatment effect estimation with imputed outcomes are also sufficient for valid policy evaluation and optimization; furthermore, these conditions can be somewhat relaxed for policy optimization. The method is also validated empirically by comparing it with a policy learned on the ground-truth long-term outcomes; the two policies are shown to be statistically indistinguishable. It also outperforms a policy learned on short-term proxies for the long-term outcome.

Chapter three explores how network embeddings can be applied to the study of diffusion. Two sets of questions are investigated using a combination of real and simulated datasets. First, can node embeddings predict adoption decisions better than standard centrality-based summary statistics? Second, can node embeddings be used as control variables to reduce the bias in peer effect estimation? Some initial results and future work are discussed.
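
As a rough illustration of the m-score idea from chapter one, the sketch below computes an engagement-weighted advertising intensity from per-frame saliency maps and product masks. The abstract does not give the exact formula, so the normalization used here (saliency falling on product pixels divided by total saliency over the video) is an assumption, and the function name `m_score` and the toy inputs are hypothetical. In the dissertation the saliency maps come from a fine-tuned 3D convolutional network and the masks from an object detector; here both are simply supplied as nested lists.

```python
from typing import List

# One frame is a 2D grid of per-pixel values (rows of floats).
Frame = List[List[float]]


def m_score(saliency: List[Frame], product_mask: List[Frame]) -> float:
    """Share of predicted engagement that falls on product pixels.

    Assumption: dividing by total saliency is an illustrative choice,
    not the formula from the dissertation.
    """
    weighted = 0.0  # saliency that lands on product pixels
    total = 0.0     # all saliency in the video
    for sal_frame, mask_frame in zip(saliency, product_mask):
        for sal_row, mask_row in zip(sal_frame, mask_frame):
            for s, m in zip(sal_row, mask_row):
                weighted += s * m
                total += s
    return weighted / total if total > 0 else 0.0


# Toy video with two 2x2 frames (made-up numbers).
saliency = [
    [[0.1, 0.4], [0.1, 0.4]],      # engagement concentrated on the right
    [[0.25, 0.25], [0.25, 0.25]],  # engagement spread evenly
]
product_mask = [
    [[0, 1], [0, 1]],  # product occupies the engaging right column
    [[1, 0], [0, 0]],  # product occupies one corner pixel
]
score = m_score(saliency, product_mask)
print(f"m-score: {score:.3f}")
```

Under this assumed normalization the score lies between 0 and 1 when saliency values are non-negative and masks are 0 or 1: it is higher when the product sits where predicted engagement concentrates, and it equals 1 only if all predicted engagement falls on product pixels.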