Meta Dynamic Pricing: Transfer Learning Across Experiments

We study the problem of learning shared structure across a sequence of dynamic pricing experiments for related products. We consider a practical formulation in which the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products...

Bibliographic Details
Main Authors: Bastani, Hamsa, Simchi-Levi, David, Zhu, Ruihao
Other Authors: Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Format: Article
Language: English
Published: Institute for Operations Research and the Management Sciences (INFORMS) 2023
Online Access: https://hdl.handle.net/1721.1/148654
_version_ 1826188972250890240
author Bastani, Hamsa
Simchi-Levi, David
Zhu, Ruihao
author2 Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
author_facet Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Bastani, Hamsa
Simchi-Levi, David
Zhu, Ruihao
author_sort Bastani, Hamsa
collection MIT
description We study the problem of learning shared structure across a sequence of dynamic pricing experiments for related products. We consider a practical formulation in which the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products. We then propose a meta dynamic pricing algorithm that learns this prior online while solving a sequence of Thompson sampling pricing experiments (each with horizon T) for N different products. Our algorithm addresses two challenges: (i) balancing the need to learn the prior (meta-exploration) with the need to leverage the estimated prior to achieve good performance (meta-exploitation) and (ii) accounting for uncertainty in the estimated prior by appropriately “widening” the estimated prior as a function of its estimation error. We introduce a novel prior alignment technique to analyze the regret of Thompson sampling with a misspecified prior, which may be of independent interest. Unlike prior-independent approaches, our algorithm’s meta regret grows sublinearly in N, demonstrating that the price of an unknown prior in Thompson sampling can be negligible in experiment-rich environments (large N). Numerical experiments on synthetic and real auto loan data demonstrate that our algorithm significantly speeds up learning compared with prior-independent algorithms. This paper was accepted by George J. Shanthikumar, Management Science Special Section on Data-Driven Prescriptive Analytics.
first_indexed 2024-09-23T08:07:51Z
format Article
id mit-1721.1/148654
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T08:07:51Z
publishDate 2023
publisher Institute for Operations Research and the Management Sciences (INFORMS)
record_format dspace
spelling mit-1721.1/148654 2023-03-22T03:54:28Z Meta Dynamic Pricing: Transfer Learning Across Experiments Bastani, Hamsa Simchi-Levi, David Zhu, Ruihao Massachusetts Institute of Technology. Department of Civil and Environmental Engineering We study the problem of learning shared structure across a sequence of dynamic pricing experiments for related products. We consider a practical formulation in which the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products. We then propose a meta dynamic pricing algorithm that learns this prior online while solving a sequence of Thompson sampling pricing experiments (each with horizon T) for N different products. Our algorithm addresses two challenges: (i) balancing the need to learn the prior (meta-exploration) with the need to leverage the estimated prior to achieve good performance (meta-exploitation) and (ii) accounting for uncertainty in the estimated prior by appropriately “widening” the estimated prior as a function of its estimation error. We introduce a novel prior alignment technique to analyze the regret of Thompson sampling with a misspecified prior, which may be of independent interest. Unlike prior-independent approaches, our algorithm’s meta regret grows sublinearly in N, demonstrating that the price of an unknown prior in Thompson sampling can be negligible in experiment-rich environments (large N). Numerical experiments on synthetic and real auto loan data demonstrate that our algorithm significantly speeds up learning compared with prior-independent algorithms. This paper was accepted by George J. Shanthikumar, Management Science Special Section on Data-Driven Prescriptive Analytics. 2023-03-21T17:11:27Z 2023-03-21T17:11:27Z 2022 2023-03-21T17:08:49Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/148654 Bastani, Hamsa, Simchi-Levi, David and Zhu, Ruihao. 2022. "Meta Dynamic Pricing: Transfer Learning Across Experiments." Management Science, 68 (3). en 10.1287/MNSC.2021.4071 Management Science Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Institute for Operations Research and the Management Sciences (INFORMS) SSRN
spellingShingle Bastani, Hamsa
Simchi-Levi, David
Zhu, Ruihao
Meta Dynamic Pricing: Transfer Learning Across Experiments
title Meta Dynamic Pricing: Transfer Learning Across Experiments
title_full Meta Dynamic Pricing: Transfer Learning Across Experiments
title_fullStr Meta Dynamic Pricing: Transfer Learning Across Experiments
title_full_unstemmed Meta Dynamic Pricing: Transfer Learning Across Experiments
title_short Meta Dynamic Pricing: Transfer Learning Across Experiments
title_sort meta dynamic pricing transfer learning across experiments
url https://hdl.handle.net/1721.1/148654
work_keys_str_mv AT bastanihamsa metadynamicpricingtransferlearningacrossexperiments
AT simchilevidavid metadynamicpricingtransferlearningacrossexperiments
AT zhuruihao metadynamicpricingtransferlearningacrossexperiments
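
Illustrative sketch (not part of the catalog record and not the authors' algorithm): the description above outlines a sequence of Thompson sampling pricing experiments in which a shared prior is estimated across products and then "widened" to reflect its estimation error. The toy Python below shows that general idea under strong simplifying assumptions that do not come from the paper: a scalar linear demand model with a known slope `b`, a Gaussian prior with known variance and unknown mean, a fixed number of initial "meta-exploration" products that use a wide prior, and a hypothetical widening rule `prior_var * (1 + 1 / sqrt(n))`. All function and parameter names are made up for illustration.

```python
import random


def run_experiment(a_true, b, prior_mean, prior_var, noise_var, horizon, rng):
    """Run one Thompson sampling pricing experiment for a single product.

    Toy demand model (an assumption): demand = a_true - b * price + noise,
    with known slope b and unknown intercept a_true.
    Returns (cumulative expected regret, prior-free estimate of a_true).
    """
    post_mean, post_var = prior_mean, prior_var
    best_price = a_true / (2 * b)  # maximizes expected revenue p * (a - b * p)
    regret = 0.0
    observations = []
    for _ in range(horizon):
        # Thompson sampling: draw the intercept from the current posterior
        # and charge the price that would be optimal for that draw.
        a_sample = rng.gauss(post_mean, post_var ** 0.5)
        price = max(a_sample, 0.0) / (2 * b)
        demand = a_true - b * price + rng.gauss(0.0, noise_var ** 0.5)
        # Expected revenue gap to the best price equals b * (p - p*)^2.
        regret += b * (price - best_price) ** 2
        # With known slope, demand + b * price is a noisy view of a_true.
        y = demand + b * price
        # Conjugate Gaussian update of the posterior over the intercept.
        precision = 1.0 / post_var + 1.0 / noise_var
        post_mean = (post_mean / post_var + y / noise_var) / precision
        post_var = 1.0 / precision
        observations.append(y)
    return regret, sum(observations) / len(observations)


def meta_pricing(num_products, horizon, true_prior_mean, prior_var,
                 b=1.0, noise_var=1.0, n_explore=3, seed=0):
    """Price a sequence of products while learning the shared prior mean."""
    rng = random.Random(seed)
    estimates, regrets = [], []
    for i in range(num_products):
        # Each product's intercept is drawn from the shared (hidden) prior.
        a_true = rng.gauss(true_prior_mean, prior_var ** 0.5)
        if i < n_explore or not estimates:
            # Meta-exploration: ignore past products and use a wide prior.
            mean, var = 0.0, 100.0
        else:
            # Meta-exploitation: use the estimated prior mean, with the
            # variance widened by a hypothetical factor that shrinks as
            # more products are observed.
            n = len(estimates)
            mean = sum(estimates) / n
            var = prior_var * (1.0 + 1.0 / n ** 0.5)
        regret, estimate = run_experiment(
            a_true, b, mean, var, noise_var, horizon, rng)
        regrets.append(regret)
        estimates.append(estimate)
    return regrets, estimates
```

Calling `meta_pricing(num_products=50, horizon=50, true_prior_mean=10.0, prior_var=1.0)` returns per-product regrets and intercept estimates. Because the slope is assumed known here, every price is equally informative, so this sketch only shows the mechanics of reusing and widening an estimated prior, not the exploration trade-offs or regret guarantees the abstract refers to.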