Meta Dynamic Pricing: Transfer Learning Across Experiments
We study the problem of learning shared structure across a sequence of dynamic pricing experiments for related products. We consider a practical formulation in which the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products...
Main Authors: | Bastani, Hamsa; Simchi-Levi, David; Zhu, Ruihao |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Civil and Environmental Engineering |
Format: | Article |
Language: | English |
Published: | Institute for Operations Research and the Management Sciences (INFORMS), 2023 |
Online Access: | https://hdl.handle.net/1721.1/148654 |
_version_ | 1826188972250890240 |
---|---|
author | Bastani, Hamsa; Simchi-Levi, David; Zhu, Ruihao |
author2 | Massachusetts Institute of Technology. Department of Civil and Environmental Engineering |
author_facet | Massachusetts Institute of Technology. Department of Civil and Environmental Engineering; Bastani, Hamsa; Simchi-Levi, David; Zhu, Ruihao |
author_sort | Bastani, Hamsa |
collection | MIT |
description | We study the problem of learning shared structure across a sequence of dynamic pricing experiments for related products. We consider a practical formulation in which the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products. We then propose a meta dynamic pricing algorithm that learns this prior online while solving a sequence of Thompson sampling pricing experiments (each with horizon T) for N different products. Our algorithm addresses two challenges: (i) balancing the need to learn the prior (meta-exploration) with the need to leverage the estimated prior to achieve good performance (meta-exploitation) and (ii) accounting for uncertainty in the estimated prior by appropriately “widening” the estimated prior as a function of its estimation error. We introduce a novel prior alignment technique to analyze the regret of Thompson sampling with a misspecified prior, which may be of independent interest. Unlike prior-independent approaches, our algorithm’s meta regret grows sublinearly in N, demonstrating that the price of an unknown prior in Thompson sampling can be negligible in experiment-rich environments (large N). Numerical experiments on synthetic and real auto loan data demonstrate that our algorithm significantly speeds up learning compared with prior-independent algorithms. This paper was accepted by George J. Shanthikumar, Management Science Special Section on Data-Driven Prescriptive Analytics. |
first_indexed | 2024-09-23T08:07:51Z |
format | Article |
id | mit-1721.1/148654 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T08:07:51Z |
publishDate | 2023 |
publisher | Institute for Operations Research and the Management Sciences (INFORMS) |
record_format | dspace |
spelling | mit-1721.1/148654 2023-03-22T03:54:28Z Meta Dynamic Pricing: Transfer Learning Across Experiments Bastani, Hamsa; Simchi-Levi, David; Zhu, Ruihao Massachusetts Institute of Technology. Department of Civil and Environmental Engineering 2023-03-21T17:11:27Z 2023-03-21T17:11:27Z 2022 2023-03-21T17:08:49Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/148654 Bastani, Hamsa, Simchi-Levi, David and Zhu, Ruihao. 2022. "Meta Dynamic Pricing: Transfer Learning Across Experiments." Management Science, 68 (3). en 10.1287/MNSC.2021.4071 Management Science Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Institute for Operations Research and the Management Sciences (INFORMS) SSRN |
spellingShingle | Bastani, Hamsa; Simchi-Levi, David; Zhu, Ruihao Meta Dynamic Pricing: Transfer Learning Across Experiments |
title | Meta Dynamic Pricing: Transfer Learning Across Experiments |
title_sort | meta dynamic pricing transfer learning across experiments |
url | https://hdl.handle.net/1721.1/148654 |
work_keys_str_mv | AT bastanihamsa metadynamicpricingtransferlearningacrossexperiments AT simchilevidavid metadynamicpricingtransferlearningacrossexperiments AT zhuruihao metadynamicpricingtransferlearningacrossexperiments |
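
The description in this record mentions running Thompson sampling pricing experiments with an estimated prior that is "widened" to account for estimation error. The following is a minimal sketch of one such experiment, not the authors' algorithm. It assumes a linear demand model `demand = a + b * price + noise` with known noise level and a Gaussian prior over `(a, b)`; the scalar inflation in `widen_prior`, the price bounds, and all function names are hypothetical choices made for illustration. The meta-level step of estimating the prior across N products is omitted.

```python
import math
import random


def widen_prior(prior_cov, widening):
    """Inflate a 2x2 prior covariance by a scalar factor (illustrative rule only)."""
    return [[widening * v for v in row] for row in prior_cov]


def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def sample_gaussian_2d(mean, cov, rng):
    """Draw from N(mean, cov) using a hand-written 2x2 Cholesky factor."""
    l00 = math.sqrt(cov[0][0])
    l10 = cov[1][0] / l00
    l11 = math.sqrt(max(cov[1][1] - l10 * l10, 0.0))
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [mean[0] + l00 * z0, mean[1] + l10 * z0 + l11 * z1]


def best_price(a, b, bounds):
    """Maximize sampled revenue p * (a + b * p) over the price interval."""
    low, high = bounds
    candidates = [low, high]
    if b < 0:  # concave case: the stationary point may be the maximizer
        interior = -a / (2.0 * b)
        if low <= interior <= high:
            candidates.append(interior)
    return max(candidates, key=lambda p: p * (a + b * p))


def thompson_pricing(true_theta, prior_mean, prior_cov, horizon,
                     noise_sd=1.0, bounds=(0.5, 5.0), seed=0):
    """Run one Thompson sampling pricing experiment; return posterior mean and prices."""
    rng = random.Random(seed)
    # Track the Gaussian posterior in precision form (Bayesian linear regression).
    precision = invert_2x2(prior_cov)
    shift = mat_vec(precision, prior_mean)
    prices = []
    for _ in range(horizon):
        cov = invert_2x2(precision)
        mean = mat_vec(cov, shift)
        a, b = sample_gaussian_2d(mean, cov, rng)   # posterior sample
        price = best_price(a, b, bounds)            # act greedily on the sample
        demand = true_theta[0] + true_theta[1] * price + rng.gauss(0.0, noise_sd)
        features = [1.0, price]
        for i in range(2):                          # conjugate posterior update
            shift[i] += features[i] * demand / noise_sd ** 2
            for j in range(2):
                precision[i][j] += features[i] * features[j] / noise_sd ** 2
        prices.append(price)
    return mat_vec(invert_2x2(precision), shift), prices


if __name__ == "__main__":
    # Hypothetical numbers: an estimated prior, widened by a factor of 2.
    cov = widen_prior([[1.0, 0.0], [0.0, 1.0]], 2.0)
    post_mean, prices = thompson_pricing([10.0, -2.0], [8.0, -1.5], cov, horizon=200)
    print("posterior mean:", post_mean, "last price:", prices[-1])
```

A larger `widening` value makes early posterior samples more dispersed, which trades some early revenue for robustness to an inaccurate prior estimate; how that factor should depend on the estimation error is specified in the paper, not here.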