AutoML Approach to Stock Keeping Units Segmentation

A typical retailer carries 10,000 stock-keeping units (SKUs). However, these numbers may exceed hundreds of millions for giants such as Walmart and Amazon. Besides the volume, SKU data can also be high-dimensional, which means that SKUs can be segmented on the basis of various attributes. Given the...

Bibliographic Details
Main Author:	Jackson, Ilya
Other Authors:	Massachusetts Institute of Technology. Center for Transportation & Logistics
Format:	Article
Published:	Multidisciplinary Digital Publishing Institute 2022
Online Access:	https://hdl.handle.net/1721.1/146617

collection	MIT
description	A typical retailer carries 10,000 stock-keeping units (SKUs). However, these numbers may exceed hundreds of millions for giants such as Walmart and Amazon. Besides the volume, SKU data can also be high-dimensional, which means that SKUs can be segmented on the basis of various attributes. Given the data volumes and the multitude of potentially important dimensions to consider, it becomes computationally impossible to individually manage each SKU. Even though the application of clustering for SKU segmentation is common, previous studies do not address the problem of parametrization and model finetuning, which may be extremely tedious and time-consuming in real-world applications. Our work closes the research gap by proposing a solution that leverages automated machine learning for the automated cluster analysis of SKUs. The proposed framework for automated SKU segmentation incorporates minibatch K-means clustering, principal component analysis, and grid search for parameter tuning. It operates on top of the Apache Parquet file format, an efficient, structured, compressed, column-oriented, and big-data-friendly format. The proposed solution was tested on the basis of a real-world dataset that contained data at the pallet level.
first_indexed	2024-09-23T16:59:42Z
id	mit-1721.1/146617
institution	Massachusetts Institute of Technology
last_indexed	2024-09-23T16:59:42Z
record_format	dspace
spelling	mit-1721.1/146617 2023-02-16T16:23:14Z AutoML Approach to Stock Keeping Units Segmentation Jackson, Ilya Massachusetts Institute of Technology. Center for Transportation & Logistics 2022-11-28T14:31:39Z 2022-11-28T14:31:39Z 2022-11-15 2022-11-24T14:43:06Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/146617 Journal of Theoretical and Applied Electronic Commerce Research 17 (4): 1512-1528 (2022) PUBLISHER_CC http://dx.doi.org/10.3390/jtaer17040076 Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/ application/pdf Multidisciplinary Digital Publishing Institute
