Model Compression and AutoML for Efficient Click-Through Rate Prediction

Bibliographic Details
Main Author: Gschwind, Katharina
Other Authors: Han, Song
Format: Thesis
Published: Massachusetts Institute of Technology, 2022
Online Access: https://hdl.handle.net/1721.1/139253
Description
Summary: Novel machine learning architectures can adeptly learn to predict user response for recommender systems. However, these architectures are often effective at the cost of large computational and memory requirements. This limits their ability to run on edge devices with limited hardware, such as smartphones, which are a popular platform for recommender systems. We address this issue in this thesis by studying how compressing recommender system models can significantly reduce computation cost and edge-device runtime while preserving prediction accuracy. Furthermore, we present a new compression-based AutoML method for feature set generation in architectures that incorporate explicit feature interactions. This serves as a tool for building efficient recommender system models and is applicable to many state-of-the-art model designs. Applying this AutoML method shows initial gains in model performance.
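
The summary's central claim, that compression can cut model computation while preserving accuracy, can be illustrated with one of the simplest compression primitives: magnitude pruning, which zeros out the smallest-magnitude weights of a layer. The sketch below is a generic illustration under that assumption, not the specific method developed in the thesis; the function name `magnitude_prune` and its parameters are hypothetical.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the `sparsity` fraction of weights with the smallest |value|.

    A sparse-stored pruned matrix needs less memory and fewer
    multiply-accumulates, which is the kind of saving that matters
    on edge devices such as smartphones.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Toy example: prune 80% of a dense layer's weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 64))
w_pruned = magnitude_prune(w, sparsity=0.8)
remaining = np.count_nonzero(w_pruned) / w.size
print(f"fraction of weights kept: {remaining:.2f}")
```

In practice, pruned recommender models are typically fine-tuned afterward so the remaining weights can recover any lost prediction accuracy.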