Learning Hierarchical Interactions at Scale: A Convex Optimization Approach

In many learning settings, it is beneficial toaugment the main features with pairwise in-teractions. Such interaction models can beoften enhanced by performing variable selec-tion under the so-calledstrong hierarchycon-straint: an interaction is non-zero only if itsassociated main features...

Full description

Bibliographic Details
Main Authors:	Hazimeh, Hussein, Mazumder, Rahul
Other Authors:	Sloan School of Management
Format:	Article
Language:	English
Published:	International Machine Learning Society 2021
Online Access:	https://hdl.handle.net/1721.1/130384

Description
Summary:	In many learning settings, it is beneficial toaugment the main features with pairwise in-teractions. Such interaction models can beoften enhanced by performing variable selec-tion under the so-calledstrong hierarchycon-straint: an interaction is non-zero only if itsassociated main features are non-zero. Ex-isting convex optimization-based algorithmsface difficulties in handling problems wherethe number of main featuresp∼103(withtotal number of features∼p2). In this pa-per, we study a convex relaxation which en-forces strong hierarchy and develop a highlyscalable algorithm based on proximal gradi-ent descent. We introduce novel screeningrules that allow for solving the complicatedproximal problem in parallel. In addition,we introduce a specialized active-set strategywith gradient screening for avoiding costlygradient computations. The framework can handle problems having dense design matri-ces, withp= 50,000 (∼109interactions)—instances that are much larger than state ofthe art. Experiments on real and syntheticdata suggest that our toolkithierScaleout-performs the state of the art in terms of pre-diction and variable selection and can achieveover a 4900x speed-up.

Learning Hierarchical Interactions at Scale: A Convex Optimization Approach

Similar Items