Maximal Correlation Feature Selection and Suppression With Applications

Bibliographic Details
Main Author: Lee, Joshua Ka-Wing
Other Authors: Wornell, Gregory W.
Format: Thesis
Published: Massachusetts Institute of Technology, 2022
Online Access: https://hdl.handle.net/1721.1/140035
Description
Summary: In standard supervised learning, we assume that we are trying to learn some target variable 𝑌 from some data 𝑋. However, many learning problems can be framed as supervised learning with an auxiliary objective, often associated with an auxiliary variable 𝐷 which defines this objective. Applying the principles of Hirschfeld-Gebelein-Rényi (HGR) maximal correlation analysis reveals new insights into how to formulate these learning problems with auxiliary objectives. We examine the use of HGR maximal correlation in feature selection for multi-source transfer learning in the few-shot setting. We then apply HGR to the problem of feature suppression by enforcing marginal and conditional independence criteria with respect to a sensitive attribute, and illustrate the effectiveness of our methods on problems of fairness, privacy, and transfer learning. Finally, we explore the use of HGR in extracting features for outlier detection.
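
For reference, the HGR maximal correlation invoked in the summary is the standard quantity (the record itself does not define it):

$$\rho(X;Y) = \sup_{f,\,g} \mathbb{E}[f(X)\,g(Y)] \quad \text{s.t.} \quad \mathbb{E}[f(X)] = \mathbb{E}[g(Y)] = 0,\ \ \mathbb{E}[f(X)^2] = \mathbb{E}[g(Y)^2] = 1.$$

For variables on finite alphabets, a classical result (Witsenhausen, 1975) identifies ρ(X; Y) with the second-largest singular value of the matrix B with entries B[x, y] = P(x, y) / √(P(x) P(y)). The sketch below computes ρ that way; it is a generic illustration of the quantity, not the method developed in the thesis, and the function name hgr_maximal_correlation is ours.

```python
import numpy as np

def hgr_maximal_correlation(joint):
    """Estimate the HGR maximal correlation of two discrete variables.

    For finite alphabets, the maximal correlation equals the second-largest
    singular value of the matrix B with entries
        B[x, y] = P(x, y) / sqrt(P(x) * P(y)).

    joint: 2-D array of joint probabilities P(X = x, Y = y) with
           strictly positive marginals.
    """
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1)  # marginal P(x)
    py = joint.sum(axis=0)  # marginal P(y)
    b = joint / np.sqrt(np.outer(px, py))
    svals = np.linalg.svd(b, compute_uv=False)  # sorted descending
    return svals[1]  # svals[0] == 1 corresponds to constant functions

# Example: a binary symmetric channel with uniform input and crossover
# probability p has maximal correlation |1 - 2p|.
p = 0.1
joint = np.array([[(1 - p) / 2, p / 2],
                  [p / 2, (1 - p) / 2]])
print(hgr_maximal_correlation(joint))  # ~0.8
```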