Summary: | When the test distribution differs from the training distribution, machine learning models can perform poorly, yet overestimate their own performance. In this work, we aim to better estimate a model's performance under distribution shift, without supervision. To do so, we use a set of domain-invariant predictors as a proxy for the unknown, true target labels; the error of this estimate is bounded by the target risk of the proxy model. We therefore study the generalization of domain-invariant representations and show that the complexity of the latent representation significantly influences the target risk. Empirically, our estimation approach can self-tune to find the optimal model complexity, and the resulting models both achieve good target generalization and accurately estimate the target error of other models. Applications of our results include model selection, deciding early stopping, error detection, and predicting the adaptability of a model between domains.
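To make the core idea concrete, the sketch below illustrates the proxy-based error estimate described in the summary: a model's target error is approximated by its disagreement with a domain-invariant proxy predictor on unlabeled target data, with the estimation gap bounded by the proxy's own target risk. This is a minimal sketch under assumed hard-label classification; the function names are hypothetical, and a single proxy stands in for the set of proxies the summary mentions. It is not the authors' implementation.

```python
import numpy as np

def proxy_target_error(model_preds: np.ndarray, proxy_preds: np.ndarray) -> float:
    """Estimate a model's target error without target labels.

    The estimate is the disagreement rate between the candidate model and a
    domain-invariant proxy predictor on unlabeled target inputs. Its gap from
    the true target error is bounded by the proxy's own (unknown) target risk,
    which motivates studying how well domain-invariant representations
    generalize.
    """
    return float(np.mean(model_preds != proxy_preds))

# Hypothetical usage: hard-label predictions on 1,000 unlabeled target points.
rng = np.random.default_rng(0)
model_preds = rng.integers(0, 10, size=1000)  # candidate model's predictions
proxy_preds = rng.integers(0, 10, size=1000)  # domain-invariant proxy's predictions
print(f"Estimated target error: {proxy_target_error(model_preds, proxy_preds):.3f}")
```

In practice, the quality of this estimate hinges on the proxy's target risk, which is why controlling the complexity of the domain-invariant representation matters for the approach.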