Learning efficiently with approximate inference via dual losses

Many structured prediction tasks involve complex models where inference is computationally intractable, but where it can be well approximated using a linear programming relaxation. Previous approaches for learning for structured prediction (e.g., cutting-plane, subgradient methods, perceptron)...

Full description

Many structured prediction tasks involve complex models where inference is computationally intractable, but where it can be well approximated using a linear programming relaxation. Previous approaches for learning for structured prediction (e.g., cutting-plane, subgradient methods, perceptron) repeatedly make predictions for some of the data points. These approaches are computationally demanding because each prediction involves solving a linear program to optimality. We present a scalable algorithm for learning for structured prediction. The main idea is to instead solve the dual of the structured prediction loss. We formulate the learning task as a convex minimization over both the weights and the dual variables corresponding to each data point. As a result, we can begin to optimize the weights even before completely solving any of the individual prediction problems. We show how the dual variables can be efficiently optimized using coordinate descent. Our algorithm is competitive with state-of-the-art methods such as stochastic subgradient and cutting-plane.
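
To make the idea in the abstract concrete, below is a minimal sketch, not the authors' implementation, of a dual-loss training loop for a toy model: two nodes with K states, unary potentials linear in the weights, one pairwise potential table, and Hamming loss. For each example, the loss-augmented inference LP is replaced by its standard pairwise-relaxation dual bound, which holds for any setting of the dual variables; the loop therefore interleaves a couple of exact coordinate-descent updates on the duals with a subgradient step on the weights, rather than solving inference to optimality before every weight update. All names (w_u, w_p, delta, f), the model structure, and the synthetic data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

K, D = 2, 3            # states per node, unary feature dimension
lam, eta = 0.1, 0.05   # L2 regularization, weight step size

# One synthetic training example: two nodes with features and a gold labeling.
f = rng.normal(size=(2, D))    # f[i] = unary features of node i (illustrative)
y_star = np.array([0, 1])      # gold assignment

w_u = np.zeros((K, D))         # unary weights: score of state s at node i is w_u[s] . f[i]
w_p = np.zeros((K, K))         # pairwise weights: one score per state pair

def potentials(w_u, w_p):
    """Loss-augmented potentials: unary scores plus Hamming loss, and pairwise scores."""
    theta_u = f @ w_u.T                                              # theta_u[i, s] = w_u[s] . f[i]
    theta_u = theta_u + (np.arange(K)[None, :] != y_star[:, None])   # + 1[s != y*_i]
    return theta_u, w_p.copy()

def dual_bound(theta_u, theta_p, delta):
    """Pairwise LP-relaxation dual: upper-bounds the loss-augmented MAP score for any delta."""
    node = sum((theta_u[i] + delta[i]).max() for i in range(2))
    edge = (theta_p - delta[0][:, None] - delta[1][None, :]).max()
    return node + edge

def dual_coord_step(theta_u, theta_p, delta):
    """One pass of exact block coordinate descent on the dual variables."""
    b0 = (theta_p - delta[1][None, :]).max(axis=1)   # best edge completion for each state of node 0
    delta[0] = 0.5 * (b0 - theta_u[0])
    b1 = (theta_p - delta[0][:, None]).max(axis=0)   # best edge completion for each state of node 1
    delta[1] = 0.5 * (b1 - theta_u[1])
    return delta

delta = [np.zeros(K), np.zeros(K)]   # dual variables, kept warm across iterations

for it in range(200):
    theta_u, theta_p = potentials(w_u, w_p)

    # A few cheap dual updates instead of solving the inference LP to optimality.
    for _ in range(2):
        delta = dual_coord_step(theta_u, theta_p, delta)

    # Subgradient of the dual-bounded hinge loss w.r.t. the weights: the maximizers
    # of each term in the bound play the role of the current loss-augmented prediction.
    s_node = [int(np.argmax(theta_u[i] + delta[i])) for i in range(2)]
    edge = theta_p - delta[0][:, None] - delta[1][None, :]
    s_e, t_e = np.unravel_index(np.argmax(edge), edge.shape)

    g_u, g_p = lam * w_u, lam * w_p
    for i in range(2):
        g_u[s_node[i]] += f[i]        # from the node terms of the bound
        g_u[y_star[i]] -= f[i]        # from -w . phi(x, y*)
    g_p[s_e, t_e] += 1.0
    g_p[y_star[0], y_star[1]] -= 1.0

    w_u -= eta * g_u
    w_p -= eta * g_p

theta_u, theta_p = potentials(w_u, w_p)
print("dual bound on the loss-augmented score:", dual_bound(theta_u, theta_p, delta))
print("current node maximizers (gold is %s):" % y_star,
      [int(np.argmax(theta_u[i] + delta[i])) for i in range(2)])
```

The property the sketch relies on is the one the abstract states: the dual bound is valid for every setting of the dual variables, so even partially optimized duals yield a convex upper bound on the hinge loss and hence a usable subgradient for the weights, which is why weight updates can begin before any individual prediction problem is solved to optimality.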

Bibliographic Details
Main Authors: Meshi, Ofer; Sontag, David Alexander; Jaakkola, Tommi S.; Globerson, Amir
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article (Conference Paper)
Language: en_US
Published: International Machine Learning Society, 2010
Citation: Meshi, Ofer et al. "Learning Efficiently with Approximate Inference via Dual Losses." Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel, 2010.
ISBN: 9781605589077; 1605589071
Online Access: http://hdl.handle.net/1721.1/62851
Publisher Version: http://www.icml2010.org/papers/587.pdf
ORCID: https://orcid.org/0000-0002-2199-0379
Terms of Use: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.