Integer optimization in data mining
Thesis (Ph. D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2003.
Main Author: | Shioda, Romy, 1977- |
---|---|
Other Authors: | Dimitris Bertsimas. |
Format: | Thesis |
Language: | eng |
Published: | Massachusetts Institute of Technology, 2005 |
Subjects: | Operations Research Center. |
Online Access: | http://hdl.handle.net/1721.1/17579 |
_version_ | 1826207566903902208 |
---|---|
author | Shioda, Romy, 1977- |
author2 | Dimitris Bertsimas. |
author_facet | Dimitris Bertsimas. Shioda, Romy, 1977- |
author_sort | Shioda, Romy, 1977- |
collection | MIT |
description | Thesis (Ph. D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2003. |
first_indexed | 2024-09-23T13:51:34Z |
format | Thesis |
id | mit-1721.1/17579 |
institution | Massachusetts Institute of Technology |
language | eng |
last_indexed | 2024-09-23T13:51:34Z |
publishDate | 2005 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/17579 2019-04-11T09:23:32Z Integer optimization in data mining Data mining via integer optimization Shioda, Romy, 1977- Dimitris Bertsimas. Massachusetts Institute of Technology. Operations Research Center. Massachusetts Institute of Technology. Operations Research Center. Operations Research Center. Thesis (Ph. D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2003. Includes bibliographical references (p. 103-107). While continuous optimization methods have been widely used in statistics and data mining over the last thirty years, integer optimization has had very limited impact in statistical computation. Thus, our objective is to develop a methodology utilizing state-of-the-art integer optimization methods to exploit the discrete character of data mining problems. The thesis consists of two parts. The first part illustrates a mixed-integer optimization method for classification and regression that we call Classification and Regression via Integer Optimization (CRIO). CRIO separates data points into different polyhedral regions. In classification, each region is assigned a class, while in regression, each region has its own distinct regression coefficients. Computational experimentation with real data sets shows that CRIO is comparable to and often outperforms the current leading methods in classification and regression. The second part describes our cardinality-constrained quadratic mixed-integer optimization algorithm, used to solve subset selection in regression and portfolio selection in asset allocation. We take advantage of the special structures of these problems by implementing a combination of implicit branch-and-bound, Lemke's pivoting method, variable deletion and problem reformulation. Testing against popular heuristic methods and CPLEX 8.0's quadratic mixed-integer solver, we see that our tailored approach to these quadratic variable selection problems has significant advantages over simple heuristics and generalized solvers. by Romy Shioda. Ph.D. 2005-06-02T16:15:35Z 2005-06-02T16:15:35Z 2003 2003 Thesis http://hdl.handle.net/1721.1/17579 53010913 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 107 p. 3637458 bytes 3637264 bytes application/pdf application/pdf application/pdf Massachusetts Institute of Technology |
spellingShingle | Operations Research Center. Shioda, Romy, 1977- Integer optimization in data mining |
title | Integer optimization in data mining |
title_full | Integer optimization in data mining |
title_fullStr | Integer optimization in data mining |
title_full_unstemmed | Integer optimization in data mining |
title_short | Integer optimization in data mining |
title_sort | integer optimization in data mining |
topic | Operations Research Center. |
url | http://hdl.handle.net/1721.1/17579 |
work_keys_str_mv | AT shiodaromy1977 integeroptimizationindatamining AT shiodaromy1977 dataminingviaintegeroptimization |