Investigation of machine learning tools for document clustering and classification
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000.
Main Author: | Borodavkina, Lyudmila, 1977- |
---|---|
Other Authors: | David R. Karger. |
Format: | Thesis |
Language: | eng |
Published: | Massachusetts Institute of Technology 2005 |
Subjects: | Electrical Engineering and Computer Science. |
Online Access: | http://hdl.handle.net/1721.1/8932 |
_version_ | 1826207840105136128 |
---|---|
author | Borodavkina, Lyudmila, 1977- |
author2 | David R. Karger. |
author_facet | David R. Karger. Borodavkina, Lyudmila, 1977- |
author_sort | Borodavkina, Lyudmila, 1977- |
collection | MIT |
description | Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000. |
first_indexed | 2024-09-23T13:55:46Z |
format | Thesis |
id | mit-1721.1/8932 |
institution | Massachusetts Institute of Technology |
language | eng |
last_indexed | 2024-09-23T13:55:46Z |
publishDate | 2005 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/8932 2019-04-10T23:48:06Z Investigation of machine learning tools for document clustering and classification Application of machine learning algorithms for document clustering and classification Borodavkina, Lyudmila, 1977- David R. Karger. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000. Includes bibliographical references (leaves 57-59). Data clustering is a problem of discovering the underlying data structure without any prior information about the data. The focus of this thesis is to evaluate a few of the modern clustering algorithms in order to determine their performance in adverse conditions. Synthetic Data Generation software is presented as a useful tool both for generating test data and for investigating results of the data clustering. Several theoretical models and their behavior are discussed, and, as the result of analysis of a large number of quantitative tests, we come up with a set of heuristics that describe the quality of clustering output in different adverse conditions. by Lyudmila Borodavkina. M.Eng. 2005-08-23T16:27:55Z 2005-08-23T16:27:55Z 2000 2000 Thesis http://hdl.handle.net/1721.1/8932 48981692 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 59 leaves 5253610 bytes 5253368 bytes application/pdf application/pdf application/pdf Massachusetts Institute of Technology |
spellingShingle | Electrical Engineering and Computer Science. Borodavkina, Lyudmila, 1977- Investigation of machine learning tools for document clustering and classification |
title | Investigation of machine learning tools for document clustering and classification |
title_full | Investigation of machine learning tools for document clustering and classification |
title_fullStr | Investigation of machine learning tools for document clustering and classification |
title_full_unstemmed | Investigation of machine learning tools for document clustering and classification |
title_short | Investigation of machine learning tools for document clustering and classification |
title_sort | investigation of machine learning tools for document clustering and classification |
topic | Electrical Engineering and Computer Science. |
url | http://hdl.handle.net/1721.1/8932 |
work_keys_str_mv | AT borodavkinalyudmila1977 investigationofmachinelearningtoolsfordocumentclusteringandclassification AT borodavkinalyudmila1977 applicationofmachinelearningalgorithmsfordocumentclusteringandclassification |