Investigation of machine learning tools for document clustering and classification

Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000.

Bibliographic Details
Main Author: Borodavkina, Lyudmila, 1977-
Other Authors: David R. Karger.
Format: Thesis
Language: English
Published: Massachusetts Institute of Technology 2005
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/8932
Record ID: mit-1721.1/8932 (2019-04-10T23:48:06Z)
Alternate Title: Application of machine learning algorithms for document clustering and classification
Statement of Responsibility: by Lyudmila Borodavkina.
Degree: M.Eng.
Department: Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Notes: Includes bibliographical references (leaves 57-59).
Abstract: Data clustering is a problem of discovering the underlying data structure without any prior information about the data. The focus of this thesis is to evaluate a few of the modern clustering algorithms in order to determine their performance in adverse conditions. Synthetic Data Generation software is presented as a useful tool both for generating test data and for investigating results of the data clustering. Several theoretical models and their behavior are discussed, and, as the result of analysis of a large number of quantitative tests, we come up with a set of heuristics that describe the quality of clustering output in different adverse conditions.
Date Issued: 2000
Date Available: 2005-08-23T16:27:55Z
Other Identifier: 48981692
Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
Extent: 59 leaves
File Size: 5253610 bytes; 5253368 bytes
File Format: application/pdf