A Simple Scheme for Book Classification Using Wikipedia

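Because the rate at which documents are being generated outstrips librarians’ ability to catalog them, an accurate, automated scheme of subject classification is desirable. However, simplistic word-counting schemes miss many important concepts; librarians must enrich algorithms with background knowledge to escape basic problems such as polysemy and synonymy. I have developed a script that uses Wikipedia as context for analyzing the subjects of nonfiction books. Though a simple method built quickly from freely available parts, it is partially successful, suggesting the promise of such an approach for future research.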

Bibliographic Details
Main Author: Andromeda Yelton
Format: Article
Language: English
Published: American Library Association 2011-03-01
Series: Information Technology and Libraries
Online Access: https://ejournals.bc.edu/ojs/index.php/ital/article/view/3040
Collection: DOAJ (Directory of Open Access Journals)
ISSN: 0730-9295, 2163-5226
DOI: 10.6017/ital.v30i1.3040
Record ID: doaj.art-a74a1fa43ce447cfbec680691458f117