Application of TF-IDF factor in the semantic analysis of a documentary collection


Bibliographic Details
Main Authors: Andrés Vuotto, Celeste Bogetti, Gladys Fernández
Format: Article
Language: Spanish
Published: University Library System, University of Pittsburgh 2015-11-01
Series: Biblios
Online Access: http://biblios.pitt.edu/ojs/index.php/biblios/article/view/227
Collection: DOAJ
Description: Objective. This paper describes the application of a tool for the semantic analysis of a document collection based on term frequency–inverse document frequency (TF-IDF). Methodology. A system based on PHP and a MySQL database was developed for managing a thesaurus, calculating TF-IDF (as an indicator of semantic weight), and building a relevance tree (made up of the concepts most relevant to the topic analyzed). The tool was tested on the semantic analysis of a document collection in Psychology. Results. The system was able to identify the level of presence of the topic "professional ethics" in the document collection of a Psychology program. Conclusions. The experience described confirms the viability of the tool for the semantic analysis of a document collection. It underlines the relevance and capacity of information professionals to develop this kind of tool for processing information. The authors suggest a specific technical approach to the use of scripts and information flows.
ISSN: 1562-4730
Author Affiliation: Universidad Nacional de Mar del Plata - MDP (all three authors)
DOI: 10.5195/biblios.2015.227
Subjects: Semantic analysis; TF-IDF; Information retrieval; Data mining; Information extraction in databases
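
The abstract names TF-IDF as the indicator of semantic weight but this record does not give the formula the authors used. The following is a minimal sketch of one common variant (term frequency normalized by document length, multiplied by the natural log of N over document frequency). It is written in Python for illustration rather than in the PHP and MySQL stack the paper describes, and the function name `tf_idf`, the whitespace tokenization, and the toy documents are all assumptions, not material from the paper.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    This is not necessarily the formula used in the paper.
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy collection (not data from the paper).
docs = [
    "professional ethics in clinical practice",
    "ethics and research methods",
    "research methods in statistics",
]
weights = tf_idf(docs)
```

Under this variant, a term that appears in a single document ("professional") receives a higher weight than one shared across documents ("ethics"), and a term present in every document receives a weight of zero. A thesaurus-driven system like the one described would presumably score controlled-vocabulary concepts rather than raw tokens, which this sketch does not attempt.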