Application of TF-IDF factor in the semantic analysis of a documentary collection
<strong>Objective</strong>. This paper describes the application of a tool for the semantic analysis of a document collection based on term frequency–inverse document frequency (TF-IDF). <strong>Methodology</strong>. A system based on PHP and MySQL database for t...
Main Authors: | Andrés Vuotto, Celeste Bogetti, Gladys Fernández |
---|---|
Format: | Article |
Language: | Spanish |
Published: | University Library System, University of Pittsburgh, 2015-11-01 |
Series: | Biblios |
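Subjects: | Semantic analysis; TF-IDF; Information retrieval; Data mining; Information extraction from databases |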
Online Access: | http://biblios.pitt.edu/ojs/index.php/biblios/article/view/227 |
_version_ | 1818526896611131392 |
---|---|
author | Andrés Vuotto, Celeste Bogetti, Gladys Fernández |
author_facet | Andrés Vuotto, Celeste Bogetti, Gladys Fernández |
author_sort | Andrés Vuotto |
collection | DOAJ |
description | <strong>Objective</strong>. This paper describes the application of a tool for the semantic analysis of a document collection based on term frequency–inverse document frequency (TF-IDF). <strong>Methodology</strong>. A system based on PHP and a MySQL database was developed to manage a thesaurus, calculate TF-IDF (as an indicator of semantic weight), and build a relevance tree (consisting of the concepts most relevant to the topic analyzed). The tool was tested on the semantic analysis of a document collection in Psychology. <strong>Results</strong>. The system was able to identify the level of presence of the thematic track "professional ethics" in a collection of documents from a Psychology program. <strong>Conclusions</strong>. The experience described confirms the viability of the tool for the semantic analysis of a document collection. It underlines the relevance and capacity of information professionals to develop this kind of tool for processing information. The authors suggest a specific technical approach for the use of scripts and information flows. |
first_indexed | 2024-12-11T06:29:10Z |
format | Article |
id | doaj.art-d1069febd04641b78766ca7c11ade8e7 |
institution | Directory of Open Access Journals |
issn | 1562-4730 |
language | Spanish |
last_indexed | 2024-12-11T06:29:10Z |
publishDate | 2015-11-01 |
publisher | University Library System, University of Pittsburgh |
record_format | Article |
series | Biblios |
spelling | doaj.art-d1069febd04641b78766ca7c11ade8e7; 2022-12-22T01:17:35Z; spa; University Library System, University of Pittsburgh; Biblios; 1562-4730; 2015-11-01; 06011310.5195/biblios.2015.227148; Application of TF-IDF factor in the semantic analysis of a documentary collection; Andrés Vuotto, Celeste Bogetti, Gladys Fernández (Universidad Nacional de Mar del Plata - MDP); http://biblios.pitt.edu/ojs/index.php/biblios/article/view/227; Semantic analysis; TF-IDF; Information retrieval; Data mining; Information extraction from databases |
spellingShingle | Andrés Vuotto; Celeste Bogetti; Gladys Fernández; Application of TF-IDF factor in the semantic analysis of a documentary collection; Biblios; Semantic analysis; TF-IDF; Information retrieval; Data mining; Information extraction from databases |
title | Application of TF-IDF factor in the semantic analysis of a documentary collection |
title_sort | application of tf idf factor in the semantic analysis of a documentary collection |
topic | Semantic analysis; TF-IDF; Information retrieval; Data mining; Information extraction from databases |
url | http://biblios.pitt.edu/ojs/index.php/biblios/article/view/227 |
work_keys_str_mv | AT andresvuotto applicationoftfidffactorinthesemanticanalysisofadocumentarycollection AT celestebogetti applicationoftfidffactorinthesemanticanalysisofadocumentarycollection AT gladysfernandez applicationoftfidffactorinthesemanticanalysisofadocumentarycollection |
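
The abstract names TF-IDF as the indicator of semantic weight, but this record gives neither the exact formula nor the PHP/MySQL implementation. The following is only a minimal sketch in Python, assuming the common definition tf(t, d) × log(N / df(t)), single-word thesaurus terms, and whitespace tokenization. The function name `tf_idf` and the sample texts are hypothetical and do not come from the article.

```python
import math
from collections import Counter


def tf_idf(documents, terms):
    """Return {term: [weight per document]} using tf * log(N / df).

    Assumptions (not taken from the article): tf is the term count divided
    by the document length, idf is the natural log of N / df, and each
    thesaurus term is a single lowercase token.
    """
    n_docs = len(documents)
    tokenized = [doc.lower().split() for doc in documents]
    counts = [Counter(tokens) for tokens in tokenized]

    weights = {}
    for term in terms:
        # Document frequency: number of documents containing the term.
        df = sum(1 for c in counts if c[term] > 0)
        # A term that appears nowhere gets weight 0 instead of dividing by 0.
        idf = math.log(n_docs / df) if df else 0.0
        weights[term] = [
            (c[term] / len(tokens) if tokens else 0.0) * idf
            for c, tokens in zip(counts, tokenized)
        ]
    return weights


# Hypothetical three-document collection and two thesaurus terms.
docs = ["ethics ethics therapy", "therapy memory", "memory memory"]
result = tf_idf(docs, ["ethics", "therapy"])
for term, per_doc in result.items():
    print(term, [round(w, 3) for w in per_doc])
```

Under these assumptions, a term concentrated in few documents (here "ethics") receives a higher weight in those documents than a term spread across the collection. Summing or ranking such weights per thesaurus concept is one plausible way to select the concepts for a relevance tree, though the record does not say how the authors did this.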