Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents

This paper analyses the capabilities of different techniques to build a semantic representation of educational digital resources. Educational digital resources are modeled using the Learning Object Metadata (LOM) standard, and these semantic representations can be obtained from different LOM fields,...

Bibliographic Details
Main Authors: Jose Aguilar, Camilo Salazar, Henry Velasco, Julian Monsalve-Pulido, Edwin Montoya
Format: Article
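Language: English
Published: MDPI AG 2020-04-01
Series: Computation
Subjects: feature extraction; content analysis; educational contents; semantic representation; information retrieval; recommendation system
Online Access: https://www.mdpi.com/2079-3197/8/2/30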
author Jose Aguilar
Camilo Salazar
Henry Velasco
Julian Monsalve-Pulido
Edwin Montoya
collection DOAJ
description This paper analyses the capabilities of different techniques for building a semantic representation of educational digital resources. Educational digital resources are modeled using the Learning Object Metadata (LOM) standard, and these semantic representations can be obtained from different LOM fields, such as the title and description, among others, in order to extract the features/characteristics of the digital resources. The feature extraction methods used in this paper are Best Matching 25 (BM25), Latent Semantic Analysis (LSA), Doc2Vec, and Latent Dirichlet Allocation (LDA). The features/descriptors they generate are tested on three types of educational digital resources (scientific publications, learning objects, patents), on a paraphrase corpus, and in two use cases: an information retrieval context and an educational recommendation system. For this analysis, unsupervised metrics are used to determine the quality of the features proposed by each method; these metrics are two similarity functions and the entropy. In addition, the paper presents tests of the techniques for the classification of paraphrases. The experiments show that the performance of the feature extraction methods varies widely with the type of content and the metric; in some cases one method outperforms the others, and in other cases the reverse holds.
first_indexed 2024-03-10T20:27:32Z
format Article
id doaj.art-dcc7199d9ab2475db7b4716925498ef8
institution Directory of Open Access Journals
issn 2079-3197
language English
last_indexed 2024-03-10T20:27:32Z
publishDate 2020-04-01
publisher MDPI AG
record_format Article
series Computation
spelling doaj.art-dcc7199d9ab2475db7b4716925498ef8
2023-11-19T21:41:11Z
eng
MDPI AG
Computation
2079-3197
2020-04-01
Vol. 8, No. 2, Article 30
10.3390/computation8020030
Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents
Jose Aguilar (Escuela de Sistemas, Facultad de Ingeniería, Universidad de los Andes, Mérida 5101, Venezuela)
Camilo Salazar (GIDITIC, Universidad EAFIT, Carrera 49 No. 7 Sur 50, Medellin 050001, Colombia)
Henry Velasco (LANTIA SAS, Medellin 050001, Colombia)
Julian Monsalve-Pulido (GIDITIC, Universidad EAFIT, Carrera 49 No. 7 Sur 50, Medellin 050001, Colombia)
Edwin Montoya (GIDITIC, Universidad EAFIT, Carrera 49 No. 7 Sur 50, Medellin 050001, Colombia)
https://www.mdpi.com/2079-3197/8/2/30
feature extraction; content analysis; educational contents; semantic representation; information retrieval; recommendation system
title Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents
topic feature extraction
content analysis
educational contents
semantic representation
information retrieval
recommendation system
url https://www.mdpi.com/2079-3197/8/2/30