Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents

This paper analyses the capabilities of different techniques to build a semantic representation of educational digital resources. Educational digital resources are modeled using the Learning Object Metadata (LOM) standard, and these semantic representations can be obtained from different LOM fields,...
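As an illustration of what a feature representation of this kind can look like, the sketch below builds BM25-weighted vectors (BM25 is one of the methods the full description of this record names) for a toy corpus and compares them with cosine similarity and Shannon entropy. It is a minimal sketch under stated assumptions, not the authors' implementation: the record does not say which BM25 variant, parameters, or similarity functions the paper uses, so the non-negative IDF form, `k1=1.5`, `b=0.75`, whitespace tokenization, cosine similarity, and the example texts are all assumptions made here for illustration.

```python
import math
from collections import Counter


def bm25_vectors(docs, k1=1.5, b=0.75):
    """Return (vocabulary, one BM25-weighted vector per document).

    k1 and b are common default values, assumed here for illustration.
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    vocab = sorted({w for t in tokenized for w in t})
    # Document frequency: number of documents that contain each term.
    df = Counter(w for t in tokenized for w in set(t))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Length normalization term shared by all words of this document.
        norm = k1 * (1 - b + b * len(tokens) / avg_len)
        vec = []
        for w in vocab:
            # Non-negative IDF variant (adds 1 inside the logarithm).
            idf = math.log(1 + (n_docs - df[w] + 0.5) / (df[w] + 0.5))
            vec.append(idf * tf[w] * (k1 + 1) / (tf[w] + norm))
        vectors.append(vec)
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def entropy(vec):
    """Shannon entropy (bits) of a non-negative vector normalized to sum to 1."""
    total = sum(vec)
    if total == 0:
        return 0.0
    return -sum((x / total) * math.log2(x / total) for x in vec if x > 0)


# Hypothetical resource titles, invented for this example.
docs = [
    "learning objects for linear algebra",
    "linear algebra learning resources",
    "patent for a solar panel",
]
vocab, vectors = bm25_vectors(docs)
print(cosine(vectors[0], vectors[1]))  # two linear algebra resources
print(cosine(vectors[0], vectors[2]))  # learning object vs. patent
print(entropy(vectors[0]))
```

In this toy corpus the two linear algebra titles share three terms, so their similarity is higher than that between the first title and the patent, which share only "for".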


Bibliographic Details
Main Authors:	Jose Aguilar, Camilo Salazar, Henry Velasco, Julian Monsalve-Pulido, Edwin Montoya
Format:	Article
Language:	English
Published:	MDPI AG 2020-04-01
Series:	Computation
Subjects:	feature extraction; content analysis; educational contents; semantic representation; information retrieval; recommendation system
Online Access:	https://www.mdpi.com/2079-3197/8/2/30

author	Jose Aguilar Camilo Salazar Henry Velasco Julian Monsalve-Pulido Edwin Montoya
collection	DOAJ
description	This paper analyses the capabilities of different techniques to build a semantic representation of educational digital resources. Educational digital resources are modeled using the Learning Object Metadata (LOM) standard, and these semantic representations can be obtained from different LOM fields, such as the title and description, in order to extract the features of the digital resources. The feature extraction methods used in this paper are Best Matching 25 (BM25), Latent Semantic Analysis (LSA), Doc2Vec, and Latent Dirichlet Allocation (LDA). The features generated by these methods are tested on three types of educational digital resources (scientific publications, learning objects, and patents), on a paraphrase corpus, and in two use cases: an information retrieval context and an educational recommendation system. The analysis uses unsupervised metrics, namely two similarity functions and entropy, to determine the quality of the features produced by each method. In addition, the paper presents tests of the techniques for the classification of paraphrases. The experiments show that the performance of the feature extraction methods varies considerably with the type of content and the metric: in some cases one method outperforms the others, and in other cases the reverse holds.
first_indexed	2024-03-10T20:27:32Z
format	Article
id	doaj.art-dcc7199d9ab2475db7b4716925498ef8
institution	Directory Open Access Journal
issn	2079-3197
language	English
last_indexed	2024-03-10T20:27:32Z
publishDate	2020-04-01
publisher	MDPI AG
record_format	Article
series	Computation
spelling	doaj.art-dcc7199d9ab2475db7b4716925498ef8; 2023-11-19T21:41:11Z; eng; MDPI AG; Computation; 2079-3197; 2020-04-01; volume 8, issue 2, article 30; 10.3390/computation8020030; Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents; Jose Aguilar (Escuela de Sistemas, Facultad de Ingeniería, Universidad de los Andes, Mérida 5101, Venezuela); Camilo Salazar (GIDITIC, Universidad EAFIT, Carrera 49 No. 7 Sur 50, Medellin 050001, Colombia); Henry Velasco (LANTIA SAS, Medellin 050001, Colombia); Julian Monsalve-Pulido (GIDITIC, Universidad EAFIT, Carrera 49 No. 7 Sur 50, Medellin 050001, Colombia); Edwin Montoya (GIDITIC, Universidad EAFIT, Carrera 49 No. 7 Sur 50, Medellin 050001, Colombia); https://www.mdpi.com/2079-3197/8/2/30; feature extraction; content analysis; educational contents; semantic representation; information retrieval; recommendation system
title	Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents
topic	feature extraction; content analysis; educational contents; semantic representation; information retrieval; recommendation system
url	https://www.mdpi.com/2079-3197/8/2/30


Similar Items