Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents
This paper analyses the capabilities of different techniques to build a semantic representation of educational digital resources. Educational digital resources are modeled using the Learning Object Metadata (LOM) standard, and these semantic representations can be obtained from different LOM fields,...
Main Authors: | Jose Aguilar, Camilo Salazar, Henry Velasco, Julian Monsalve-Pulido, Edwin Montoya |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2020-04-01 |
Series: | Computation |
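Subjects: | feature extraction; content analysis; educational contents; semantic representation; information retrieval; recommendation system |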
Online Access: | https://www.mdpi.com/2079-3197/8/2/30 |
_version_ | 1797570642596331520 |
---|---|
author | Jose Aguilar, Camilo Salazar, Henry Velasco, Julian Monsalve-Pulido, Edwin Montoya |
author_facet | Jose Aguilar, Camilo Salazar, Henry Velasco, Julian Monsalve-Pulido, Edwin Montoya |
author_sort | Jose Aguilar |
collection | DOAJ |
description | This paper analyses the capabilities of different techniques to build a semantic representation of educational digital resources. Educational digital resources are modeled using the Learning Object Metadata (LOM) standard, and these semantic representations can be obtained from different LOM fields, such as the title and description, in order to extract the features/characteristics of the digital resources. The feature extraction methods used in this paper are Best Matching 25 (BM25), Latent Semantic Analysis (LSA), Doc2Vec, and Latent Dirichlet Allocation (LDA). The features/descriptors they generate are tested on three types of educational digital resources (scientific publications, learning objects, patents), a paraphrase corpus, and two use cases: an information retrieval context and an educational recommendation system. Unsupervised metrics, namely two similarity functions and the entropy, are used to determine the quality of the features produced by each method. In addition, the paper presents tests of the techniques for the classification of paraphrases. The experiments show that the performance of the feature extraction methods varies considerably with the type of content and the metric: each method outperforms the others in some cases and is outperformed in others. |
first_indexed | 2024-03-10T20:27:32Z |
format | Article |
id | doaj.art-dcc7199d9ab2475db7b4716925498ef8 |
institution | Directory Open Access Journal |
issn | 2079-3197 |
language | English |
last_indexed | 2024-03-10T20:27:32Z |
publishDate | 2020-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Computation |
spelling | doaj.art-dcc7199d9ab2475db7b4716925498ef8; 2023-11-19T21:41:11Z; eng; MDPI AG; Computation; 2079-3197; 2020-04-01; volume 8, issue 2, article 30; 10.3390/computation8020030; Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents; Jose Aguilar (0), Camilo Salazar (1), Henry Velasco (2), Julian Monsalve-Pulido (3), Edwin Montoya (4); (0) Escuela de Sistemas, Facultad de Ingeniería, Universidad de los Andes, Mérida 5101, Venezuela; (1) GIDITIC, Universidad EAFIT, Carrera 49 No. 7 Sur 50, Medellin 050001, Colombia; (2) LANTIA SAS, Medellin 050001, Colombia; (3) GIDITIC, Universidad EAFIT, Carrera 49 No. 7 Sur 50, Medellin 050001, Colombia; (4) GIDITIC, Universidad EAFIT, Carrera 49 No. 7 Sur 50, Medellin 050001, Colombia; https://www.mdpi.com/2079-3197/8/2/30; feature extraction; content analysis; educational contents; semantic representation; information retrieval; recommendation system |
title | Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents |
title_sort | comparison and evaluation of different methods for the feature extraction from educational contents |
topic | feature extraction; content analysis; educational contents; semantic representation; information retrieval; recommendation system |
url | https://www.mdpi.com/2079-3197/8/2/30 |