Learning Distributed Representations and Deep Embedded Clustering of Texts
Instructors face significant time and effort constraints when grading students’ assessments on a large scale. Clustering similar assessments is a unique and effective technique that has the potential to significantly reduce the workload of instructors in online and large-scale learning environments....
Main Authors: | Shuang Wang, Amin Beheshti, Yufei Wang, Jianchao Lu, Quan Z. Sheng, Stephen Elbourn, Hamid Alinejad-Rokny |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-03-01 |
Series: | Algorithms |
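Subjects: | distributed representation; deep clustering; data augmentation; contrastive learning; artificial intelligence |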
Online Access: | https://www.mdpi.com/1999-4893/16/3/158 |
_version_ | 1797613926837387264 |
---|---|
author | Shuang Wang; Amin Beheshti; Yufei Wang; Jianchao Lu; Quan Z. Sheng; Stephen Elbourn; Hamid Alinejad-Rokny |
author_sort | Shuang Wang |
collection | DOAJ |
description | Instructors face significant time and effort constraints when grading students’ assessments on a large scale. Clustering similar assessments is a unique and effective technique that has the potential to significantly reduce the workload of instructors in online and large-scale learning environments. By grouping together similar assessments, marking one assessment in a cluster can be scaled to other similar assessments, allowing for a more efficient and streamlined grading process. To address this issue, this paper focuses on text assessments and proposes a method for reducing the workload of instructors by clustering similar assessments. The proposed method involves the use of distributed representation to transform texts into vectors, and contrastive learning to improve the representation that distinguishes the differences among similar texts. The paper presents a general framework for clustering similar texts that includes label representation, K-means, and self-organization map algorithms, with the objective of improving clustering performance using Accuracy (ACC) and Normalized Mutual Information (NMI) metrics. The proposed framework is evaluated experimentally using two real datasets. The results show that self-organization maps and K-means algorithms with Pre-trained language models outperform label representation algorithms for different datasets. |
first_indexed | 2024-03-11T07:02:34Z |
format | Article |
id | doaj.art-6f3cba7e1fa2435bb281dc1d67166cb5 |
institution | Directory of Open Access Journals |
issn | 1999-4893 |
language | English |
last_indexed | 2024-03-11T07:02:34Z |
publishDate | 2023-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Algorithms |
doi | 10.3390/a16030158 |
volume_issue | Algorithms, vol. 16, no. 3, article 158 |
affiliation | School of Computing, Macquarie University, Sydney, NSW 2109, Australia (all seven authors) |
title | Learning Distributed Representations and Deep Embedded Clustering of Texts |
topic | distributed representation; deep clustering; data augmentation; contrastive learning; artificial intelligence |
url | https://www.mdpi.com/1999-4893/16/3/158 |
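
The description above names Accuracy (ACC) and Normalized Mutual Information (NMI) as the evaluation metrics for the clustering framework. As a rough illustration of what those two scores measure, the following is a minimal Python sketch, not the authors' implementation. The function names are hypothetical, ACC is computed by brute-force search over one-to-one cluster-to-label mappings (reasonable only for a handful of clusters), and NMI is normalized by the arithmetic mean of the two entropies, which is an assumption because the record does not state which normalization the paper uses.

```python
from collections import Counter
from itertools import permutations
from math import log


def clustering_accuracy(true_labels, cluster_ids):
    """ACC: best match rate over one-to-one cluster-to-label mappings.

    Hypothetical helper; brute force, so only suitable for few clusters.
    """
    labels = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(labels):
        raise ValueError("sketch assumes no more clusters than labels")
    # How many items fall in each (cluster, label) cell.
    pair_counts = Counter(zip(cluster_ids, true_labels))
    best = 0
    for assigned in permutations(labels, len(clusters)):
        matched = sum(pair_counts[(c, l)] for c, l in zip(clusters, assigned))
        best = max(best, matched)
    return best / len(true_labels)


def _entropy(items):
    """Shannon entropy (natural log) of a sequence of discrete labels."""
    n = len(items)
    return -sum((c / n) * log(c / n) for c in Counter(items).values())


def normalized_mutual_information(true_labels, cluster_ids):
    """NMI with arithmetic-mean normalization (an assumption of this sketch)."""
    n = len(true_labels)
    label_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_ids)
    joint_counts = Counter(zip(cluster_ids, true_labels))
    mutual_info = sum(
        (nij / n) * log(n * nij / (cluster_counts[c] * label_counts[l]))
        for (c, l), nij in joint_counts.items()
    )
    mean_entropy = (_entropy(true_labels) + _entropy(cluster_ids)) / 2
    # Both partitions trivial (one group each): treat as perfect agreement.
    return mutual_info / mean_entropy if mean_entropy > 0 else 1.0


# Toy example: cluster ids are a relabeling of the true grades.
example_true = [0, 0, 1, 1, 2, 2]
example_pred = [1, 1, 0, 0, 2, 2]
example_acc = clustering_accuracy(example_true, example_pred)
example_nmi = normalized_mutual_information(example_true, example_pred)
```

Both scores ignore the arbitrary numbering of clusters, which is why a pure relabeling of the true groups still counts as a perfect result. For larger numbers of clusters, the permutation search would normally be replaced by an assignment solver such as `scipy.optimize.linear_sum_assignment`.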