Integrating oversampling and ensemble-based machine learning techniques for an imbalanced dataset in dyslexia screening tests
Developmental Dyslexia is a learning disorder often discovered in school-aged children who face difficulties while reading or spelling words even though they may have average or above-average levels of intelligence. This ultimately results in anger, frustration, low self-esteem, and other negative f...
Main Authors: | Shahriar Kaisar, Abdullahi Chowdhury |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2022-12-01 |
Series: | ICT Express |
Subjects: | Dyslexia; Imbalanced data; Ensemble technique; Machine learning; Oversampling |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2405959522000327 |
_version_ | 1828281733309530112 |
---|---|
author | Shahriar Kaisar; Abdullahi Chowdhury |
author_facet | Shahriar Kaisar; Abdullahi Chowdhury |
author_sort | Shahriar Kaisar |
collection | DOAJ |
description | Developmental Dyslexia is a learning disorder often discovered in school-aged children who have difficulty reading or spelling words even though they may have average or above-average intelligence. This ultimately results in anger, frustration, low self-esteem, and other negative feelings. Early detection of Dyslexia can be highly beneficial for dyslexic children because their learning needs can be properly addressed. Researchers have used several testing techniques for early discovery, with data collected from reading and writing tests, online games, Magnetic Resonance Imaging (MRI) and Electroencephalography (EEG) scans, and picture and video recordings. Several machine learning techniques have also been used for this purpose recently. However, existing works did not focus on the problem of the imbalanced dataset, where the percentage of dyslexic participants is much lower than that of non-dyslexic participants, which is expected to be the case for pre-screening among a random population. This paper addresses the imbalanced dataset obtained from dyslexia pre-screening tests and proposes an oversampling and ensemble-based machine learning technique for the detection of Dyslexia. Simulation results show that the proposed approach improves the detection accuracy of the minority class, i.e., dyslexic patients, from 80.61% to 83.52%. |
first_indexed | 2024-04-13T08:20:01Z |
format | Article |
id | doaj.art-3873af79044f4d4c98b3023cd55f5873 |
institution | Directory Open Access Journal |
issn | 2405-9595 |
language | English |
last_indexed | 2024-04-13T08:20:01Z |
publishDate | 2022-12-01 |
publisher | Elsevier |
record_format | Article |
series | ICT Express |
spelling | doaj.art-3873af79044f4d4c98b3023cd55f5873; 2022-12-22T02:54:41Z; eng; Elsevier; ICT Express; 2405-9595; 2022-12-01; 8(4), 563-568; Integrating oversampling and ensemble-based machine learning techniques for an imbalanced dataset in dyslexia screening tests; Shahriar Kaisar (Department of Information Systems and Business Analytics, RMIT University, Australia; corresponding author); Abdullahi Chowdhury (Faculty of Engineering, Computer and Mathematical Sciences, University of Adelaide, Australia); http://www.sciencedirect.com/science/article/pii/S2405959522000327; Dyslexia; Imbalanced data; Ensemble technique; Machine learning; Oversampling |
title | Integrating oversampling and ensemble-based machine learning techniques for an imbalanced dataset in dyslexia screening tests |
topic | Dyslexia; Imbalanced data; Ensemble technique; Machine learning; Oversampling |
url | http://www.sciencedirect.com/science/article/pii/S2405959522000327 |
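
The description above combines oversampling with an ensemble classifier and reports accuracy on the minority (dyslexic) class. The record does not say which oversampling method or which ensemble the authors used, so the following is only a minimal sketch under stated assumptions: plain random oversampling and a bagged ensemble of decision stumps stand in for the paper's techniques, and the one-feature synthetic data is hypothetical. All function names are illustrative.

```python
import random
from collections import Counter

def random_oversample(X, y, rng):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

def fit_stump(X, y):
    """Exhaustively search for the (feature, threshold, sign) rule with the
    most correct training predictions; labels are 0 or 1."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for sign in (1, -1):
                correct = sum(
                    (1 if sign * (x[f] - t) >= 0 else 0) == lab
                    for x, lab in zip(X, y)
                )
                if best is None or correct > best[0]:
                    best = (correct, f, t, sign)
    return best[1:]

def predict_stump(stump, x):
    f, t, sign = stump
    return 1 if sign * (x[f] - t) >= 0 else 0

def fit_bagged_stumps(X, y, n_models, rng):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_ensemble(models, x):
    """Majority vote over the stumps; a tie is resolved as class 0."""
    votes = sum(predict_stump(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0

def recall(y_true, y_pred, label):
    """Fraction of rows with the given true label that were predicted as it."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    return hits / sum(1 for t in y_true if t == label)

# Hypothetical imbalanced data: 90 rows of class 0, 10 rows of class 1.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0)] for _ in range(90)] + [[rng.gauss(2.0, 1.0)] for _ in range(10)]
y = [0] * 90 + [1] * 10

X_bal, y_bal = random_oversample(X, y, rng)         # balance the classes first
models = fit_bagged_stumps(X_bal, y_bal, 11, rng)   # then train the ensemble
y_pred = [predict_ensemble(models, x) for x in X]
minority_recall = recall(y, y_pred, 1)              # minority-class detection rate
```

In a real evaluation the oversampling would be applied only to the training split, so that duplicated minority rows never leak into the held-out data used to measure minority-class recall.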