Integrating oversampling and ensemble-based machine learning techniques for an imbalanced dataset in dyslexia screening tests

Developmental Dyslexia is a learning disorder often discovered in school-aged children who face difficulties while reading or spelling words even though they may have average or above-average levels of intelligence. This ultimately results in anger, frustration, low self-esteem, and other negative f...

Bibliographic Details
Main Authors:	Shahriar Kaisar, Abdullahi Chowdhury
Format:	Article
Language:	English
Published:	Elsevier 2022-12-01
Series:	ICT Express
Subjects:	Dyslexia; Imbalanced data; Ensemble technique; Machine learning; Oversampling
Online Access:	http://www.sciencedirect.com/science/article/pii/S2405959522000327

_version_	1828281733309530112
author	Shahriar Kaisar; Abdullahi Chowdhury
author_facet	Shahriar Kaisar; Abdullahi Chowdhury
author_sort	Shahriar Kaisar
collection	DOAJ
description	Developmental Dyslexia is a learning disorder often discovered in school-aged children who face difficulties while reading or spelling words even though they may have average or above-average levels of intelligence. This ultimately results in anger, frustration, low self-esteem, and other negative feelings. Early detection of Dyslexia can be highly beneficial for dyslexic children, as their learning needs can then be properly addressed. Researchers have used several testing techniques for early discovery, with data collected from reading and writing tests, online games, Magnetic resonance imaging (MRI) and Electroencephalography (EEG) scans, and picture and video recordings. Several machine learning techniques have also been used in this regard recently. However, existing works did not focus on the problem of the imbalanced dataset, where the percentage of dyslexic participants is much lower than that of non-dyslexic participants, which is expected to be the case for pre-screening among a random population. This paper addresses the imbalanced dataset obtained from dyslexia pre-screening tests and proposes an oversampling and ensemble-based machine learning technique for the detection of Dyslexia. Simulation results show that the proposed approach improves the detection accuracy of the minority class, i.e., dyslexic patients, from 80.61% to 83.52%.
first_indexed	2024-04-13T08:20:01Z
format	Article
id	doaj.art-3873af79044f4d4c98b3023cd55f5873
institution	Directory Open Access Journal
issn	2405-9595
language	English
last_indexed	2024-04-13T08:20:01Z
publishDate	2022-12-01
publisher	Elsevier
record_format	Article
series	ICT Express
spelling	doaj.art-3873af79044f4d4c98b3023cd55f5873 | 2022-12-22T02:54:41Z | eng | Elsevier | ICT Express | 2405-9595 | 2022-12-01 | vol. 8, no. 4, pp. 563-568 | Integrating oversampling and ensemble-based machine learning techniques for an imbalanced dataset in dyslexia screening tests | Shahriar Kaisar (Department of Information Systems and Business Analytics, RMIT University, Australia; corresponding author) | Abdullahi Chowdhury (Faculty of Engineering, Computer and Mathematical Sciences, University of Adelaide, Australia) | http://www.sciencedirect.com/science/article/pii/S2405959522000327 | Dyslexia; Imbalanced data; Ensemble technique; Machine learning; Oversampling
spellingShingle	Shahriar Kaisar; Abdullahi Chowdhury | Integrating oversampling and ensemble-based machine learning techniques for an imbalanced dataset in dyslexia screening tests | ICT Express | Dyslexia; Imbalanced data; Ensemble technique; Machine learning; Oversampling
title	Integrating oversampling and ensemble-based machine learning techniques for an imbalanced dataset in dyslexia screening tests
title_sort	integrating oversampling and ensemble based machine learning techniques for an imbalanced dataset in dyslexia screening tests
topic	Dyslexia; Imbalanced data; Ensemble technique; Machine learning; Oversampling
url	http://www.sciencedirect.com/science/article/pii/S2405959522000327
work_keys_str_mv	AT shahriarkaisar integratingoversamplingandensemblebasedmachinelearningtechniquesforanimbalanceddatasetindyslexiascreeningtests AT abdullahichowdhury integratingoversamplingandensemblebasedmachinelearningtechniquesforanimbalanceddatasetindyslexiascreeningtests
