Multi-Class Multi-Level Classification of Mental Health Disorders Based on Textual Data from Social Media

Mental health disorders pose a significant global public health challenge. Social media data provides insights into these conditions. Analysing text can help identify indications of mental health disorders through text-based analysis. However, despite the large number of studies on the analysis of...


Bibliographic Details
Main Authors: Abi Nizar Sutranggono, Riyanarto Sarno, Imam Ghozali
Format: Article
Language: English
Published: UUM Press 2024-01-01
Series: Journal of ICT
Subjects: MCML classification; mental health disorders; Reddit; text mining; transfer learning
Online Access: https://e-journal.uum.edu.my/index.php/jict/article/view/19042
_version_ 1797339204759322624
author Abi Nizar Sutranggono
Riyanarto Sarno
Imam Ghozali
author_sort Abi Nizar Sutranggono
collection DOAJ
description Mental health disorders pose a significant global public health challenge. Social media data provides insights into these conditions, and analysing the text that users post can help identify indications of mental health disorders. However, despite the large number of studies on mental health disorders, the predominant algorithm in the existing literature is the Multi-Class Single-Level (MCSL) classification algorithm, which is often used for simple classification tasks involving a limited number of classes. Typically, these classes are binary, representing either an unhealthy or a healthy mental state. This paper uses English text data from Reddit to classify mental health disorders. The Multi-Class Multi-Level (MCML) classification algorithm was applied to perform more detailed classification and address this limited scope, using machine learning, deep learning, and transfer learning approaches. Two different pre-processing scenarios were proposed to handle unstructured text data, one of the most challenging aspects of classifying text from social media. The experimental results show that the MCML classification algorithm successfully performs detailed classification and produces promising results at each classification level. The proposed pre-processing scenario influences the performance of each classifier and improves classification accuracy. The best accuracy was obtained by the Robustly Optimised BERT Pre-training Approach (RoBERTa) classifier, with 0.98 at level 1 and 0.85 at level 2. Overall, the results suggest that the MCML classification algorithm can serve as a benchmark for text-based early detection of mental health disorders.
first_indexed 2024-03-08T09:42:34Z
format Article
id doaj.art-3e18735b3ace4a21b81ebd880041912a
institution Directory Open Access Journal
issn 1675-414X
2180-3862
language English
last_indexed 2024-03-08T09:42:34Z
publishDate 2024-01-01
publisher UUM Press
record_format Article
series Journal of ICT
spelling doaj.art-3e18735b3ace4a21b81ebd880041912a | 2024-01-30T01:36:26Z | eng | UUM Press | Journal of ICT | 1675-414X | 2180-3862 | 2024-01-01 | 23 | 1 | 10.32890/jict2024.23.1.4 | Multi-Class Multi-Level Classification of Mental Health Disorders Based on Textual Data from Social Media | Abi Nizar Sutranggono (0), Riyanarto Sarno (1), Imam Ghozali (2) | (0), (1), (2): Department of Informatics, Institut Teknologi Sepuluh Nopember, Indonesia | https://e-journal.uum.edu.my/index.php/jict/article/view/19042 | MCML classification; mental health disorders; Reddit; text mining; transfer learning
title Multi-Class Multi-Level Classification of Mental Health Disorders Based on Textual Data from Social Media
title_sort multi class multi level classification of mental health disorders based on textual data from social media
topic MCML classification
mental health disorders
Reddit
text mining
transfer learning
url https://e-journal.uum.edu.my/index.php/jict/article/view/19042