Unsupervised Machine Learning to Identify Depressive Subtypes
Objectives This study evaluated an unsupervised machine learning method, latent Dirichlet allocation (LDA), as a method for identifying subtypes of depression within symptom data. Methods Data from 18,314 depressed patients were used to create LDA models. The outcomes included future emergency prese...
Main Authors: | Benson Kung, Maurice Chiang, Gayan Perera, Megan Pritchard, Robert Stewart |
---|---|
Format: | Article |
Language: | English |
Published: | The Korean Society of Medical Informatics 2022-07-01 |
Series: | Healthcare Informatics Research |
Subjects: | psychiatry, depression, mental health, machine learning, medical informatics |
Online Access: | http://www.e-hir.org/upload/pdf/hir-2022-28-3-256.pdf |
_version_ | 1811179852275908608 |
---|---|
author | Benson Kung Maurice Chiang Gayan Perera Megan Pritchard Robert Stewart |
author_facet | Benson Kung Maurice Chiang Gayan Perera Megan Pritchard Robert Stewart |
author_sort | Benson Kung |
collection | DOAJ |
description | Objectives This study evaluated an unsupervised machine learning method, latent Dirichlet allocation (LDA), as a method for identifying subtypes of depression within symptom data. Methods Data from 18,314 depressed patients were used to create LDA models. The outcomes included future emergency presentations, crisis events, and behavioral problems. One model was chosen for further analysis based upon its potential as a clinically meaningful construct. The associations between patient groups created with the final LDA model and outcomes were tested. These steps were repeated with a commonly-used latent variable model to provide additional context to the LDA results. Results Five subtypes were identified using the final LDA model. Prior to the outcome analysis, the subtypes were labeled based upon the symptom distributions they produced: psychotic, severe, mild, agitated, and anergic-apathetic. The patient groups largely aligned with the outcome data. For example, the psychotic and severe subgroups were more likely to have emergency presentations (odds ratio [OR] = 1.29; 95% confidence interval [CI], 1.17–1.43 and OR = 1.16; 95% CI, 1.05–1.29, respectively), whereas these outcomes were less likely in the mild subgroup (OR = 0.86; 95% CI, 0.78–0.94). We found that the LDA subtypes were characterized by clusters of unique symptoms. This contrasted with the latent variable model subtypes, which were largely stratified by severity. Conclusions This study suggests that LDA can surface clinically meaningful, qualitative subtypes. Future work could be incorporated into studies concerning the biological bases of depression, thereby contributing to the development of new psychiatric therapeutics. |
first_indexed | 2024-04-11T06:40:30Z |
format | Article |
id | doaj.art-777adad5219c489bb377df3559883890 |
institution | Directory of Open Access Journals |
issn | 2093-3681 2093-369X |
language | English |
last_indexed | 2024-04-11T06:40:30Z |
publishDate | 2022-07-01 |
publisher | The Korean Society of Medical Informatics |
record_format | Article |
series | Healthcare Informatics Research |
spelling | doaj.art-777adad5219c489bb377df3559883890 2022-12-22T04:39:33Z eng The Korean Society of Medical Informatics Healthcare Informatics Research 2093-3681 2093-369X 2022-07-01 28 3 256 266 10.4258/hir.2022.28.3.256 1128 Unsupervised Machine Learning to Identify Depressive Subtypes Benson Kung 0 Maurice Chiang 1 Gayan Perera 2 Megan Pritchard 3 Robert Stewart 4 Carbon Health, San Mateo, CA, USA Carbon Health, San Mateo, CA, USA Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK Objectives This study evaluated an unsupervised machine learning method, latent Dirichlet allocation (LDA), as a method for identifying subtypes of depression within symptom data. Methods Data from 18,314 depressed patients were used to create LDA models. The outcomes included future emergency presentations, crisis events, and behavioral problems. One model was chosen for further analysis based upon its potential as a clinically meaningful construct. The associations between patient groups created with the final LDA model and outcomes were tested. These steps were repeated with a commonly-used latent variable model to provide additional context to the LDA results. Results Five subtypes were identified using the final LDA model. Prior to the outcome analysis, the subtypes were labeled based upon the symptom distributions they produced: psychotic, severe, mild, agitated, and anergic-apathetic. The patient groups largely aligned with the outcome data. For example, the psychotic and severe subgroups were more likely to have emergency presentations (odds ratio [OR] = 1.29; 95% confidence interval [CI], 1.17–1.43 and OR = 1.16; 95% CI, 1.05–1.29, respectively), whereas these outcomes were less likely in the mild subgroup (OR = 0.86; 95% CI, 0.78–0.94). We found that the LDA subtypes were characterized by clusters of unique symptoms. This contrasted with the latent variable model subtypes, which were largely stratified by severity. Conclusions This study suggests that LDA can surface clinically meaningful, qualitative subtypes. Future work could be incorporated into studies concerning the biological bases of depression, thereby contributing to the development of new psychiatric therapeutics. http://www.e-hir.org/upload/pdf/hir-2022-28-3-256.pdf psychiatry depression mental health machine learning medical informatics |
spellingShingle | Benson Kung Maurice Chiang Gayan Perera Megan Pritchard Robert Stewart Unsupervised Machine Learning to Identify Depressive Subtypes Healthcare Informatics Research psychiatry depression mental health machine learning medical informatics |
title | Unsupervised Machine Learning to Identify Depressive Subtypes |
title_sort | unsupervised machine learning to identify depressive subtypes |
topic | psychiatry depression mental health machine learning medical informatics |
url | http://www.e-hir.org/upload/pdf/hir-2022-28-3-256.pdf |
work_keys_str_mv | AT bensonkung unsupervisedmachinelearningtoidentifydepressivesubtypes AT mauricechiang unsupervisedmachinelearningtoidentifydepressivesubtypes AT gayanperera unsupervisedmachinelearningtoidentifydepressivesubtypes AT meganpritchard unsupervisedmachinelearningtoidentifydepressivesubtypes AT robertstewart unsupervisedmachinelearningtoidentifydepressivesubtypes |
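
The abstract reports subgroup associations as odds ratios with 95% confidence intervals. As a rough illustration of how such a figure is read, the sketch below computes an unadjusted odds ratio and a Wald-type interval from a 2x2 table. This is an assumption made for illustration only: the record does not say how the authors estimated their odds ratios (they may well come from adjusted regression models), the function name `odds_ratio_ci` is hypothetical, and the counts in the usage example are invented and are not data from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: subgroup members with the outcome
    b: subgroup members without the outcome
    c: non-members with the outcome
    d: non-members without the outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
    print(f"OR = {or_:.2f}; 95% CI, {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1, as in the three examples quoted in the abstract, indicates an association in the stated direction at the conventional 5% level; an interval that spans 1 does not.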