Unsupervised Machine Learning to Identify Depressive Subtypes

Objectives: This study evaluated latent Dirichlet allocation (LDA), an unsupervised machine learning method, for identifying subtypes of depression within symptom data. Methods: Data from 18,314 depressed patients were used to create LDA models. The outcomes included future emergency prese...

Bibliographic Details
Main Authors: Benson Kung, Maurice Chiang, Gayan Perera, Megan Pritchard, Robert Stewart
Format: Article
Language: English
Published: The Korean Society of Medical Informatics 2022-07-01
Series: Healthcare Informatics Research
Subjects: psychiatry, depression, mental health, machine learning, medical informatics
Online Access: http://www.e-hir.org/upload/pdf/hir-2022-28-3-256.pdf
author Benson Kung
Maurice Chiang
Gayan Perera
Megan Pritchard
Robert Stewart
collection DOAJ
description Objectives: This study evaluated latent Dirichlet allocation (LDA), an unsupervised machine learning method, for identifying subtypes of depression within symptom data. Methods: Data from 18,314 depressed patients were used to create LDA models. The outcomes included future emergency presentations, crisis events, and behavioral problems. One model was chosen for further analysis based on its potential as a clinically meaningful construct. The associations between the patient groups created with the final LDA model and the outcomes were then tested. These steps were repeated with a commonly used latent variable model to provide additional context for the LDA results. Results: Five subtypes were identified using the final LDA model. Prior to the outcome analysis, the subtypes were labeled according to the symptom distributions they produced: psychotic, severe, mild, agitated, and anergic-apathetic. The patient groups largely aligned with the outcome data. For example, the psychotic and severe subgroups were more likely to have emergency presentations (odds ratio [OR] = 1.29; 95% confidence interval [CI], 1.17–1.43 and OR = 1.16; 95% CI, 1.05–1.29, respectively), whereas these outcomes were less likely in the mild subgroup (OR = 0.86; 95% CI, 0.78–0.94). We found that the LDA subtypes were characterized by clusters of unique symptoms. This contrasted with the latent variable model subtypes, which were largely stratified by severity. Conclusions: This study suggests that LDA can surface clinically meaningful, qualitative subtypes. Future work could be incorporated into studies concerning the biological bases of depression, thereby contributing to the development of new psychiatric therapeutics.
first_indexed 2024-04-11T06:40:30Z
format Article
id doaj.art-777adad5219c489bb377df3559883890
institution Directory Open Access Journal
issn 2093-3681
2093-369X
language English
last_indexed 2024-04-11T06:40:30Z
publishDate 2022-07-01
publisher The Korean Society of Medical Informatics
record_format Article
series Healthcare Informatics Research
spelling doaj.art-777adad5219c489bb377df3559883890
2022-12-22T04:39:33Z
eng
The Korean Society of Medical Informatics
Healthcare Informatics Research
2093-3681
2093-369X
2022-07-01
vol. 28, no. 3, pp. 256–266
doi:10.4258/hir.2022.28.3.256
1128
Benson Kung (Carbon Health, San Mateo, CA, USA)
Maurice Chiang (Carbon Health, San Mateo, CA, USA)
Gayan Perera (Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK)
Megan Pritchard (Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK)
Robert Stewart (Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK)
title Unsupervised Machine Learning to Identify Depressive Subtypes
topic psychiatry
depression
mental health
machine learning
medical informatics
url http://www.e-hir.org/upload/pdf/hir-2022-28-3-256.pdf
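
The odds ratios and 95% confidence intervals quoted in the description can be read with a short calculation. The record does not state how the authors estimated them, so the sketch below is only an illustration: it uses the standard unadjusted 2x2 Wald formula, and the function names and patient counts are hypothetical, not taken from the study. The three intervals at the end are the ones reported in the abstract.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: in subgroup, outcome present     b: in subgroup, outcome absent
    c: not in subgroup, outcome present d: not in subgroup, outcome absent
    z=1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def direction(lower, upper):
    """Classify an interval by whether it lies above, below, or across 1."""
    if lower > 1:
        return "more likely"
    if upper < 1:
        return "less likely"
    return "no clear difference"


# Hypothetical counts, for illustration only (not from the study).
example_or, example_lo, example_hi = odds_ratio_ci(300, 700, 250, 750)
print(f"example OR = {example_or:.2f} ({example_lo:.2f} to {example_hi:.2f})")

# Intervals reported in the abstract for emergency presentations.
reported = {
    "psychotic": (1.17, 1.43),
    "severe": (1.05, 1.29),
    "mild": (0.78, 0.94),
}
readings = {name: direction(lo, hi) for name, (lo, hi) in reported.items()}
for name, reading in readings.items():
    print(f"{name}: {reading}")
```

An interval that lies entirely above 1 indicates the outcome was more likely in that subgroup, and one entirely below 1 indicates it was less likely, which matches how the abstract describes the psychotic, severe, and mild subgroups.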