Using the H2O Automatic Machine Learning Algorithms to Identify Predictors of Web-Based Medical Record Nonuse Among Patients in a Data-Rich Environment: Mixed Methods Study
Main Authors: | Yang Chen, Xuejiao Liu, Lei Gao, Miao Zhu, Ben-Chang Shia, Mingchih Chen, Linglong Ye, Lei Qin |
---|---|
Format: | Article |
Language: | English |
Published: | JMIR Publications, 2023-06-01 |
Series: | JMIR Medical Informatics |
Online Access: | https://medinform.jmir.org/2023/1/e41576 |
_version_ | 1797734036346503168 |
---|---|
author | Yang Chen, Xuejiao Liu, Lei Gao, Miao Zhu, Ben-Chang Shia, Mingchih Chen, Linglong Ye, Lei Qin |
collection | DOAJ |
description |
Background: With the advent of electronic storage of medical records and the internet, patients can access web-based medical records. This has facilitated doctor-patient communication and built trust between them. However, many patients avoid using web-based medical records despite their greater availability and readability.
Objective: On the basis of demographic and individual behavioral characteristics, this study explores the predictors of web-based medical record nonuse among patients.
Methods: Data were collected from the National Cancer Institute 2019 to 2020 Health Information National Trends Survey. First, based on the data-rich environment, the chi-square test (categorical variables) and 2-tailed t tests (continuous variables) were performed between the response variable and the variables in the questionnaire. The variables were initially screened according to the test results, and those that passed the test were selected for subsequent analysis. Second, participants were excluded from the study if any of the initially screened variables were missing. Third, the data obtained were modeled using 5 machine learning algorithms, namely, logistic regression, automatic generalized linear model, automatic random forest, automatic deep neural network, and automatic gradient boosting machine, to identify and investigate factors affecting web-based medical record nonuse. The aforementioned automatic machine learning algorithms were based on the R interface (R Foundation for Statistical Computing) of the H2O (H2O.ai) scalable machine learning platform. Finally, 80% of the data set was used as the training data, on which 5-fold cross-validation was applied to determine the hyperparameters of the 5 algorithms, and 20% of the data set was used as the test data for model comparison.
Results: Among the 9072 respondents, 5409 (59.62%) had no experience using web-based medical records. Using the 5 algorithms, 29 variables were identified as crucial predictors of nonuse of web-based medical records. These 29 variables comprised 6 (21%) sociodemographic variables (age, BMI, race, marital status, education, and income) and 23 (79%) variables related to individual lifestyles and behavioral habits (such as electronic and internet use, individuals’ health status, and their level of health concern). H2O’s automatic machine learning methods achieved high model accuracy. On the basis of performance on the validation data set, the optimal model was the automatic random forest, which had the highest area under the curve in the validation set (88.52%) and the test set (82.87%).
Conclusions: When monitoring web-based medical record use trends, research should focus on social factors such as age, education, BMI, and marital status, as well as personal lifestyle and behavioral habits, including smoking, use of electronic devices and the internet, patients’ personal health status, and their level of health concern. The use of electronic medical records can be targeted to specific patient groups, allowing more people to benefit from their usefulness. |
first_indexed | 2024-03-12T12:37:22Z |
format | Article |
id | doaj.art-7ad114fb73eb41d79b135202363d3564 |
institution | Directory of Open Access Journals |
issn | 2291-9694 |
language | English |
last_indexed | 2024-03-12T12:37:22Z |
publishDate | 2023-06-01 |
publisher | JMIR Publications |
record_format | Article |
series | JMIR Medical Informatics |
spelling | doaj.art-7ad114fb73eb41d79b135202363d3564; 2023-08-29T00:04:31Z; eng; JMIR Publications; JMIR Medical Informatics; ISSN 2291-9694; 2023-06-01; vol. 11; e41576; DOI 10.2196/41576; Yang Chen (https://orcid.org/0009-0007-9065-2617), Xuejiao Liu (https://orcid.org/0009-0003-5475-6719), Lei Gao (https://orcid.org/0009-0004-7027-811X), Miao Zhu (https://orcid.org/0000-0003-0535-1179), Ben-Chang Shia (https://orcid.org/0000-0003-2854-8361), Mingchih Chen (https://orcid.org/0000-0002-8278-0033), Linglong Ye (https://orcid.org/0000-0001-6637-7757), Lei Qin (https://orcid.org/0000-0001-9177-0753); https://medinform.jmir.org/2023/1/e41576 |
title | Using the H2O Automatic Machine Learning Algorithms to Identify Predictors of Web-Based Medical Record Nonuse Among Patients in a Data-Rich Environment: Mixed Methods Study |
url | https://medinform.jmir.org/2023/1/e41576 |
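
The evaluation protocol summarized in the description (an 80%/20% train/test split, 5-fold cross-validation on the training portion, and model comparison by area under the curve) can be sketched in plain Python. This is only an illustrative sketch and not the authors' code, which used the R interface of H2O. The function names `split_indices`, `k_fold`, and `auc`, the random seed, and the rounding of the test-set size are all assumptions made for the example.

```python
import random


def split_indices(n, test_frac=0.2, seed=0):
    """Shuffle row indices 0..n-1 and split them into (train, test) lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)  # assumption: test size rounded to nearest row
    return idx[n_test:], idx[:n_test]


def k_fold(train_idx, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [train_idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    for i in range(k):
        validate = folds[i]
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, validate


def auc(labels, scores):
    """Area under the ROC curve, computed as the share of (positive, negative)
    pairs that the scores rank correctly, with ties counted as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # 9072 respondents, as reported in the Results.
    train, test = split_indices(9072)
    print("train rows:", len(train), "test rows:", len(test))
    for fit, validate in k_fold(train):
        print("fit rows:", len(fit), "validation rows:", len(validate))
```

In such a setup, each candidate hyperparameter setting would be scored by its mean validation AUC across the 5 folds, and the held-out 20% would be used only once, to compare the final models.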