Using the H2O Automatic Machine Learning Algorithms to Identify Predictors of Web-Based Medical Record Nonuse Among Patients in a Data-Rich Environment: Mixed Methods Study

Bibliographic Details
Main Authors: Yang Chen, Xuejiao Liu, Lei Gao, Miao Zhu, Ben-Chang Shia, Mingchih Chen, Linglong Ye, Lei Qin
Format: Article
Language: English
Published: JMIR Publications 2023-06-01
Series: JMIR Medical Informatics
Online Access: https://medinform.jmir.org/2023/1/e41576
collection DOAJ
description Background: With the advent of electronic storage of medical records and the internet, patients can access web-based medical records. This has facilitated doctor-patient communication and built trust between them. However, many patients avoid using web-based medical records despite their greater availability and readability.
Objective: On the basis of demographic and individual behavioral characteristics, this study explores the predictors of web-based medical record nonuse among patients.
Methods: Data were collected from the National Cancer Institute 2019 to 2020 Health Information National Trends Survey. First, based on the data-rich environment, the chi-square test (categorical variables) and 2-tailed t tests (continuous variables) were performed on the response variables and the variables in the questionnaire. According to the test results, the variables were initially screened, and those that passed the test were selected for subsequent analysis. Second, participants were excluded from the study if any of the initially screened variables were missing. Third, the data obtained were modeled using 5 machine learning algorithms, namely, logistic regression, automatic generalized linear model, automatic random forest, automatic deep neural network, and automatic gradient boosting machine, to identify and investigate factors affecting web-based medical record nonuse. The aforementioned automatic machine learning algorithms were based on the R interface (R Foundation for Statistical Computing) of the H2O (H2O.ai) scalable machine learning platform. Finally, 5-fold cross-validation was adopted for 80% of the data set, which was used as the training data to determine hyperparameters of 5 algorithms, and 20% of the data set was used as the test data for model comparison.
Results: Among the 9072 respondents, 5409 (59.62%) had no experience using web-based medical records. Using the 5 algorithms, 29 variables were identified as crucial predictors of nonuse of web-based medical records. These 29 variables comprised 6 (21%) sociodemographic variables (age, BMI, race, marital status, education, and income) and 23 (79%) variables related to individual lifestyles and behavioral habits (such as electronic and internet use, individuals’ health status and their level of health concern, etc). H2O’s automatic machine learning methods have a high model accuracy. On the basis of the performance of the validation data set, the optimal model was the automatic random forest with the highest area under the curve in the validation set (88.52%) and the test set (82.87%).
Conclusions: When monitoring web-based medical record use trends, research should focus on social factors such as age, education, BMI, and marital status, as well as personal lifestyle and behavioral habits, including smoking, use of electronic devices and the internet, patients’ personal health status, and their level of health concern. The use of electronic medical records can be targeted to specific patient groups, allowing more people to benefit from their usefulness.
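The screening and validation steps described in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the study used the R interface of H2O, whereas the example below is plain Python with no external libraries. The function names, the 0.05 significance level, the contingency-table counts, and the sample size of 1000 are all assumptions made for illustration only.

```python
import random

# Standard chi-square critical values at the 0.05 level, keyed by degrees of freedom.
# The 0.05 threshold is an assumption; the abstract does not state the level used.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def split_train_test(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and return (train_indices, test_indices)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, validation


# Hypothetical 2x2 table: rows = levels of a categorical predictor,
# columns = record nonuse vs use. The counts are invented.
table = [[10, 20], [20, 10]]
stat, dof = chi_square_statistic(table)
keep_variable = stat > CRITICAL_05[dof]  # variable passes the initial screen

# Hypothetical sample of 1000 complete cases: 80% training, 20% test.
train_idx, test_idx = split_train_test(1000)
folds = list(k_fold_indices(train_idx, k=5))

print(f"chi-square = {stat:.3f}, dof = {dof}, keep = {keep_variable}")
print(f"train = {len(train_idx)}, test = {len(test_idx)}, folds = {len(folds)}")
```

In this sketch the test set is held out before any fold is formed, so hyperparameters chosen by cross-validation never see the data used for the final model comparison, which matches the order of steps given in the abstract.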
first_indexed 2024-03-12T12:37:22Z
id doaj.art-7ad114fb73eb41d79b135202363d3564
institution Directory Open Access Journal
issn 2291-9694
last_indexed 2024-03-12T12:37:22Z
spelling doaj.art-7ad114fb73eb41d79b135202363d3564 2023-08-29T00:04:31Z eng JMIR Publications JMIR Medical Informatics 2291-9694 2023-06-01 11 e41576 10.2196/41576
Yang Chen https://orcid.org/0009-0007-9065-2617
Xuejiao Liu https://orcid.org/0009-0003-5475-6719
Lei Gao https://orcid.org/0009-0004-7027-811X
Miao Zhu https://orcid.org/0000-0003-0535-1179
Ben-Chang Shia https://orcid.org/0000-0003-2854-8361
Mingchih Chen https://orcid.org/0000-0002-8278-0033
Linglong Ye https://orcid.org/0000-0001-6637-7757
Lei Qin https://orcid.org/0000-0001-9177-0753