17 Data Loofah: A web-based app for efficiently identifying erroneous data
OBJECTIVES/GOALS: The goal was to create and deploy an intuitive, easy-to-use tool that clinical investigators can apply to their data to identify erroneous or inconsistent data entries. Investigators can then correct any errors prior to sharing the data with their statistician for analysis. METHODS...
Main Authors: | Jeffrey R. Fine, Sandra L. Taylor |
---|---|
Format: | Article |
Language: | English |
Published: | Cambridge University Press 2023-04-01 |
Series: | Journal of Clinical and Translational Science |
Online Access: | https://www.cambridge.org/core/product/identifier/S2059866123001176/type/journal_article |
_version_ | 1797840488763490304 |
---|---|
author | Jeffrey R. Fine; Sandra L. Taylor |
author_facet | Jeffrey R. Fine; Sandra L. Taylor |
author_sort | Jeffrey R. Fine |
collection | DOAJ |
description | OBJECTIVES/GOALS: The goal was to create and deploy an intuitive, easy-to-use tool that clinical investigators can apply to their data to identify erroneous or inconsistent data entries. Investigators can then correct any errors prior to sharing the data with their statistician for analysis. METHODS/STUDY POPULATION: We developed an interactive Shiny app, the Data Loofah, using RStudio that researchers or data analysts can use to examine data. After an investigator uploads data, the app reports which variables are numeric and which are categorical. The mean, standard deviation, median, 25th and 75th percentiles, range, and number of missing values are reported for numeric variables. Counts and percentages are summarized for categorical variables. Graphical displays further aid in identifying errors. Access to the Data Loofah is through a secure, university-maintained website with access restricted to university personnel. Supporting materials consisting of step-by-step instructional handouts and videos were developed to assist investigators in the use of the app. RESULTS/ANTICIPATED RESULTS: We will integrate use of the Data Loofah into our Clinical and Translational Science Program’s biostatistics consultative practice. Investigators will use the Data Loofah to pre-screen their data prior to sending it to a statistician, identify errors and inconsistencies, and make the necessary corrections. Statisticians will also use the Data Loofah to review data with investigators prior to starting analyses. Through use of this app, investigators are expected to develop a better understanding of their own data and, more generally, of the requirements for preparing data for statistical analysis. Most significantly, regular use of the Data Loofah is expected to result in higher-quality data and more efficient use of statistician resources due to reduced effort for data cleaning. DISCUSSION/SIGNIFICANCE: Data cleaning is a time-consuming task, and finding data errors can be difficult for data analysts unfamiliar with the clinical variables under study. Further, failure to identify data errors can lead to erroneous results. By facilitating identification of data errors by clinical investigators, the Data Loofah will improve and enhance research output. |
first_indexed | 2024-04-09T16:16:09Z |
format | Article |
id | doaj.art-90c66be4e56348078bdf7e09f0ad4411 |
institution | Directory Open Access Journal |
issn | 2059-8661 |
language | English |
last_indexed | 2024-04-09T16:16:09Z |
publishDate | 2023-04-01 |
publisher | Cambridge University Press |
record_format | Article |
series | Journal of Clinical and Translational Science |
spelling | doaj.art-90c66be4e56348078bdf7e09f0ad4411; 2023-04-24T05:55:55Z; eng; Cambridge University Press; Journal of Clinical and Translational Science; 2059-8661; 2023-04-01; 7; 5; 5; 10.1017/cts.2023.117; 17 Data Loofah: A web-based app for efficiently identifying erroneous data; Jeffrey R. Fine (University of California, Davis); Sandra L. Taylor (University of California, Davis); https://www.cambridge.org/core/product/identifier/S2059866123001176/type/journal_article |
title | 17 Data Loofah: A web-based app for efficiently identifying erroneous data |
url | https://www.cambridge.org/core/product/identifier/S2059866123001176/type/journal_article |
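The per-variable screening described in the METHODS portion of the abstract (classify each variable as numeric or categorical, then report summary statistics or counts and percentages, plus the number of missing values) can be illustrated with a small sketch. This is not the authors' code: the Data Loofah is an R Shiny app, whereas the sketch below uses Python's standard library. The function names `is_number` and `summarize_column`, the set of missing-value markers, the quartile method, and the sample data are all assumptions made for illustration.

```python
import math
import statistics
from collections import Counter

# Assumed markers for a missing entry; the abstract does not state the app's own rules.
MISSING = {"", "na", "nan", "null"}


def is_number(text):
    """Return True if text parses as a finite number."""
    try:
        return math.isfinite(float(text))
    except ValueError:
        return False


def summarize_column(values):
    """Summarize one column supplied as a list of raw strings (hypothetical helper)."""
    present = [v.strip() for v in values if v.strip().lower() not in MISSING]
    summary = {"n_missing": len(values) - len(present)}

    if present and all(is_number(v) for v in present):
        nums = [float(v) for v in present]
        # quantiles() and stdev() are only computed with at least two observations.
        if len(nums) >= 2:
            q1, _, q3 = statistics.quantiles(nums, n=4, method="inclusive")
            sd = statistics.stdev(nums)
        else:
            q1 = q3 = nums[0]
            sd = None
        summary.update(
            kind="numeric",
            mean=statistics.fmean(nums),
            sd=sd,
            median=statistics.median(nums),
            q1=q1,
            q3=q3,
            range=(min(nums), max(nums)),
        )
    else:
        # Any non-numeric entry makes the whole column categorical.
        counts = Counter(present)
        total = len(present)
        summary.update(
            kind="categorical",
            counts=dict(counts),
            percent={k: 100 * c / total for k, c in counts.items()},
        )
    return summary


# Made-up example data: an implausible age and inconsistent coding of sex.
table = {
    "age": ["34", "41", "29", "410", "NA"],
    "sex": ["M", "F", "F", "m", ""],
}
for name, column in table.items():
    print(name, summarize_column(column))
```

In this made-up table, the range for `age` exposes the implausible value 410, and the counts for `sex` show `M` and `m` as separate categories. These are the kinds of entry errors an investigator could spot from such a summary before sending the data to a statistician.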