Datathons and Software to Promote Reproducible Research

Background: Datathons facilitate collaboration between clinicians, statisticians, and data scientists in order to answer important clinical questions. Previous datathons have resulted in numerous publications of interest to the critical care community and serve as a viable model for interdisciplinar...

Bibliographic Details
Main Authors: Celi, Leo Anthony G.; Lokhandwala, Sharukh; Montgomery, Robert; Moses, Christopher A; Pollard, Tom Joseph; Stretch, Robert; Spitz, Daniel; Naumann, Tristan Josef
Other Authors: Massachusetts Institute of Technology. Institute for Medical Engineering & Science
Format: Article
Language: en_US
Published: Gunther Eysenbach, JMIR 2017
Online Access: http://hdl.handle.net/1721.1/107975
https://orcid.org/0000-0002-2573-388X
https://orcid.org/0000-0003-2150-1747
https://orcid.org/0000-0002-5676-7898
_version_ 1826217804681969664
author Celi, Leo Anthony G.
Lokhandwala, Sharukh
Montgomery, Robert
Moses, Christopher A
Pollard, Tom Joseph
Stretch, Robert
Spitz, Daniel
Naumann, Tristan Josef
author2 Massachusetts Institute of Technology. Institute for Medical Engineering & Science
author_sort Celi, Leo Anthony G.
collection MIT
description Background: Datathons facilitate collaboration between clinicians, statisticians, and data scientists in order to answer important clinical questions. Previous datathons have resulted in numerous publications of interest to the critical care community and serve as a viable model for interdisciplinary collaboration. Objective: We report on an open-source software called Chatto that was created by members of our group, in the context of the second international Critical Care Datathon, held in September 2015. Methods: Datathon participants formed teams to discuss potential research questions and the methods required to address them. They were provided with the Chatto suite of tools to facilitate their teamwork. Each multidisciplinary team spent the next 2 days with clinicians working alongside data scientists to write code, extract and analyze data, and reformulate their queries in real time as needed. All projects were then presented on the last day of the datathon to a panel of judges that consisted of clinicians and scientists. Results: Use of Chatto was particularly effective in the datathon setting, enabling teams to reduce the time spent configuring their research environments to just a few minutes—a process that would normally take hours to days. Chatto continued to serve as a useful research tool after the conclusion of the datathon. Conclusions: This suite of tools fulfills two purposes: (1) facilitation of interdisciplinary teamwork through archiving and version control of datasets, analytical code, and team discussions, and (2) advancement of research reproducibility by functioning postpublication as an online environment in which independent investigators can rerun or modify analyses with relative ease. With the introduction of Chatto, we hope to solve a variety of challenges presented by collaborative data mining projects while improving research reproducibility.
first_indexed 2024-09-23T17:09:25Z
format Article
id mit-1721.1/107975
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T17:09:25Z
publishDate 2017
publisher Gunther Eysenbach, JMIR
record_format dspace
spelling mit-1721.1/107975 2022-10-03T10:50:12Z Datathons and Software to Promote Reproducible Research Celi, Leo Anthony G. Lokhandwala, Sharukh Montgomery, Robert Moses, Christopher A Pollard, Tom Joseph Stretch, Robert Spitz, Daniel Naumann, Tristan Josef Massachusetts Institute of Technology. Institute for Medical Engineering & Science Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science MIT Critical Data (Laboratory) Celi, Leo Anthony G. Lokhandwala, Sharukh Montgomery, Robert Moses, Christopher A Naumann, Tristan Pollard, Tom Joseph Stretch, Robert Spitz, Daniel 2017-04-07T20:00:04Z 2017-04-07T20:00:04Z 2016-08 2016-08 Article http://purl.org/eprint/type/JournalArticle 1438-8871 http://hdl.handle.net/1721.1/107975 Celi, Leo Anthony et al. “Datathons and Software to Promote Reproducible Research.” Journal of Medical Internet Research 18.8 (2016): e230. https://orcid.org/0000-0002-2573-388X https://orcid.org/0000-0003-2150-1747 https://orcid.org/0000-0002-5676-7898 en_US http://dx.doi.org/10.2196/jmir.6365 Journal of Medical Internet Research Creative Commons Attribution 2.0 License http://www.creativecommons.org/licenses/by/2.0/ application/pdf Gunther Eysenbach, JMIR JMIR Publications
title Datathons and Software to Promote Reproducible Research
url http://hdl.handle.net/1721.1/107975
https://orcid.org/0000-0002-2573-388X
https://orcid.org/0000-0003-2150-1747
https://orcid.org/0000-0002-5676-7898