tableone: An open source Python package for producing summary statistics for research papers

Objectives: In quantitative research, understanding basic parameters of the study population is key for interpretation of the results. As a result, it is typical for the first table (“Table 1”) of a research paper to include summary statistics for the study data. Our objectives are 2-fold. First, we...

Bibliographic Details
Main Authors: Pollard, Tom Joseph, Johnson, Alistair Edward William, Raffa, Jesse D, Mark, Roger G
Other Authors: Massachusetts Institute of Technology. Institute for Medical Engineering & Science
Format: Article
Language: English
Published: Oxford University Press (OUP) 2020
Online Access: https://hdl.handle.net/1721.1/126562
_version_ 1826211056497721344
author Pollard, Tom Joseph
Johnson, Alistair Edward William
Raffa, Jesse D
Mark, Roger G
author2 Massachusetts Institute of Technology. Institute for Medical Engineering & Science
author_facet Massachusetts Institute of Technology. Institute for Medical Engineering & Science
Pollard, Tom Joseph
Johnson, Alistair Edward William
Raffa, Jesse D
Mark, Roger G
author_sort Pollard, Tom Joseph
collection MIT
description Objectives: In quantitative research, understanding basic parameters of the study population is key for interpretation of the results. As a result, it is typical for the first table (“Table 1”) of a research paper to include summary statistics for the study data. Our objectives are 2-fold. First, we seek to provide a simple, reproducible method for providing summary statistics for research papers in the Python programming language. Second, we seek to use the package to improve the quality of summary statistics reported in research papers. Materials and Methods: The tableone package is developed following good practice guidelines for scientific computing and all code is made available under a permissive MIT License. A testing framework runs on a continuous integration server, helping to maintain code stability. Issues are tracked openly and public contributions are encouraged. Results: The tableone software package automatically compiles summary statistics into publishable formats such as CSV, HTML, and LaTeX. An executable Jupyter Notebook demonstrates application of the package to a subset of data from the MIMIC-III database. Tests such as Tukey’s rule for outlier detection and Hartigan’s Dip Test for modality are computed to highlight potential issues in summarizing the data. Discussion and Conclusion: We present open source software for researchers to facilitate carrying out reproducible studies in Python, an increasingly popular language in scientific research. The toolkit is intended to mature over time with community feedback and input. Development of a common tool for summarizing data may help to promote good practice when used as a supplement to existing guidelines and recommendations. We encourage use of tableone alongside other methods of descriptive statistics and, in particular, visualization to ensure appropriate data handling. We also suggest seeking guidance from a statistician when using tableone for a research study, especially prior to submitting the study for publication.
first_indexed 2024-09-23T14:59:52Z
format Article
id mit-1721.1/126562
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T14:59:52Z
publishDate 2020
publisher Oxford University Press (OUP)
record_format dspace
spelling mit-1721.1/126562 2022-09-29T11:57:40Z tableone: An open source Python package for producing summary statistics for research papers Pollard, Tom Joseph Johnson, Alistair Edward William Raffa, Jesse D Mark, Roger G Massachusetts Institute of Technology. Institute for Medical Engineering & Science Harvard--MIT Program in Health Sciences and Technology. Laboratory for Computational Physiology Objectives: In quantitative research, understanding basic parameters of the study population is key for interpretation of the results. As a result, it is typical for the first table (“Table 1”) of a research paper to include summary statistics for the study data. Our objectives are 2-fold. First, we seek to provide a simple, reproducible method for providing summary statistics for research papers in the Python programming language. Second, we seek to use the package to improve the quality of summary statistics reported in research papers. Materials and Methods: The tableone package is developed following good practice guidelines for scientific computing and all code is made available under a permissive MIT License. A testing framework runs on a continuous integration server, helping to maintain code stability. Issues are tracked openly and public contributions are encouraged. Results: The tableone software package automatically compiles summary statistics into publishable formats such as CSV, HTML, and LaTeX. An executable Jupyter Notebook demonstrates application of the package to a subset of data from the MIMIC-III database. Tests such as Tukey’s rule for outlier detection and Hartigan’s Dip Test for modality are computed to highlight potential issues in summarizing the data. Discussion and Conclusion: We present open source software for researchers to facilitate carrying out reproducible studies in Python, an increasingly popular language in scientific research. The toolkit is intended to mature over time with community feedback and input. Development of a common tool for summarizing data may help to promote good practice when used as a supplement to existing guidelines and recommendations. We encourage use of tableone alongside other methods of descriptive statistics and, in particular, visualization to ensure appropriate data handling. We also suggest seeking guidance from a statistician when using tableone for a research study, especially prior to submitting the study for publication. National Institutes of Health (U.S.) (Grant NIH-R01-EB017205) National Institutes of Health (U.S.) (Grant NIH-R01-EB001659) 2020-08-13T16:28:26Z 2020-08-13T16:28:26Z 2018-05 2018-03 2019-10-09T15:44:36Z Article http://purl.org/eprint/type/JournalArticle 2574-2531 https://hdl.handle.net/1721.1/126562 Pollard, Tom J. et al. “tableone: An open source Python package for producing summary statistics for research papers.” JAMIA open, vol. 1, no. 1, 2018, pp. 26-31 © 2018 The Author(s) en 10.1093/JAMIAOPEN/OOY012 JAMIA open Creative Commons Attribution 4.0 International license https://creativecommons.org/licenses/by/4.0/ application/pdf Oxford University Press (OUP) Oxford University Press
spellingShingle Pollard, Tom Joseph
Johnson, Alistair Edward William
Raffa, Jesse D
Mark, Roger G
tableone: An open source Python package for producing summary statistics for research papers
title tableone: An open source Python package for producing summary statistics for research papers
title_full tableone: An open source Python package for producing summary statistics for research papers
title_fullStr tableone: An open source Python package for producing summary statistics for research papers
title_full_unstemmed tableone: An open source Python package for producing summary statistics for research papers
title_short tableone: An open source Python package for producing summary statistics for research papers
title_sort tableone an open source python package for producing summary statistics for research papers
url https://hdl.handle.net/1721.1/126562
work_keys_str_mv AT pollardtomjoseph tableoneanopensourcepythonpackageforproducingsummarystatisticsforresearchpapers
AT johnsonalistairedwardwilliam tableoneanopensourcepythonpackageforproducingsummarystatisticsforresearchpapers
AT raffajessed tableoneanopensourcepythonpackageforproducingsummarystatisticsforresearchpapers
AT markrogerg tableoneanopensourcepythonpackageforproducingsummarystatisticsforresearchpapers