tableone: An open source Python package for producing summary statistics for research papers
Objectives: In quantitative research, understanding basic parameters of the study population is key for interpretation of the results. As a result, it is typical for the first table (“Table 1”) of a research paper to include summary statistics for the study data. Our objectives are 2-fold. First, we...
Main Authors: | Pollard, Tom Joseph, Johnson, Alistair Edward William, Raffa, Jesse D, Mark, Roger G |
---|---|
Other Authors: | Massachusetts Institute of Technology. Institute for Medical Engineering & Science |
Format: | Article |
Language: | English |
Published: | Oxford University Press (OUP) 2020 |
Online Access: | https://hdl.handle.net/1721.1/126562 |
_version_ | 1826211056497721344 |
---|---|
author | Pollard, Tom Joseph Johnson, Alistair Edward William Raffa, Jesse D Mark, Roger G |
author2 | Massachusetts Institute of Technology. Institute for Medical Engineering & Science |
author_facet | Massachusetts Institute of Technology. Institute for Medical Engineering & Science Pollard, Tom Joseph Johnson, Alistair Edward William Raffa, Jesse D Mark, Roger G |
author_sort | Pollard, Tom Joseph |
collection | MIT |
description | Objectives: In quantitative research, understanding basic parameters of the study population is key for interpretation of the results. As a result, it is typical for the first table (“Table 1”) of a research paper to include summary statistics for the study data. Our objectives are 2-fold. First, we seek to provide a simple, reproducible method for providing summary statistics for research papers in the Python programming language. Second, we seek to use the package to improve the quality of summary statistics reported in research papers. Materials and Methods: The tableone package is developed following good practice guidelines for scientific computing and all code is made available under a permissive MIT License. A testing framework runs on a continuous integration server, helping to maintain code stability. Issues are tracked openly and public contributions are encouraged. Results: The tableone software package automatically compiles summary statistics into publishable formats such as CSV, HTML, and LaTeX. An executable Jupyter Notebook demonstrates application of the package to a subset of data from the MIMIC-III database. Tests such as Tukey’s rule for outlier detection and Hartigan’s Dip Test for modality are computed to highlight potential issues in summarizing the data. Discussion and Conclusion: We present open source software for researchers to facilitate carrying out reproducible studies in Python, an increasingly popular language in scientific research. The toolkit is intended to mature over time with community feedback and input. Development of a common tool for summarizing data may help to promote good practice when used as a supplement to existing guidelines and recommendations. We encourage use of tableone alongside other methods of descriptive statistics and, in particular, visualization to ensure appropriate data handling. We also suggest seeking guidance from a statistician when using tableone for a research study, especially prior to submitting the study for publication. |
first_indexed | 2024-09-23T14:59:52Z |
format | Article |
id | mit-1721.1/126562 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T14:59:52Z |
publishDate | 2020 |
publisher | Oxford University Press (OUP) |
record_format | dspace |
spelling | mit-1721.1/126562 2022-09-29T11:57:40Z tableone: An open source Python package for producing summary statistics for research papers Pollard, Tom Joseph Johnson, Alistair Edward William Raffa, Jesse D Mark, Roger G Massachusetts Institute of Technology. Institute for Medical Engineering & Science Harvard--MIT Program in Health Sciences and Technology. Laboratory for Computational Physiology National Institutes of Health (U.S.) (Grant NIH-R01-EB017205) National Institutes of Health (U.S.) (Grant NIH-R01-EB001659) 2020-08-13T16:28:26Z 2020-08-13T16:28:26Z 2018-05 2018-03 2019-10-09T15:44:36Z Article http://purl.org/eprint/type/JournalArticle 2574-2531 https://hdl.handle.net/1721.1/126562 Pollard, Tom J. et al. “tableone: An open source Python package for producing summary statistics for research papers.” JAMIA open, vol. 1, no. 1, 2018, pp. 26-31 © 2018 The Author(s) en 10.1093/JAMIAOPEN/OOY012 JAMIA open Creative Commons Attribution 4.0 International license https://creativecommons.org/licenses/by/4.0/ application/pdf Oxford University Press (OUP) Oxford University Press |
spellingShingle | Pollard, Tom Joseph Johnson, Alistair Edward William Raffa, Jesse D Mark, Roger G tableone: An open source Python package for producing summary statistics for research papers |
title | tableone: An open source Python package for producing summary statistics for research papers |
title_full | tableone: An open source Python package for producing summary statistics for research papers |
title_fullStr | tableone: An open source Python package for producing summary statistics for research papers |
title_full_unstemmed | tableone: An open source Python package for producing summary statistics for research papers |
title_short | tableone: An open source Python package for producing summary statistics for research papers |
title_sort | tableone an open source python package for producing summary statistics for research papers |
url | https://hdl.handle.net/1721.1/126562 |
work_keys_str_mv | AT pollardtomjoseph tableoneanopensourcepythonpackageforproducingsummarystatisticsforresearchpapers AT johnsonalistairedwardwilliam tableoneanopensourcepythonpackageforproducingsummarystatisticsforresearchpapers AT raffajessed tableoneanopensourcepythonpackageforproducingsummarystatisticsforresearchpapers AT markrogerg tableoneanopensourcepythonpackageforproducingsummarystatisticsforresearchpapers |
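
The description mentions Tukey’s rule for outlier detection as one of the checks used to flag potential issues when summarizing data. The following is a minimal sketch of that rule using only the Python standard library. It is not taken from the tableone source code: the function name `tukey_outliers`, the sample `ages` values, the conventional multiplier `k=1.5`, and the choice of the "inclusive" quartile method are all assumptions made for illustration.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values that fall outside Tukey's fences.

    The fences are Q1 - k*IQR and Q3 + k*IQR, where IQR = Q3 - Q1.
    k=1.5 is the conventional multiplier (an assumption here).
    """
    # Quartiles of the sample; the "inclusive" method is an illustrative choice.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical ages with one implausible entry.
ages = [54, 61, 58, 63, 57, 60, 59, 62, 56, 140]
print(tukey_outliers(ages))  # → [140]
```

A flagged value is not necessarily an error. It is a prompt to inspect the data before reporting a mean or standard deviation that the value may distort.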