HexAI-TJAtxt: A textual dataset to advance open scientific research in total joint arthroplasty

Total joint arthroplasty (TJA) is the most common and fastest-growing inpatient surgical procedure in the elderly, nationwide. Due to the increasing number of TJA patients and advancements in healthcare, a growing number of scientific articles are being published on a daily basis. These articles offer...

Bibliographic Details
Main Authors: Soheyla Amirian, Husam Ghazaleh, Luke A. Carlson, Matthew Gong, Logan Finger, Johannes F. Plate, Ahmad P. Tafti
Format: Article
Language: English
Published: Elsevier 2023-12-01
Series: Data in Brief
Subjects: Total joint arthroplasty; Large scale textual dataset; Computational text analytics; ChatGPT
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340923008077
author Soheyla Amirian
Husam Ghazaleh
Luke A. Carlson
Matthew Gong
Logan Finger
Johannes F. Plate
Ahmad P. Tafti
collection DOAJ
description Total joint arthroplasty (TJA) is the most common and fastest-growing inpatient surgical procedure in the elderly, nationwide. Due to the increasing number of TJA patients and advancements in healthcare, a growing number of scientific articles are being published on a daily basis. These articles offer important insights into TJA, covering aspects such as diagnosis, prevention, treatment strategies, and epidemiological factors. However, there has been limited effort to compile a large-scale text dataset from these articles and make it publicly available for open scientific research in TJA. Yet applying computational text analysis to these large volumes of scientific literature holds great potential for uncovering new knowledge to enhance our understanding of joint diseases and improve the quality of TJA care and clinical outcomes. This work aims to build a dataset entitled HexAI-TJAtxt, which includes more than 61,936 scientific abstracts collected from PubMed using MeSH (Medical Subject Headings) terms within “MeSH Subheading” and “MeSH Major Topic,” with a Publication Date from 01/01/2000 to 12/31/2022. The current dataset is freely and publicly available at https://github.com/pitthexai/HexAI-TJAtxt, and it will be updated bi-monthly with new abstracts published on PubMed.
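The description above names the search criteria (PubMed, MeSH Major Topic, a publication-date window) but not the exact query. As a rough illustration only, the sketch below builds an NCBI E-utilities ESearch URL with those criteria. The MeSH heading "Arthroplasty, Replacement", the helper name build_esearch_url, and the use of E-utilities at all are assumptions made for this example; the record does not say how the authors actually retrieved the abstracts. The code only constructs the URL and sends no request.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(mesh_term, start="2000/01/01", end="2022/12/31", retmax=100):
    """Build (but do not send) an ESearch URL for PubMed records.

    mesh_term is a hypothetical example heading; the record does not
    list the MeSH terms that were actually used for the dataset.
    """
    params = {
        "db": "pubmed",
        # Restrict the match to records where the heading is a major topic.
        "term": f'"{mesh_term}"[MeSH Major Topic]',
        # Filter on publication date, matching the window in the description.
        "datetype": "pdat",
        "mindate": start,
        "maxdate": end,
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_esearch_url("Arthroplasty, Replacement")
print(url)
```

Fetching the returned PubMed IDs and their abstracts would be a separate step (for example via EFetch), subject to NCBI's usage guidelines.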
first_indexed 2024-03-09T09:21:15Z
format Article
id doaj.art-819379a442734523b7bec95df8bbd36f
institution Directory of Open Access Journals
issn 2352-3409
language English
last_indexed 2024-03-09T09:21:15Z
publishDate 2023-12-01
publisher Elsevier
record_format Article
series Data in Brief
spelling doaj.art-819379a442734523b7bec95df8bbd36f 2023-12-02T07:00:16Z eng Elsevier Data in Brief 2352-3409 2023-12-01 51 109738
HexAI-TJAtxt: A textual dataset to advance open scientific research in total joint arthroplasty
Soheyla Amirian (0), Husam Ghazaleh (1), Luke A. Carlson (2), Matthew Gong (3), Logan Finger (4), Johannes F. Plate (5), Ahmad P. Tafti (6)
(0) School of Computing, University of Georgia, Athens, GA, USA
(1) Department of Computer Science, Quincy University, Quincy, IL, USA
(2) Department of Orthopaedic Surgery, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
(3) Department of Orthopaedic Surgery, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
(4) Department of Orthopaedic Surgery, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
(5) Department of Orthopaedic Surgery, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
(6) Department of Health Information Management, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA; Intelligent Systems Program, School of Computing and Information, University of Pittsburgh, Pittsburgh, PA, USA; Corresponding author at: Department of Health Information Management, School of Health and Rehabilitation Sciences, University of Pittsburgh, 6030 Forbes Tower, Pittsburgh, PA 15260, USA.
http://www.sciencedirect.com/science/article/pii/S2352340923008077
Total joint arthroplasty; Large scale textual dataset; Computational text analytics; ChatGPT
title HexAI-TJAtxt: A textual dataset to advance open scientific research in total joint arthroplasty
topic Total joint arthroplasty
Large scale textual dataset
Computational text analytics
ChatGPT
url http://www.sciencedirect.com/science/article/pii/S2352340923008077