HexAI-TJAtxt: A textual dataset to advance open scientific research in total joint arthroplasty
Total joint arthroplasty (TJA) is the most common and fastest-growing inpatient surgical procedure in the elderly, nationwide. Due to the increasing number of TJA patients and advancements in healthcare, a growing number of scientific articles are being published on a daily basis. These articles offer...
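Main Authors: | Soheyla Amirian, Husam Ghazaleh, Luke A. Carlson, Matthew Gong, Logan Finger, Johannes F. Plate, Ahmad P. Tafti |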
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2023-12-01 |
Series: | Data in Brief |
Subjects: | Total joint arthroplasty; Large scale textual dataset; Computational text analytics; ChatGPT |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2352340923008077 |
_version_ | 1797429990697992192 |
---|---|
author | Soheyla Amirian, Husam Ghazaleh, Luke A. Carlson, Matthew Gong, Logan Finger, Johannes F. Plate, Ahmad P. Tafti |
author_sort | Soheyla Amirian |
collection | DOAJ |
description | Total joint arthroplasty (TJA) is the most common and fastest-growing inpatient surgical procedure in the elderly, nationwide. Due to the increasing number of TJA patients and advancements in healthcare, a growing number of scientific articles are being published on a daily basis. These articles offer important insights into TJA, covering aspects such as diagnosis, prevention, treatment strategies, and epidemiological factors. However, there has been limited effort to compile a large-scale text dataset from these articles and make it publicly available for open scientific research in TJA. Yet applying computational text analysis to these large volumes of scientific literature holds great potential for uncovering new knowledge to enhance our understanding of joint diseases and improve the quality of TJA care and clinical outcomes. This work aims to build a dataset entitled HexAI-TJAtxt, which includes more than 61,936 scientific abstracts collected from PubMed using MeSH (Medical Subject Headings) terms within “MeSH Subheading” and “MeSH Major Topic,” with a Publication Date from 01/01/2000 to 12/31/2022. The current dataset is freely and publicly available at https://github.com/pitthexai/HexAI-TJAtxt, and it will be updated bi-monthly with new abstracts published on PubMed. |
first_indexed | 2024-03-09T09:21:15Z |
format | Article |
id | doaj.art-819379a442734523b7bec95df8bbd36f |
institution | Directory of Open Access Journals |
issn | 2352-3409 |
language | English |
last_indexed | 2024-03-09T09:21:15Z |
publishDate | 2023-12-01 |
publisher | Elsevier |
record_format | Article |
series | Data in Brief |
spelling | doaj.art-819379a442734523b7bec95df8bbd36f; 2023-12-02T07:00:16Z; eng; Elsevier; Data in Brief; 2352-3409; 2023-12-01; 51; 109738. HexAI-TJAtxt: A textual dataset to advance open scientific research in total joint arthroplasty. Authors: Soheyla Amirian (0), Husam Ghazaleh (1), Luke A. Carlson (2), Matthew Gong (3), Logan Finger (4), Johannes F. Plate (5), Ahmad P. Tafti (6). Affiliations: (0) School of Computing, University of Georgia, Athens, GA, USA. (1) Department of Computer Science, Quincy University, Quincy, IL, USA. (2), (3), (4), (5) Department of Orthopaedic Surgery, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA. (6) Department of Health Information Management, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA; Intelligent Systems Program, School of Computing and Information, University of Pittsburgh, Pittsburgh, PA, USA; corresponding author at: Department of Health Information Management, School of Health and Rehabilitation Sciences, University of Pittsburgh, 6030 Forbes Tower, Pittsburgh, PA 15260, USA. http://www.sciencedirect.com/science/article/pii/S2352340923008077. Keywords: Total joint arthroplasty; Large scale textual dataset; Computational text analytics; ChatGPT |
title | HexAI-TJAtxt: A textual dataset to advance open scientific research in total joint arthroplasty |
topic | Total joint arthroplasty; Large scale textual dataset; Computational text analytics; ChatGPT |
url | http://www.sciencedirect.com/science/article/pii/S2352340923008077 |
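
The description states that the abstracts were collected from PubMed using MeSH terms and a publication date range of 01/01/2000 to 12/31/2022, but the record does not give the exact query. As a rough illustration only, the sketch below builds a request URL for the NCBI E-utilities `esearch` endpoint with a MeSH Major Topic filter and that date range. The MeSH heading "Arthroplasty, Replacement", the `retmax` value, and the helper name `build_esearch_url` are assumptions chosen for the example, not the authors' actual search strategy.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(mesh_term, start, end, retmax=100):
    """Build a PubMed esearch URL (hypothetical helper, for illustration).

    mesh_term: a MeSH heading, searched as a MeSH Major Topic.
    start, end: publication date bounds in YYYY/MM/DD form.
    """
    params = {
        "db": "pubmed",
        # Assumed query shape; the record does not state the real query.
        "term": f'"{mesh_term}"[MeSH Major Topic]',
        "datetype": "pdat",  # filter on publication date
        "mindate": start,
        "maxdate": end,
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Date range taken from the description; the MeSH heading is an assumption.
url = build_esearch_url("Arthroplasty, Replacement", "2000/01/01", "2022/12/31")
print(url)
```

The snippet only constructs the URL and makes no network request. Fetching the matching abstracts would be a separate step, and the published dataset at https://github.com/pitthexai/HexAI-TJAtxt remains the authoritative source.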