Decision tree-based data mining approach for the evaluation of survival in primary malignant bone tumors: A surveillance, epidemiology and end results database study

Bibliographic Details
Main Authors: Dilek Yapar, Aliekber Yapar, Mehmet Ali Tokgöz, Uğur Bilge
Format: Article
Language: English
Published: SAGE Publishing 2023-08-01
Series: Journal of Orthopaedic Surgery
Online Access: https://doi.org/10.1177/10225536231189780
collection DOAJ
description Purpose: This study aimed to conduct a large-scale, population-based study to characterize the epidemiology of primary malignant bone tumors (PMBTs) and to determine prognostic factors by using a classical statistical method and data mining methods concurrently.
Methods: Patients were extracted from the National Cancer Institute’s Surveillance, Epidemiology and End Results (SEER) database: “Incidence-SEER Research Data, 18 Registries, Nov 2020 Sub”. Patients with unclassified and incomplete information were excluded. This selection yielded a dataset of 6234 cases. Survival analyses were performed with Kaplan-Meier curves and the log-rank test. Multivariate Cox regression analysis identified the independent prognostic factors of PMBT. A decision tree-based data mining technique was used to confirm the prognostic factors.
Results: The 5-year survival rate was 63.6% and the 10-year survival rate was 55.3% in patients with PMBT. Sex, age, median household income, histology, primary site, grade, stage, metastasis, and the total number of malignant tumors were identified as independent risk factors associated with overall survival (OS) in the multivariate Cox regression analysis. The decision tree (DT) had five terminal nodes, and the prognostic factors defining them were stage, age, and grade. Stage was the most important determinant of vital status. The terminal node with the fewest surviving patients comprised 1102 patients with distant stage, of whom 801 (72.3%) died; the hazard ratio was 5.4 (95% CI: 4.9–5.9; p < .001). These patients had a median survival of only 17 months.
Conclusions: Rules extracted from DTs provide information about risk factors in specific patient groups and can be used by clinicians making decisions about individual patients. We recommend using DTs in combination with Cox regression analysis to determine risk factors and the effect of these factors on survival.
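The abstract reports Kaplan-Meier survival estimates and a median survival time. As a rough illustration only, the product-limit (Kaplan-Meier) estimator can be sketched in plain Python as below. The function names `kaplan_meier` and `median_survival` and the ten-patient toy cohort are hypothetical and are not taken from the study or from SEER; the record does not state which software the authors used.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each distinct death time.

    times  -- follow-up time per patient (e.g. months)
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or lower, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy cohort (NOT study data): months of follow-up and vital status.
times = [5, 8, 12, 17, 17, 20, 24, 30, 36, 60]
events = [1, 1, 0, 1, 1, 0, 1, 0, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2} months  S(t)={s:.3f}")
print("median survival:", median_survival(curve))
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the curve only steps down at death times.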
id doaj.art-bc5b49f6691843aa9c7216a3ee937e85
institution Directory Open Access Journal
issn 2309-4990
volume 31