Detection of missed deaths in cancer registry data to reduce bias in long-term survival estimation

Background: Population-based cancer survival estimates can provide insight into the real-world impacts of healthcare interventions and preventive services. However, estimation of survival rates obtained from population-based cancer registries can be biased due to missed incidence or incomplete vital s...

Bibliographic Details
Main Authors: Stefan Dahm, Benjamin Barnes, Klaus Kraywinkel
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-03-01
Series: Frontiers in Oncology
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fonc.2023.1088657/full
author Stefan Dahm
Benjamin Barnes
Klaus Kraywinkel
collection DOAJ
description Background: Population-based cancer survival estimates can provide insight into the real-world impacts of healthcare interventions and preventive services. However, survival rates estimated from population-based cancer registries can be biased due to missed incidence or incomplete vital status data. Long-term survival estimates in particular are prone to overestimation, since the proportion of deaths that are missed, for example through unregistered emigration, increases with follow-up time. This also applies to registry-based long-term prevalence estimates. The aim of this report is to introduce a method to detect missed deaths within cancer registry data such that long-term survival of cancer patients does not exceed survival in the general population.
Methods: We analyzed data from 15 German epidemiologic cancer registries covering the years 1970-2016 and from Surveillance, Epidemiology, and End Results (SEER)-18 registries covering 1975-2015. The method compares survival times until exit (death or end of follow-up) and ages at exit between deceased patients and surviving patients, stratified by diagnosis group, sex, age group and stage. Deceased patients with both follow-up time and age at exit in the highest percentile were regarded as outliers and used to fit a logistic regression. The regression was then used to classify each surviving patient as a survivor or a missed death. The procedure was repeated with lower percentile thresholds for deceased patients until long-term survival rates no longer exceeded the survival rates in the general population.
Results: For the German cancer registry data, 0.9% of total deaths were classified as having been missed. Excluding these missed deaths reduced 20-year relative survival estimates for all cancers combined from 140% to 51%. For white patients in the SEER data, classified missed deaths amounted to 0.02% of total deaths, resulting in a 20-year relative survival rate for all cancers combined that was 0.4 percentage points lower.
Conclusion: The method described here classified a relatively small proportion of missed deaths yet reduced long-term survival estimates to more plausible levels. The effects of missed deaths should be considered when calculating long-term survival or prevalence estimates.
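The classification loop summarized in the Methods can be illustrated with a minimal Python sketch for a single stratum. It is not the authors' implementation: the abstract does not say how the regression is specified, so the choice of predictors (follow-up time and age at exit), the 0.5 cut-off, the threshold sequence, the gradient-descent fit, and all function names are assumptions made for illustration. The stopping rule is passed in as a hypothetical `is_plausible` callback, standing in for the check that long-term survival no longer exceeds that of the general population.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (pct in 0-100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def scaler(values):
    """Return a function that standardises a value using these values."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # avoid division by zero
    return lambda v: (v - mean) / std


def fit_logistic(rows, labels, steps=1000, lr=1.0):
    """Gradient descent on the mean log-loss; returns [bias, w1, w2]."""
    weights = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), y in zip(rows, labels):
            err = sigmoid(weights[0] + weights[1] * x1 + weights[2] * x2) - y
            grad[0] += err
            grad[1] += err * x1
            grad[2] += err * x2
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def flag_missed_deaths(deceased, survivors, pct):
    """One pass for one stratum.

    deceased, survivors: lists of (follow_up_years, age_at_exit).
    Returns one boolean per survivor (True = classified as missed death).
    """
    t_cut = percentile([t for t, _ in deceased], pct)
    a_cut = percentile([a for _, a in deceased], pct)
    # Assumption: outliers are deceased patients at or above both cut-offs.
    labels = [1 if t >= t_cut and a >= a_cut else 0 for t, a in deceased]
    scale_t = scaler([t for t, _ in deceased])
    scale_a = scaler([a for _, a in deceased])
    rows = [(scale_t(t), scale_a(a)) for t, a in deceased]
    b, w1, w2 = fit_logistic(rows, labels)
    # Assumption: a predicted probability of 0.5 or more means "missed death".
    return [sigmoid(b + w1 * scale_t(t) + w2 * scale_a(a)) >= 0.5
            for t, a in survivors]


def detect_missed_deaths(deceased, survivors, is_plausible,
                         thresholds=(99, 95, 90)):
    """Lower the percentile threshold until is_plausible(flags) holds."""
    flags = [False] * len(survivors)
    for pct in thresholds:
        flags = flag_missed_deaths(deceased, survivors, pct)
        if is_plausible(flags):
            break
    return flags
```

In practice `is_plausible` would recompute relative survival with the flagged patients treated as deceased and compare it with general-population survival; the sketch leaves that calculation to the caller, and the whole procedure would be run separately for each stratum of diagnosis group, sex, age group and stage.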
first_indexed 2024-04-10T05:11:06Z
format Article
id doaj.art-fced312a961e4151b459ccca28edf920
institution Directory Open Access Journal
issn 2234-943X
language English
last_indexed 2024-04-10T05:11:06Z
publishDate 2023-03-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Oncology
spelling doaj.art-fced312a961e4151b459ccca28edf920 2023-03-09T07:28:46Z eng Frontiers Media S.A. Frontiers in Oncology 2234-943X 2023-03-01 13 10.3389/fonc.2023.1088657 1088657
title Detection of missed deaths in cancer registry data to reduce bias in long-term survival estimation
topic cancer registry data
missed deaths
long-term survival
classification algorithm
relative survival
url https://www.frontiersin.org/articles/10.3389/fonc.2023.1088657/full