Exploring cancer register data to find risk factors for recurrence of breast cancer – application of Canonical Correlation Analysis


Bibliographic Details
Main Authors: Thorstenson Sten, Sundquist Marie, Stål Olle, Gill Hans, Razavi Amir R, Åhlfeldt Hans, Shahsavar Nosrat
Format: Article
Language: English
Published: BMC 2005-08-01
Series: BMC Medical Informatics and Decision Making
Online Access: http://www.biomedcentral.com/1472-6947/5/29
Collection: DOAJ
Abstract

Background: A common approach in exploring register data is to find relationships between outcomes and predictors by using multiple regression analysis (MRA). If there is more than one outcome variable, the analysis must then be repeated, and the results combined in some arbitrary fashion. In contrast, Canonical Correlation Analysis (CCA) has the ability to analyze multiple outcomes at the same time.

One essential outcome after breast cancer treatment is recurrence of the disease. It is important to understand the relationship between different predictors and recurrence, including the time interval until recurrence. This study describes the application of CCA to find important predictors for two different outcomes for breast cancer patients, loco-regional recurrence and occurrence of distant metastasis and to decrease the number of variables in the sets of predictors and outcomes without decreasing the predictive strength of the model.

Methods: Data for 637 malignant breast cancer patients admitted in the south-east region of Sweden were analyzed. By using CCA and looking at the structure coefficients (loadings), relationships between tumor specifications and the two outcomes during different time intervals were analyzed and a correlation model was built.

Results: The analysis successfully detected known predictors for breast cancer recurrence during the first two years and distant metastasis 2–4 years after diagnosis. Nottingham Histologic Grading (NHG) was the most important predictor, while age of the patient at the time of diagnosis was not an important predictor.

Conclusion: In cancer registers with high dimensionality, CCA can be used for identifying the importance of risk factors for breast cancer recurrence. This technique can result in a model ready for further processing by data mining methods through reducing the number of variables to important ones.
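The abstract refers to reading structure coefficients (loadings) from a CCA with several predictors and two outcomes. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes NumPy, uses a QR/SVD formulation of CCA, and runs on purely synthetic random data. The function name `cca_with_loadings` and all variables are hypothetical and do not come from the article or the register.

```python
import numpy as np


def cca_with_loadings(X, Y):
    """Canonical correlations and structure coefficients (loadings).

    X: (n, p) predictor matrix, Y: (n, q) outcome matrix.
    Returns (r, x_load, y_load) where r holds the canonical correlations
    and each loading matrix holds the correlation of every original
    variable with every canonical variate of its own set.
    """
    n = X.shape[0]
    # Standardize columns so loadings are plain Pearson correlations.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    # Orthonormal bases of the two column spaces.
    Qx, _ = np.linalg.qr(Xs)
    Qy, _ = np.linalg.qr(Ys)
    # Singular values of Qx^T Qy are the canonical correlations.
    U, r, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    # Canonical variates (zero mean, unit norm columns).
    Xc = Qx @ U
    Yc = Qy @ Vt.T
    # Structure coefficients: correlation of variables with variates.
    x_load = Xs.T @ Xc / np.sqrt(n - 1)
    y_load = Ys.T @ Yc / np.sqrt(n - 1)
    return r, x_load, y_load


# Synthetic illustration only: a shared latent signal z drives the first
# predictor and the first outcome; the other columns are pure noise.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
X = np.column_stack([
    z + 0.5 * rng.normal(size=n),  # predictor tied to the latent signal
    rng.normal(size=n),            # unrelated predictor
    rng.normal(size=n),            # unrelated predictor
])
Y = np.column_stack([
    z + 0.5 * rng.normal(size=n),  # outcome tied to the latent signal
    rng.normal(size=n),            # unrelated outcome
])

r, x_load, y_load = cca_with_loadings(X, Y)
print("canonical correlations:", r)
print("predictor loadings on first variate:", x_load[:, 0])
print("outcome loadings on first variate:", y_load[:, 0])
```

In a sketch like this, a predictor with a large absolute loading on the first canonical variate would be read as important for the outcome set, and predictors with loadings near zero would be candidates for removal, which is the kind of variable reduction the abstract describes.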
Record ID: doaj.art-777dcef9fd8b42228cd724cd2c122c1c
Institution: Directory Open Access Journal
ISSN: 1472-6947
DOI: 10.1186/1472-6947-5-29