The effect of sample size on polygenic hazard models for prostate cancer
We determined the effect of sample size on performance of polygenic hazard score (PHS) models in prostate cancer. Age and genotypes were obtained for 40,861 men from the PRACTICAL consortium. The dataset included 201,590 SNPs per subject, and was split into training and testing sets. Established-SNP...
Main authors: | Karunamuni, RA; Huynh-Le, M-P; Fan, CC; Key, TJ; Travis, RC; Neal, DE; Hamdy, FC; Mills, IG; The PRACTICAL Consortium |
---|---|
Format: | Journal article |
Language: | English |
Published: | Springer Nature, 2020 |
_version_ | 1826258351869132800 |
---|---|
author | Karunamuni, RA; Huynh-Le, M-P; Fan, CC; Key, TJ; Travis, RC; Neal, DE; Hamdy, FC; Mills, IG; The PRACTICAL Consortium |
author_sort | Karunamuni, RA |
collection | OXFORD |
description | We determined the effect of sample size on performance of polygenic hazard score (PHS) models in prostate cancer. Age and genotypes were obtained for 40,861 men from the PRACTICAL consortium. The dataset included 201,590 SNPs per subject, and was split into training and testing sets. Established-SNP models considered 65 SNPs that had been previously associated with prostate cancer. Discovery-SNP models used stepwise selection to identify new SNPs. The performance of each PHS model was calculated for random sizes of the training set. The performance of a representative Established-SNP model was estimated for random sizes of the testing set. Mean HR98/50 (hazard ratio of top 2% to average in test set) of the Established-SNP model increased from 1.73 [95% CI: 1.69–1.77] to 2.41 [2.40–2.43] when the number of training samples was increased from 1 thousand to 30 thousand. Corresponding HR98/50 of the Discovery-SNP model increased from 1.05 [0.93–1.18] to 2.19 [2.16–2.23]. HR98/50 of a representative Established-SNP model using testing set sample sizes of 0.6 thousand and 6 thousand observations were 1.78 [1.70–1.85] and 1.73 [1.71–1.76], respectively. We estimate that a study population of 20 thousand men is required to develop Discovery-SNP PHS models while 10 thousand men should be sufficient for Established-SNP models. |
first_indexed | 2024-03-06T18:32:38Z |
format | Journal article |
id | oxford-uuid:0a2fa1ba-e2c2-4a89-a35f-27e1ace01477 |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-06T18:32:38Z |
publishDate | 2020 |
publisher | Springer Nature |
record_format | dspace |
title | The effect of sample size on polygenic hazard models for prostate cancer |
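
The description above defines HR98/50 as the hazard ratio of the top 2% of scores to the average in the test set. The following is a minimal sketch of how such a ratio could be computed from a list of PHS values and a fitted Cox coefficient. It rests on assumptions not stated in the record: the function names `percentile` and `hr98_50` are hypothetical, "average" is taken to mean the 30th to 70th percentile band, and the hazard is assumed to be log-linear in the score. It is an illustration, not the authors' implementation.

```python
import math


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of an ascending list."""
    if not sorted_vals:
        raise ValueError("empty input")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def hr98_50(scores, beta, avg_band=(30.0, 70.0)):
    """Hazard ratio of the top 2% of scores to the 'average' band.

    Assumptions (not taken from the record):
      - hazard is proportional to exp(beta * score), as in a Cox model;
      - 'average' means scores between the avg_band percentiles.
    """
    s = sorted(scores)
    top_cut = percentile(s, 98.0)
    lo_cut = percentile(s, avg_band[0])
    hi_cut = percentile(s, avg_band[1])

    top = [x for x in s if x >= top_cut]           # top 2% of scores
    avg = [x for x in s if lo_cut <= x <= hi_cut]  # assumed "average" group

    mean_top = sum(top) / len(top)
    mean_avg = sum(avg) / len(avg)
    return math.exp(beta * (mean_top - mean_avg))


if __name__ == "__main__":
    # Synthetic scores for illustration only; not data from the study.
    demo_scores = [i / 10.0 for i in range(1000)]
    print(hr98_50(demo_scores, beta=0.02))
```

In a sample-size experiment of the kind described, this function would be evaluated on the test set after refitting the model on random training subsets of each size, and the results averaged per size.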