A machine-learning method for biobank-scale genetic prediction of blood group antigens.

A key element for successful blood transfusion is compatibility of the patient and donor red blood cell (RBC) antigens. Precise antigen matching reduces the risk for immunization and other adverse transfusion outcomes. RBC antigens are encoded by specific genes, which allows developing computational...

Bibliographic Details
Main Authors: Kati Hyvärinen, Katri Haimila, Camous Moslemi, Blood Service Biobank, Martin L Olsson, Sisse R Ostrowski, Ole B Pedersen, Christian Erikstrup, Jukka Partanen, Jarmo Ritari
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-03-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1011977
author Kati Hyvärinen
Katri Haimila
Camous Moslemi
Blood Service Biobank
Martin L Olsson
Sisse R Ostrowski
Ole B Pedersen
Christian Erikstrup
Jukka Partanen
Jarmo Ritari
collection DOAJ
description A key element for successful blood transfusion is compatibility of the patient and donor red blood cell (RBC) antigens. Precise antigen matching reduces the risk of immunization and other adverse transfusion outcomes. RBC antigens are encoded by specific genes, which allows the development of computational methods for determining antigens from genomic data. We describe here a classification method for determining RBC antigens from genotyping array data. Random forest models for 39 RBC antigens in 14 blood group systems and for human platelet antigen (HPA)-1 were trained and tested using genotype and RBC antigen and HPA-1 typing data available for 1,192 blood donors in the Finnish Blood Service Biobank. The algorithm and models were further evaluated using a validation cohort of 111,667 Danish blood donors. In the Finnish test data set, the median (interquartile range [IQR]) balanced accuracy for 39 models was 99.9 (98.9-100)%. We were able to replicate 34 out of 39 Finnish models in the Danish cohort and the median (IQR) balanced accuracy for classifications was 97.1 (90.1-99.4)%. When applying models trained with the Danish cohort, the median (IQR) balanced accuracy for the 40 Danish models in the Danish test data set was 99.3 (95.1-99.8)%. The RBC antigen and HPA-1 prediction models demonstrated high overall accuracies suitable for probabilistic determination of blood groups and HPA-1 at biobank scale. Furthermore, a population-specific training cohort increased the accuracies of the models. This stand-alone and freely available method is applicable to research and to screening for antigen-negative blood donors.
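The description above reports results as a median (IQR) of per-model balanced accuracy. As an illustration of that metric only, the following is a minimal sketch, not the authors' code: the function names, the toy labels, and the example per-model scores are all hypothetical, and the quartile convention used here may differ from the one used in the article.

```python
from statistics import median, quantiles


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; with two classes this equals
    (sensitivity + specificity) / 2."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


def median_iqr(values):
    """Return (median, first quartile, third quartile) of a list of scores."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical toy data: a rare antigen-negative phenotype ("neg" is common here
# only to keep the example short; the labels are placeholders).
y_true = ["neg"] * 8 + ["pos"] * 2
y_pred = ["neg"] * 8 + ["pos", "neg"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_accuracy = balanced_accuracy(y_true, y_pred)

print(plain_accuracy)  # 0.9
print(bal_accuracy)    # 0.75

# Hypothetical per-model balanced accuracies, summarized as median and quartiles.
print(median_iqr([0.90, 0.95, 0.99, 1.0, 1.0]))
```

With imbalanced classes, plain accuracy (0.9 in the toy data) can look high even when half of the rare class is misclassified, whereas balanced accuracy (0.75) weights each class equally, which is presumably why it suits rare antigen phenotypes.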
first_indexed 2024-04-24T12:46:00Z
format Article
id doaj.art-9bf2af96ac444de7992edcf44801af34
institution Directory Open Access Journal
issn 1553-734X
1553-7358
language English
last_indexed 2024-04-24T12:46:00Z
publishDate 2024-03-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS Computational Biology
spelling doaj.art-9bf2af96ac444de7992edcf44801af34; 2024-04-07T05:31:20Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2024-03-01; volume 20; issue 3; e1011977; 10.1371/journal.pcbi.1011977; A machine-learning method for biobank-scale genetic prediction of blood group antigens.; Kati Hyvärinen; Katri Haimila; Camous Moslemi; Blood Service Biobank; Martin L Olsson; Sisse R Ostrowski; Ole B Pedersen; Christian Erikstrup; Jukka Partanen; Jarmo Ritari; https://doi.org/10.1371/journal.pcbi.1011977
title A machine-learning method for biobank-scale genetic prediction of blood group antigens.
url https://doi.org/10.1371/journal.pcbi.1011977