Fuzzy Logic as a Strategy for Combining Marker Statistics to Optimize Preselection of High-Density and Sequence Genotype Data

The high dimensionality of genotype data available for genomic evaluations has presented a motivation for developing strategies to identify subsets of markers capable of increasing the accuracy of predictions compared to the current commercial single nucleotide polymorphism (SNP) chips. In this simu...

Bibliographic Details
Main Authors: Ashley Ling, El Hamidi Hay, Samuel E. Aggrey, Romdhane Rekaya
Format: Article
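Language: English
Published: MDPI AG 2022-11-01
Series: Genes
Subjects: SNP preselection; genomic prediction; high-density genotypes; sequence genotypes; fuzzy logic; fuzzy inference
Online Access: https://www.mdpi.com/2073-4425/13/11/2100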
_version_ 1797465213034823680
author Ashley Ling
El Hamidi Hay
Samuel E. Aggrey
Romdhane Rekaya
collection DOAJ
description The high dimensionality of genotype data available for genomic evaluations has motivated the development of strategies to identify subsets of markers that can increase prediction accuracy relative to current commercial single nucleotide polymorphism (SNP) chips. In this simulation study, an algorithm was developed that combines statistics used in the preselection and prioritization of SNP markers from a high-density panel (1.3 million SNPs) into a composite "fuzzy" ranking score based on a Sugeno-type fuzzy inference system (FIS), and its performance in preselection for genomic predictions was evaluated. F_ST scores and p-values were evaluated as inputs for the FIS. For fuzzy-score-preselected panel sizes of 1–50 k SNPs, the accuracy of genomic predictions was −0.4% to 11.7% higher than with F_ST preselection and −0.3% to 3.8% higher than with p-value preselection. Though gains in prediction accuracy using only two inputs to the FIS were modest, preselection based on fuzzy scores yielded more accurate predictions than both F_ST scores and p-values for the majority of evaluated panel sizes under all genetic architectures. FIS have the potential to aggregate information from multiple criteria that reflect SNP-trait associations and biological relevance in a flexible and efficient way to yield higher-quality genomic predictions.
first_indexed 2024-03-09T18:19:18Z
format Article
id doaj.art-18eeb7e03c9b48e18f23db72c41a0903
institution Directory Open Access Journal
issn 2073-4425
language English
last_indexed 2024-03-09T18:19:18Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series Genes
spelling doaj.art-18eeb7e03c9b48e18f23db72c41a0903; 2023-11-24T08:26:18Z; eng; MDPI AG; Genes; ISSN 2073-4425; 2022-11-01; volume 13, issue 11, article 2100; doi:10.3390/genes13112100
Fuzzy Logic as a Strategy for Combining Marker Statistics to Optimize Preselection of High-Density and Sequence Genotype Data
Ashley Ling (0): USDA Agricultural Research Service, Fort Keogh Livestock and Range Research Laboratory, Miles City, MT 59301, USA
El Hamidi Hay (1): USDA Agricultural Research Service, Fort Keogh Livestock and Range Research Laboratory, Miles City, MT 59301, USA
Samuel E. Aggrey (2): Department of Poultry Science, The University of Georgia, Athens, GA 30602, USA
Romdhane Rekaya (3): Institute of Bioinformatics, The University of Georgia, Athens, GA 30602, USA
https://www.mdpi.com/2073-4425/13/11/2100
SNP preselection; genomic prediction; high-density genotypes; sequence genotypes; fuzzy logic; fuzzy inference
title Fuzzy Logic as a Strategy for Combining Marker Statistics to Optimize Preselection of High-Density and Sequence Genotype Data
topic SNP preselection
genomic prediction
high-density genotypes
sequence genotypes
fuzzy logic
fuzzy inference
url https://www.mdpi.com/2073-4425/13/11/2100
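The description above outlines a Sugeno-type fuzzy inference system that merges F_ST scores and p-values into a single ranking score. The following is only a minimal illustrative sketch of a zero-order Sugeno system with two inputs, not the authors' implementation: the membership functions, the input ranges, the four rules, their constant outputs, and the names `ramp`, `fuzzy_score`, and `preselect` are all assumptions made for illustration.

```python
import math


def ramp(x, lo, hi):
    """Linear membership in 'high': 0 at lo, 1 at hi, clipped to [0, 1]."""
    if hi <= lo:
        raise ValueError("hi must exceed lo")
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def fuzzy_score(fst, p_value, fst_range=(0.0, 0.3), logp_range=(0.0, 8.0)):
    """Zero-order Sugeno score in [0, 1] from F_ST and a p-value.

    The ranges and rule outputs below are hypothetical, not from the paper.
    """
    # Fuzzify each input into complementary 'low' and 'high' memberships.
    fst_high = ramp(fst, *fst_range)
    fst_low = 1.0 - fst_high
    logp = -math.log10(max(p_value, 1e-300))  # guard against p == 0
    sig_high = ramp(logp, *logp_range)
    sig_low = 1.0 - sig_high

    # Each rule: (firing strength using product as AND, constant output).
    rules = [
        (fst_high * sig_high, 1.0),  # both statistics support the SNP
        (fst_high * sig_low, 0.6),   # only F_ST supports it
        (fst_low * sig_high, 0.4),   # only the p-value supports it
        (fst_low * sig_low, 0.0),    # neither supports it
    ]

    # Sugeno defuzzification: firing-strength-weighted average of outputs.
    total = sum(weight for weight, _ in rules)
    return sum(weight * out for weight, out in rules) / total


def preselect(snps, panel_size):
    """Return the ids of the top-scoring SNPs.

    `snps` is a list of (snp_id, fst, p_value) tuples.
    """
    ranked = sorted(snps, key=lambda s: fuzzy_score(s[1], s[2]), reverse=True)
    return [snp_id for snp_id, _, _ in ranked[:panel_size]]
```

With complementary memberships the firing strengths always sum to one, so the weighted average is well defined for any input. A real system would tune the memberships and rule outputs to the data and could add further inputs as extra rule antecedents.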