Can We Convert Genotype Sequences Into Images for Cases/Controls Classification?

Converting genotype sequences into images offers advantages, such as genotype data visualization, classification, and comparison of genotype sequences. This study converted genotype sequences into images, applied two-dimensional convolutional neural networks for case/control classification, and comp...

Bibliographic Details
Main Authors: Muhammad Muneeb, Samuel F. Feng, Andreas Henschel
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-06-01
Series: Frontiers in Bioinformatics
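Subjects: genotype-phenotype prediction; genetics; bioinformatics; applied machine learning; image classification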
Online Access: https://www.frontiersin.org/articles/10.3389/fbinf.2022.914435/full
_version_ 1811344389933367296
author Muhammad Muneeb
Samuel F. Feng
Andreas Henschel
author_facet Muhammad Muneeb
Samuel F. Feng
Andreas Henschel
author_sort Muhammad Muneeb
collection DOAJ
description Converting genotype sequences into images offers advantages, such as genotype data visualization, classification, and comparison of genotype sequences. This study converted genotype sequences into images, applied two-dimensional convolutional neural networks (2DCNN) for case/control classification, and compared the results with those of a one-dimensional convolutional neural network (1DCNN). Surprisingly, the average accuracy over multiple runs of the 2DCNN was 0.86, and that of the 1DCNN was 0.89, a difference of 0.03, which suggests that even the 2DCNN algorithm works on genotype sequences. Moreover, the results generated by the 2DCNN exhibited less variation than those generated by the 1DCNN, thereby offering greater stability. The purpose of this study is to draw the research community’s attention to encoding schemes for genotype data and to machine learning algorithms that become applicable to case/control classification once the representation of the genotype data is changed.
first_indexed 2024-04-13T19:46:35Z
format Article
id doaj.art-b03d93d31399480285d24ad6c2a5c392
institution Directory Open Access Journal
issn 2673-7647
language English
last_indexed 2024-04-13T19:46:35Z
publishDate 2022-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Bioinformatics
spelling doaj.art-b03d93d31399480285d24ad6c2a5c392 | 2022-12-22T02:32:43Z | eng | Frontiers Media S.A. | Frontiers in Bioinformatics | 2673-7647 | 2022-06-01 | 2 | 10.3389/fbinf.2022.914435 | 914435
Can We Convert Genotype Sequences Into Images for Cases/Controls Classification?
Muhammad Muneeb [0, 1]; Samuel F. Feng [2, 3]; Andreas Henschel [4, 5]
[0] Department of Mathematics, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
[1] Department of Electrical Engineering and Computer Science, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
[2] Department of Mathematics, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
[3] Research and Data Intelligence Support Center R-DISC, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
[4] Department of Electrical Engineering and Computer Science, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
[5] Research and Data Intelligence Support Center R-DISC, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
Converting genotype sequences into images offers advantages, such as genotype data visualization, classification, and comparison of genotype sequences. This study converted genotype sequences into images, applied two-dimensional convolutional neural networks for case/control classification, and compared the results with the one-dimensional convolutional neural network. Surprisingly, the average accuracy of multiple runs of 2DCNN was 0.86, and that of 1DCNN was 0.89, yielding a difference of 0.03, which suggests that even the 2DCNN algorithm works on genotype sequences. Moreover, the results generated by the 2DCNN exhibited less variation than those generated by the 1DCNN, thereby offering greater stability. The purpose of this study is to draw the research community’s attention to explore encoding schemes for genotype data and machine learning algorithms that can be used on genotype data by changing the representation of the genotype data for case/control classification.
https://www.frontiersin.org/articles/10.3389/fbinf.2022.914435/full
genotype-phenotype prediction; genetics; bioinformatics; applied machine learning; image classification
title Can We Convert Genotype Sequences Into Images for Cases/Controls Classification?
title_sort can we convert genotype sequences into images for cases controls classification
topic genotype-phenotype prediction
genetics
bioinformatics
applied machine learning
image classification
url https://www.frontiersin.org/articles/10.3389/fbinf.2022.914435/full
work_keys_str_mv AT muhammadmuneeb canweconvertgenotypesequencesintoimagesforcasescontrolsclassification
AT samuelffeng canweconvertgenotypesequencesintoimagesforcasescontrolsclassification
AT andreashenschel canweconvertgenotypesequencesintoimagesforcasescontrolsclassification