Applications and Prospect Analysis of Deep Learning in Plant Genomics and Crop Breeding

[Purpose/Significance] Advances in single-cell sequencing and high-throughput technology have made it possible for plant genomics to accumulate large quantities of data describing multidimensional genomic-wide molecular phenotypes at low cost. As powerful data mining tools, deep learning techniques...


Bibliographic Details
Main Authors: HOU Xiangying, CUI Yunpeng, LIU Juan
Format: Article
Language: Chinese (zho)
Published: Editorial Department of Journal of Library and Information Science in Agriculture 2022-08-01
Series: Nongye tushu qingbao xuebao
Subjects: plant genomics; crop breeding; deep learning; graph deep learning; review
Online Access: http://nytsqb.aiijournal.com/fileup/1002-1248/PDF/1002-1248-2022-34-8-4.pdf
_version_ 1811232695615750144
collection DOAJ
description [Purpose/Significance] Advances in single-cell sequencing and high-throughput technologies have made it possible for plant genomics to accumulate, at low cost, large quantities of data describing multidimensional genome-wide molecular phenotypes. As powerful data mining tools, deep learning techniques can be used to predict and interpret these molecular phenotypes. Recent studies show that deep learning yields significant results in plant genomics and crop breeding research. However, a comprehensive review of deep learning applications in plant genomics is lacking.
[Method/Process] Deep learning models in genomics usually take biological sequences as predictor variables and molecular phenotypes as target variables. We describe the workflow in four stages: input data pre-processing (retrieval, encoding, and splitting); model construction and training (selection of model architecture and hyperparameters); model evaluation; and interpretability. The paper first introduces the background of deep learning approaches, including recent graph neural networks. It then discusses two prominent questions at the intersection of genomics and deep learning, with respect to gene and protein characterization: 1) how to model the flow of information from plant genomic DNA sequences to molecular phenotypes; and 2) how deep learning models can be used to identify functional variation in natural populations. It also summarizes the current status of deep learning applications in related areas: DNA and gene characterization, interpretability of deep learning in genomics, graph neural networks in genomics, genomic variation research, protein prediction (including AlphaFold), crop breeding research, and unsupervised learning in genomics and protein characterization.
[Results/Conclusions] This article summarizes how traditional deep learning algorithms, graph deep learning, generative adversarial networks, and interpretable AI are applied in current research to address these two questions. Finally, it discusses the prospects for deep learning in future plant genomics research and crop improvement. Overall, deep learning has outperformed conventional methods in many genomics research directions and has already produced early applications of scientific and economic significance. Deep learning offers two distinct advantages: 1) end-to-end learning, which can integrate multiple pre-processing steps into a single model; and 2) multimodal data processing, which can handle the extremely heterogeneous data found in genomics. As algorithms become more accurate, advances in deep learning have the potential to open new research perspectives in genomics and crop breeding and to facilitate larger-scale association studies of both phenotypes and genotypes.
first_indexed 2024-04-12T11:07:16Z
id doaj.art-d8c0c6d8cd6b4b40a91bc88d713c6f3e
institution Directory Open Access Journal
issn 1002-1248
last_indexed 2024-04-12T11:07:16Z
record_format Article
spelling doaj.art-d8c0c6d8cd6b4b40a91bc88d713c6f3e | 2022-12-22T03:35:42Z | zho | Editorial Department of Journal of Library and Information Science in Agriculture | Nongye tushu qingbao xuebao | 1002-1248 | 2022-08-01 | 34 | 8 | 4 | 18 | 10.13998/j.cnki.issn1002-1248.22-0101 | Applications and Prospect Analysis of Deep Learning in Plant Genomics and Crop Breeding | HOU Xiangying, CUI Yunpeng, LIU Juan | 1. Zibo Academy of Agricultural Sciences, Zibo 255020; 2. Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Institute of Agricultural Information, Chinese Academy of Agricultural Sciences, Beijing 100081 | http://nytsqb.aiijournal.com/fileup/1002-1248/PDF/1002-1248-2022-34-8-4.pdf | plant genomics|crop breeding|deep learning|graph deep learning|review
title Applications and Prospect Analysis of Deep Learning in Plant Genomics and Crop Breeding
topic plant genomics|crop breeding|deep learning|graph deep learning|review