Genomic data integration tutorial, a plant case study
Abstract. Background: The ongoing evolution of Next Generation Sequencing (NGS) technologies has led to the production of genomic data on a massive scale. While tools for genomic data integration and analysis are becoming increasingly available, the conceptual and analytical complexities still rep...
Main Authors: | Emile Mardoc, Mamadou Dia Sow, Sébastien Déjean, Jérôme Salse |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2024-01-01 |
Series: | BMC Genomics |
Subjects: | |
Online Access: | https://doi.org/10.1186/s12864-023-09833-0 |
_version_ | 1827377441380761600 |
---|---|
author | Emile Mardoc Mamadou Dia Sow Sébastien Déjean Jérôme Salse |
author_sort | Emile Mardoc |
collection | DOAJ |
description | Abstract. Background: The ongoing evolution of Next Generation Sequencing (NGS) technologies has led to the production of genomic data on a massive scale. While tools for genomic data integration and analysis are becoming increasingly available, the conceptual and analytical complexities still represent a great challenge in many biological contexts. Results: To address this issue, we describe a six-step tutorial on best practices in genomic data integration, consisting of (1) designing a data matrix; (2) formulating a specific biological question toward data description, selection and prediction; (3) selecting a tool adapted to the targeted questions; (4) preprocessing the data; (5) conducting preliminary analysis; and finally (6) executing genomic data integration. Conclusion: The tutorial has been tested and demonstrated on publicly available genomic data generated from poplar (Populus L.), a woody plant model. We also developed a new graphical output for the unsupervised multi-block analysis, cimDiablo_v2, available at https://forgemia.inra.fr/umr-gdec/omics-integration-on-poplar, which allows the selection of master drivers in genomic data variation and interplay. |
first_indexed | 2024-03-08T12:40:14Z |
format | Article |
id | doaj.art-4436600db7f7408e976679a3951ed4e6 |
institution | Directory Open Access Journal |
issn | 1471-2164 |
language | English |
last_indexed | 2024-03-08T12:40:14Z |
publishDate | 2024-01-01 |
publisher | BMC |
record_format | Article |
series | BMC Genomics |
spelling | doaj.art-4436600db7f7408e976679a3951ed4e6; 2024-01-21T12:11:43Z; eng; BMC; BMC Genomics; 1471-2164; 2024-01-01; 251115; 10.1186/s12864-023-09833-0; Genomic data integration tutorial, a plant case study; Emile Mardoc (0), Mamadou Dia Sow (1), Sébastien Déjean (2), Jérôme Salse (3); (0) UCA-INRAE UMR 1095 Genetics, Diversity and Ecophysiology of Cereals (GDEC); (1) UCA-INRAE UMR 1095 Genetics, Diversity and Ecophysiology of Cereals (GDEC); (2) Institut de Mathématiques de Toulouse, UMR 5219, Université de Toulouse, CNRS, Université Paul Sabatier; (3) UCA-INRAE UMR 1095 Genetics, Diversity and Ecophysiology of Cereals (GDEC); https://doi.org/10.1186/s12864-023-09833-0; Omics Integration System Biology |
title | Genomic data integration tutorial, a plant case study |
topic | Omics Integration System Biology |
url | https://doi.org/10.1186/s12864-023-09833-0 |