“METAGENOTE: a simplified web platform for metadata annotation of genomic samples and streamlined submission to NCBI’s sequence read archive”

Abstract Background The improvements in genomics methods coupled with readily accessible high-throughput sequencing have contributed to our understanding of microbial species, metagenomes, infectious diseases and more. To maximize the impact of these genomics studies, it is important that data from...

Bibliographic Details
Main Authors: Mariam Quiñones, David T. Liou, Conrad Shyu, Wongyu Kim, Ivan Vujkovic-Cvijin, Yasmine Belkaid, Darrell E. Hurt
Format: Article
Language: English
Published: BMC 2020-09-01
Series: BMC Bioinformatics
Subjects: Metadata, Sequence read archive, Ontologies, Genomic samples, Web platform
Online Access: http://link.springer.com/article/10.1186/s12859-020-03694-0
author Mariam Quiñones
David T. Liou
Conrad Shyu
Wongyu Kim
Ivan Vujkovic-Cvijin
Yasmine Belkaid
Darrell E. Hurt
collection DOAJ
description Abstract
Background: The improvements in genomics methods, coupled with readily accessible high-throughput sequencing, have contributed to our understanding of microbial species, metagenomes, infectious diseases and more. To maximize the impact of these genomics studies, it is important that data from biological samples become publicly available with standardized metadata. The availability of data in public archives offers the hope that greater insights can be obtained through integration with multi-omics data, reproducibility of published studies, or meta-analyses of large, diverse datasets. These datasets should include a description of the host, organism, environmental source of the specimen, spatial-temporal information and other relevant metadata. Unfortunately, these attributes are often missing, and when present they show inconsistencies in the use of metadata standards and ontologies.
Results: METAGENOTE (https://metagenote.niaid.nih.gov) is a web portal that facilitates the annotation of samples from genomic studies and streamlines the submission of sequencing files and metadata to the Sequence Read Archive (SRA) (Leinonen R, et al, Nucleic Acids Res, 39:D19-21, 2011) for public access. The platform offers a wide selection of packages for different types of biological and experimental studies, with a special emphasis on the standardization of metadata reporting. These packages follow the guidelines of the MIxS standards developed by the Genomic Standards Consortium (GSC) and adopted by the three partners of the International Nucleotide Sequence Database Collaboration (INSDC) (Cochrane G, et al, Nucleic Acids Res, 44:D48-50, 2016): the National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EBI) and the DNA Data Bank of Japan (DDBJ). METAGENOTE then compiles, validates and manages the submission through an easy-to-use web interface, minimizing submission errors and eliminating the need to submit sequencing files via a separate file transfer mechanism.
Conclusions: METAGENOTE is a public resource that focuses on simplifying the annotation and submission of data with its corresponding metadata. Users of METAGENOTE will benefit from the easy-to-use annotation interface but, most importantly, will be encouraged to publish metadata following standards and ontologies that make the public data available for reuse.
first_indexed 2024-12-22T12:49:59Z
format Article
id doaj.art-45dc95d6bdfd486d8eb4e06a9835bbf0
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-22T12:49:59Z
publishDate 2020-09-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-45dc95d6bdfd486d8eb4e06a9835bbf0 | 2022-12-21T18:25:14Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2020-09-01 | volume 21, issue 1, pages 1-12 | 10.1186/s12859-020-03694-0
Author affiliations:
Mariam Quiñones, David T. Liou, Conrad Shyu, Wongyu Kim, Darrell E. Hurt: Bioinformatics and Computational Biosciences Branch, Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health
Ivan Vujkovic-Cvijin, Yasmine Belkaid: Metaorganism Immunity Section, Laboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health
title “METAGENOTE: a simplified web platform for metadata annotation of genomic samples and streamlined submission to NCBI’s sequence read archive”
topic Metadata
Sequence read archive
Ontologies
Genomic samples
Web platform
url http://link.springer.com/article/10.1186/s12859-020-03694-0