Chaos game representation dataset of SARS-CoV-2 genome

Bibliographic Details
Main Authors: Barbosa, Raquel de M., Fernandes, Marcelo A.C.
Other Authors: Massachusetts Institute of Technology. Department of Chemical Engineering
Format: Article
Published: Elsevier BV 2020
Online Access: https://hdl.handle.net/1721.1/125023
Description: As of April 16, 2020, the novel coronavirus disease (COVID-19) had spread to more than 185 countries/regions, with more than 2,000,000 confirmed cases and more than 142,000 deaths. In bioinformatics, a crucial task is the analysis of the virus nucleotide sequences using approaches such as data stream, digital signal processing, and machine learning techniques and algorithms. To make these approaches feasible, the nucleotide sequence strings must first be transformed into a numerical representation. To that end, the dataset provides a chaos game representation (CGR) of SARS-CoV-2 virus nucleotide sequences. It contains the CGR of 100 instances of SARS-CoV-2 virus, 11,540 instances of other viruses from the Virus-Host DB dataset, and three instances of Riboviria viruses from NCBI (Betacoronavirus RaTG13, bat-SL-CoVZC45, and bat-SL-CoVZXC21).
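As an illustration of the string-to-numeric transformation mentioned in the description, the following is a minimal sketch of the classic chaos game representation. It is not taken from the dataset or the article: the function name `cgr_points`, the corner assignment in `CORNERS`, the starting point at the center of the unit square, and the choice to skip non-ACGT symbols are all assumptions made for this example, and the dataset itself may use a different convention.

```python
# Hypothetical corner layout for the unit square; conventions vary between
# publications, so this assignment is an assumption, not the dataset's.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(sequence, start=(0.5, 0.5)):
    """Return the list of CGR coordinates for a nucleotide string.

    Each point is the midpoint between the previous point and the corner
    assigned to the current nucleotide. Symbols outside A/C/G/T (for
    example the ambiguity code N) are skipped in this sketch.
    """
    x, y = start
    points = []
    for base in sequence.upper():
        corner = CORNERS.get(base)
        if corner is None:
            continue  # assumption: ignore ambiguous or unknown symbols
        x = (x + corner[0]) / 2.0
        y = (y + corner[1]) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    # Short made-up fragment, used only to show the call.
    for point in cgr_points("ACGT"):
        print(point)
```

Each nucleotide contributes one (x, y) pair, so a sequence of length n becomes an n-by-2 numeric array that signal processing or machine learning tools can consume directly.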
Citation: Barbosa, Raquel de M. and Marcelo A.C. Fernandes. "Chaos game representation dataset of SARS-CoV-2 genome." Data in Brief 30 (June 2020): 105618. © 2020 Elsevier
DOI: http://dx.doi.org/10.1016/j.dib.2020.105618
Journal: Data in Brief
ISSN: 2352-3409
Type: Article (http://purl.org/eprint/type/JournalArticle)
Record dates: 2020-05-05T18:16:39Z; 2020-06; 2020-04
License: Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/)
File format: application/pdf