CNCA aligns small annotated genomes
Main Authors: | Jean-Noël Lorenzi, François Graner, Virginie Courtier-Orgogozo, Guillaume Achaz |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2024-02-01 |
Series: | BMC Bioinformatics |
Subjects: | Annotated genomes, Nucleotide alignment, Protein alignment |
Online Access: | https://doi.org/10.1186/s12859-024-05700-1 |
_version_ | 1797273053384671232 |
---|---|
author | Jean-Noël Lorenzi; François Graner; Virginie Courtier-Orgogozo; Guillaume Achaz |
author_facet | Jean-Noël Lorenzi; François Graner; Virginie Courtier-Orgogozo; Guillaume Achaz |
author_sort | Jean-Noël Lorenzi |
collection | DOAJ |
description | Abstract. Background: To explore the evolutionary history of sequences, a sequence alignment is a first and necessary step, and its quality is crucial. In the context of the study of the proximal origins of SARS-CoV-2 coronavirus, we wanted to construct an alignment of genomes closely related to SARS-CoV-2 using both coding and non-coding sequences. To our knowledge, there is no tool that can be used to construct this type of alignment, which motivated the creation of CNCA. Results: CNCA is a web tool that aligns annotated genomes from GenBank files. It generates a nucleotide alignment that is then updated based on the protein sequence alignment. The output final nucleotide alignment matches the protein alignment and guarantees no frameshift. CNCA was designed to align closely related small genome sequences up to 50 kb (typically viruses) for which the gene order is conserved. Conclusions: CNCA constructs multiple alignments of small genomes by integrating both coding and non-coding sequences. This preserves regions traditionally ignored in conventional back-translation methods, such as non-coding regions. |
first_indexed | 2024-03-07T14:38:00Z |
format | Article |
id | doaj.art-ace9c721d87c4923b2543bd3c4cd3250 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-03-07T14:38:00Z |
publishDate | 2024-02-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-ace9c721d87c4923b2543bd3c4cd3250; 2024-03-05T20:31:38Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2024-02-01; 25114; 10.1186/s12859-024-05700-1; CNCA aligns small annotated genomes; Jean-Noël Lorenzi (Université Paris Cité); François Graner (Université Paris Cité); Virginie Courtier-Orgogozo (Université Paris Cité); Guillaume Achaz (SMILE Group, Center for Interdisciplinary Research in Biology (CIRB), Collège de France); https://doi.org/10.1186/s12859-024-05700-1; Annotated genomes; Nucleotide alignment; Protein alignment |
spellingShingle | Jean-Noël Lorenzi François Graner Virginie Courtier-Orgogozo Guillaume Achaz CNCA aligns small annotated genomes BMC Bioinformatics Annotated genomes Nucleotide alignment Protein alignment |
title | CNCA aligns small annotated genomes |
title_full | CNCA aligns small annotated genomes |
title_fullStr | CNCA aligns small annotated genomes |
title_full_unstemmed | CNCA aligns small annotated genomes |
title_short | CNCA aligns small annotated genomes |
title_sort | cnca aligns small annotated genomes |
topic | Annotated genomes; Nucleotide alignment; Protein alignment |
url | https://doi.org/10.1186/s12859-024-05700-1 |
work_keys_str_mv | AT jeannoellorenzi cncaalignssmallannotatedgenomes AT francoisgraner cncaalignssmallannotatedgenomes AT virginiecourtierorgogozo cncaalignssmallannotatedgenomes AT guillaumeachaz cncaalignssmallannotatedgenomes |
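
The abstract states that the final nucleotide alignment matches the protein alignment and contains no frameshift. As a rough illustration of that property only, the Python sketch below shows the conventional back-translation step the abstract alludes to: each residue in a gapped protein row is replaced by its codon, and each protein gap by three nucleotide gaps, so every gap length in a coding region is a multiple of three. This is not CNCA's own code; the function name `thread_codons`, the toy sequences, and the assumption that each coding sequence is supplied without its stop codon are all hypothetical. Unlike CNCA, this sketch ignores non-coding regions entirely.

```python
def thread_codons(aligned_protein: str, cds: str) -> str:
    """Map an ungapped coding sequence onto its gapped protein alignment row.

    Hypothetical helper for illustration: assumes `cds` holds exactly one
    codon per residue of `aligned_protein` (stop codon already removed).
    """
    residues = aligned_protein.replace("-", "")
    if len(cds) != 3 * len(residues):
        raise ValueError("CDS length must be exactly three times the residue count")

    out = []
    pos = 0  # index of the next unused nucleotide in cds
    for symbol in aligned_protein:
        if symbol == "-":
            # A protein gap becomes a whole-codon gap, so the frame is kept.
            out.append("---")
        else:
            out.append(cds[pos:pos + 3])
            pos += 3
    return "".join(out)


# Toy input: two protein rows from an existing protein alignment,
# plus the matching ungapped coding sequences.
protein_rows = {"seqA": "MK-L", "seqB": "MKTL"}
cds_rows = {"seqA": "ATGAAACTG", "seqB": "ATGAAAACTCTG"}

codon_alignment = {
    name: thread_codons(protein_rows[name], cds_rows[name])
    for name in protein_rows
}

for name, row in codon_alignment.items():
    print(name, row)
```

Because gaps are only ever inserted as whole codons, every row of the result has the same length and translating any row column by column reproduces the corresponding protein row.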