Hidden Compositional Heterogeneity of Fish Chromosomes in the Era of Polished Genome Assemblies

Bibliographic Details
Main Authors: Marta Vohnoutová, Lucia Žifčáková, Radka Symonová
Format: Article
Language: English
Published: MDPI AG 2023-03-01
Series: Fishes
Subjects: chromosome visualisation; virtual karyotyping; GC content; repetitive sequences
Online Access: https://www.mdpi.com/2410-3888/8/4/185
author Marta Vohnoutová
Lucia Žifčáková
Radka Symonová
collection DOAJ
description Fish chromosomes are considered homogeneous in their AT/GC nucleotide composition, and banding patterns that would enable identification of homologs are largely missing. Cytogenomic approaches try to compensate for this by virtual karyotyping, but they depend on the quality of the available genome assemblies. Recently, soft-masked genome assemblies that combine costly and arduous long- and short-read sequencing with new-generation assemblers became available for two teleost fish species, climbing perch (Anabas testudineus) and channel bull blenny (Cottoperca gobio). Soft-masking converts repetitive sequences in a genome assembly to lower-case letters and leaves unique sequences in upper case. This enables investigators to assess the proportion of guanine and cytosine nucleotides (GC%) of transposable elements as an indicator of AT/GC homogenisation in fish. We have developed a new version of our Python tool Evan, which uses chromosome-level genome assemblies and combines the profiles of GC% and the proportion of repeats (rep%) along chromosomes. The profiles of both species showed clear and abrupt but small-scale fluctuations in GC% along otherwise compositionally homogenised sequences. Our study also highlights the key role of the sliding window size in determining the resolution of GC% profiling. While the quality of the genome assemblies appeared sufficient for GC%/rep% profiling, more effective repeat masking is needed to determine the extent to which repeats compositionally homogenise fish genomes.
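The windowed GC%/rep% profiling mentioned in the description can be illustrated with a minimal sketch. This is not the Evan implementation, which this record does not describe; the function name `window_profile`, the use of non-overlapping windows, and the exclusion of ambiguous bases (N) from the denominator are all assumptions made for illustration.

```python
def window_profile(seq, window):
    """Return one (start, gc_percent, rep_percent) tuple per non-overlapping window.

    Hypothetical helper, not part of Evan. Assumes a soft-masked sequence:
    lower-case letters mark repeats, upper-case letters mark unique sequence.
    """
    profile = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        # Count only unambiguous bases so that runs of N do not dilute the percentages.
        called = sum(chunk.count(base) for base in "ACGTacgt")
        if called == 0:
            profile.append((start, None, None))
            continue
        gc = sum(chunk.count(base) for base in "GCgc")
        rep = sum(chunk.count(base) for base in "acgt")  # lower case = masked repeat
        profile.append((start, 100.0 * gc / called, 100.0 * rep / called))
    return profile


# Toy sequence: a balanced unique stretch, an AT-only repeat, a GC-only unique stretch.
toy = "ACGTACGTaaaattttGGGGCCCC"
for row in window_profile(toy, 8):
    print(row)
```

With a window of 8 the three stretches of the toy sequence are resolved separately, whereas a single window of 24 would average them into one value. This is the sense in which the window size sets the resolution of the profile.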
first_indexed 2024-03-11T05:02:18Z
format Article
id doaj.art-9e4c38645f794509bbab919fae977d27
institution Directory of Open Access Journals
issn 2410-3888
language English
last_indexed 2024-03-11T05:02:18Z
publishDate 2023-03-01
publisher MDPI AG
record_format Article
series Fishes
spelling doaj.art-9e4c38645f794509bbab919fae977d27
2023-11-17T19:12:26Z
eng
MDPI AG
Fishes
2410-3888
2023-03-01
8
4
185
10.3390/fishes8040185
Hidden Compositional Heterogeneity of Fish Chromosomes in the Era of Polished Genome Assemblies
Marta Vohnoutová [0]
Lucia Žifčáková [1]
Radka Symonová [2]
[0] Department of Computer Science, Faculty of Science, University of South Bohemia in České Budějovice, 370-05 České Budějovice, Czech Republic
[1] Okinawa Institute of Science & Technology Graduate University, 1919-1 Tancha, Onna-son, Okinawa 904-0495, Japan
[2] Department of Computer Science, Faculty of Science, University of South Bohemia in České Budějovice, 370-05 České Budějovice, Czech Republic
https://www.mdpi.com/2410-3888/8/4/185
chromosome visualisation
virtual karyotyping
GC content
repetitive sequences
title Hidden Compositional Heterogeneity of Fish Chromosomes in the Era of Polished Genome Assemblies
topic chromosome visualisation
virtual karyotyping
GC content
repetitive sequences
url https://www.mdpi.com/2410-3888/8/4/185