Summary: | Heat shock protein 20 (HSP20) serves as a chaperone and plays roles in numerous biological processes, but the codon usage bias (CUB) of its genes has remained unexplored. This study identified 140 <i>HSP20</i> genes from four cruciferous species, <i>Arabidopsis thaliana</i>, <i>Brassica napus</i>, <i>Brassica rapa</i>, and <i>Camelina sativa</i>, that were identified from the Ensembl plants database, and we subsequently investigated their CUB. As a result, the base composition analysis revealed that the overall GC content of <i>HSP20</i> genes was below 50%. The overall GC content significantly correlated with the constituents at three codon positions, implying that both mutation pressure and natural selection might contribute to the CUB. The relatively high ENc values suggested that the CUB of the <i>HSP20</i> genes in four cruciferous species was relatively weak. Subsequently, ENc exhibited a negative correlation with gene expression levels. Analyses, including ENc-plot analysis, neutral analysis, and PR2 bias, revealed that natural selection mainly shaped the CUB patterns of <i>HSP20</i> genes in these species. In addition, a total of 12 optimal codons (ΔRSCU > 0.08 and RSCU > 1) were identified across the four species. A neighbor-joining phylogenetic analysis based on coding sequences (CDS) showed that the 140 <i>HSP20</i> genes were strictly and distinctly clustered into 12 subfamilies. Principal component analysis and cluster analysis based on relative synonymous codon usage (RSCU) values supported the fact that the CUB pattern was consistent with the genetic relationship at the gene level and (or) species levels. These results will not only enrich the <i>HSP20</i> gene resource but also advance our understanding of the CUB of <i>HSP20</i> genes, which may underlie the theoretical basis for exploration of their genetic and evolutionary pattern.
|