Mitochondrial genome plasticity of mammalian species

Abstract There is an ongoing process in which mitochondrial sequences are being integrated into the nuclear genome. The importance of these sequences has already been demonstrated in cancer biology, forensics, phylogenetic studies and the evolution of eukaryotic genetic information. Human and nume...

Bibliographic Details
Main Authors: Bálint Biró, Zoltán Gál, Zsófia Fekete, Eszter Klecska, Orsolya Ivett Hoffmann
Format: Article
Language: English
Published: BMC 2024-03-01
Series: BMC Genomics
Subjects: NUMT, Mammals, Genome, Bioinformatics, Machine learning
Online Access: https://doi.org/10.1186/s12864-024-10201-9
_version_ 1797259527748321280
author Bálint Biró
Zoltán Gál
Zsófia Fekete
Eszter Klecska
Orsolya Ivett Hoffmann
collection DOAJ
description Abstract There is an ongoing process in which mitochondrial sequences are being integrated into the nuclear genome. The importance of these sequences has already been demonstrated in cancer biology, forensics, phylogenetic studies and the evolution of eukaryotic genetic information. The genomes of humans and numerous model organisms have been described from the point of view of these sequences. Furthermore, recent studies have reported on the patterns of these nuclear localised mitochondrial sequences in different taxa. However, the results of previous studies are difficult to compare because of the lack of standardised methods and/or the small number of genomes used. Therefore, our primary goal in this paper is to establish a uniform mining pipeline to explore these nuclear localised mitochondrial sequences. Our results show that the frequency of several repetitive elements is higher than expected in the flanking regions of these sequences. A machine learning model reveals that the repetitive elements and various structural characteristics of the flanking regions are highly influential during the integration process. In this paper, we introduce a general mining pipeline for all mammalian genomes. The workflow is publicly available and is expected to serve as a validated baseline for future research in this field. Using what is, to our current knowledge, the largest dataset, we confirm the widespread opinion that structural circumstances and events corresponding to repetitive elements are highly significant. An accurate model has also been trained to predict these sequences and their corresponding flanking regions.
first_indexed 2024-04-24T23:10:51Z
format Article
id doaj.art-c935ce52ef0943cdb8f05a8bdaf184e7
institution Directory of Open Access Journals
issn 1471-2164
language English
last_indexed 2024-04-24T23:10:51Z
publishDate 2024-03-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-c935ce52ef0943cdb8f05a8bdaf184e7 | 2024-03-17T12:16:32Z | eng | BMC | BMC Genomics | 1471-2164 | 2024-03-01 | 25 | 1 | 1 | 14 | 10.1186/s12864-024-10201-9 | Mitochondrial genome plasticity of mammalian species
Bálint Biró (0): Agribiotechnology and Precision Breeding for Food Security National Laboratory, Department of Animal Biotechnology, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences
Zoltán Gál (1): Agribiotechnology and Precision Breeding for Food Security National Laboratory, Department of Animal Biotechnology, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences
Zsófia Fekete (2): Department of Genetics and Genomics, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences
Eszter Klecska (3): FamiCord Group, Krio Institute
Orsolya Ivett Hoffmann (4): Agribiotechnology and Precision Breeding for Food Security National Laboratory, Department of Animal Biotechnology, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences
https://doi.org/10.1186/s12864-024-10201-9
title Mitochondrial genome plasticity of mammalian species
topic NUMT
Mammals
Genome
Bioinformatics
Machine learning
url https://doi.org/10.1186/s12864-024-10201-9