rKOMICS: an R package for processing mitochondrial minicircle assemblies in population-scale genome projects

Bibliographic Details
Main Authors: Manon Geerts, Achim Schnaufer, Frederik Van den Broeck
Format: Article
Language: English
Published: BMC 2021-09-01
Series: BMC Bioinformatics
Subjects: Assembly, Clustering, Minicircles, Sequencing, Kinetoplast, Leishmania
Online Access: https://doi.org/10.1186/s12859-021-04384-1
author Manon Geerts
Achim Schnaufer
Frederik Van den Broeck
collection DOAJ
description Abstract. Background: The advent of population-scale genome projects has revolutionized our biological understanding of parasitic protozoa. However, while hundreds to thousands of nuclear genomes of parasitic protozoa have been generated and analyzed, information about the diversity, structure and evolution of their mitochondrial genomes remains fragmentary, mainly because of their extraordinary complexity. Indeed, unicellular flagellates of the order Kinetoplastida contain the most structurally complex mitochondrial genome of all eukaryotes, organized as a giant network of homogeneous maxicircles and heterogeneous minicircles. We recently developed KOMICS, an analysis toolkit that automates the assembly and circularization of the mitochondrial genomes of kinetoplastid parasites. While this tool overcomes the limitation of extracting mitochondrial assemblies from next-generation sequencing datasets, interpreting and visualizing the genetic (dis)similarity within and between samples remains a time-consuming process.
Results: Here, we present a new analysis toolkit, rKOMICS, to streamline the analysis of minicircle sequence diversity in population-scale genome projects. rKOMICS is a user-friendly R package that has simple installation requirements and is applicable to all 27 trypanosomatid genera. Once minicircle sequence alignments are generated, rKOMICS allows users to examine, summarize and visualize minicircle sequence diversity within and between samples through the analysis of minicircle sequence clusters. We showcase the functionalities of the (r)KOMICS tool suite using a whole-genome sequencing dataset from a recently published study on the history of diversification of the Leishmania braziliensis species complex in Peru. Analyses of population diversity and structure highlighted differences in minicircle sequence richness and composition between Leishmania subspecies, and between subpopulations within subspecies.
Conclusion: The rKOMICS package establishes a critical framework to manipulate, explore and extract biologically relevant information from mitochondrial minicircle assemblies in tens to hundreds of samples simultaneously and efficiently. This should facilitate research that aims to develop new molecular markers for identifying species-specific minicircles, or to study the ancestry of parasites for complementary insights into their evolutionary history.
first_indexed 2024-12-19T16:00:00Z
format Article
id doaj.art-8ef751dc6a4c491db8803820bfe28099
institution Directory of Open Access Journals
issn 1471-2105
language English
last_indexed 2024-12-19T16:00:00Z
publishDate 2021-09-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-8ef751dc6a4c491db8803820bfe28099; 2022-12-21T20:14:57Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2021-09-01; 22; 1; 1; 14; 10.1186/s12859-021-04384-1
Manon Geerts (Department of Biomedical Sciences, Institute of Tropical Medicine)
Achim Schnaufer (Institute of Immunology and Infection Research, University of Edinburgh)
Frederik Van den Broeck (Department of Biomedical Sciences, Institute of Tropical Medicine)
title rKOMICS: an R package for processing mitochondrial minicircle assemblies in population-scale genome projects
topic Assembly
Clustering
Minicircles
Sequencing
Kinetoplast
Leishmania
url https://doi.org/10.1186/s12859-021-04384-1