Statistical estimates of multiple transcription factors binding in the model plant genomes based on ChIP-seq data

The development of high-throughput genomic sequencing coupled with chromatin immunoprecipitation technologies allows the binding sites of transcription factor (TF) proteins to be studied at the genome scale. The growth in the volume of data on experimentally determined binding sites raises qualitatively...


Bibliographic Details
Main Authors: Dergilev Arthur I., Orlova Nina G., Dobrovolskaya Oxana B., Orlov Yuriy L.
Format: Article
Language: English
Published: De Gruyter 2021-12-01
Series: Journal of Integrative Bioinformatics
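Subjects: chip-seq, gene expression, plant genomes, regulatory gene networks, transcription factor binding sites, transcription regulation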
Online Access: https://doi.org/10.1515/jib-2020-0036
author Dergilev Arthur I.
Orlova Nina G.
Dobrovolskaya Oxana B.
Orlov Yuriy L.
collection DOAJ
description The development of high-throughput genomic sequencing coupled with chromatin immunoprecipitation technologies allows the binding sites of transcription factor (TF) proteins to be studied at the genome scale. The growth in the volume of data on experimentally determined binding sites raises qualitatively new problems for the analysis of gene expression regulation, the prediction of transcription factor target genes, and the reconstruction of regulatory gene networks. Genome regulation in plants remains insufficiently studied, although plants have complex molecular mechanisms regulating gene expression and the response to environmental stresses. It is important to develop new software tools for analyzing the location of TF binding sites and their clustering in plant genomes, for visualization, and for the subsequent statistical estimates. This study applies the analysis of multiple TF binding profiles to three evolutionarily distant model plant organisms. The construction and analysis of non-random ChIP-seq binding clusters of different TFs in mammalian embryonic stem cells were discussed earlier using similar bioinformatics approaches. Such clusters of TF binding sites may indicate gene regulatory regions, enhancers, and gene transcription regulatory hubs. They can be used for the analysis of gene promoters and as a basis for reconstructing transcription networks. We discuss the statistical estimates of TF binding site clusters in the model plant genomes. The distributions of the number of different TFs per binding cluster follow the same power-law distribution for all the genomes studied. The binding clusters in the Arabidopsis thaliana genome are discussed in detail.
first_indexed 2024-04-12T22:39:33Z
format Article
id doaj.art-9012103a1d174c71b5fcdf27a9a9b8f9
institution Directory Open Access Journal
issn 1613-4516
language English
last_indexed 2024-04-12T22:39:33Z
publishDate 2021-12-01
publisher De Gruyter
record_format Article
series Journal of Integrative Bioinformatics
spelling doaj.art-9012103a1d174c71b5fcdf27a9a9b8f9 2022-12-22T03:13:46Z eng De Gruyter Journal of Integrative Bioinformatics 1613-4516 2021-12-01 19133452 10.1515/jib-2020-0036
Dergilev Arthur I. (0), Orlova Nina G. (1), Dobrovolskaya Oxana B. (2), Orlov Yuriy L. (3)
0: Novosibirsk State University, 630090 Novosibirsk, Russia
1: Financial University under the Government of the Russian Federation, 125993 Moscow, Russia
2: Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, 630090 Novosibirsk, Russia
3: Novosibirsk State University, 630090 Novosibirsk, Russia
title Statistical estimates of multiple transcription factors binding in the model plant genomes based on ChIP-seq data
topic chip-seq
gene expression
plant genomes
regulatory gene networks
transcription factor binding sites
transcription regulation
url https://doi.org/10.1515/jib-2020-0036
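
The description in this record refers to clusters of binding sites from different TFs and to the distribution of the number of different TFs per cluster. As a rough illustration only, the Python sketch below groups toy peaks into clusters and tabulates that distribution. It is not the authors' software: the function names, the `max_gap` distance threshold, the gap-based clustering rule, and the toy peak coordinates are all assumptions made for this example.

```python
from collections import Counter


def cluster_peaks(peaks, max_gap=200):
    """Group (chrom, position, tf) peaks into clusters.

    Assumed rule (hypothetical, not taken from the article): consecutive
    peaks on the same chromosome join one cluster when they are at most
    `max_gap` base pairs apart.
    """
    clusters = []
    current = []
    for chrom, pos, tf in sorted(peaks):
        # Start a new cluster on a chromosome change or a large gap.
        if current and (chrom != current[-1][0] or pos - current[-1][1] > max_gap):
            clusters.append(current)
            current = []
        current.append((chrom, pos, tf))
    if current:
        clusters.append(current)
    return clusters


def tf_count_distribution(clusters):
    """Map k -> number of clusters that contain exactly k distinct TFs."""
    return Counter(len({tf for _, _, tf in cluster}) for cluster in clusters)


# Toy input: invented peak summits, not real ChIP-seq data.
peaks = [
    ("chr1", 100, "TF_A"),
    ("chr1", 150, "TF_B"),
    ("chr1", 260, "TF_C"),
    ("chr1", 5000, "TF_A"),
    ("chr2", 120, "TF_B"),
    ("chr2", 180, "TF_B"),
]

clusters = cluster_peaks(peaks, max_gap=200)
distribution = tf_count_distribution(clusters)
```

On real data, one way to inspect whether such a distribution resembles a power law is to plot the cluster counts against the number of distinct TFs on log-log axes and check for an approximately straight line.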