Detecting the borders between coding and non-coding DNA regions in prokaryotes based on recursive segmentation and nucleotide doublets statistics

Abstract. Background: Detecting the borders between coding and non-coding regions is an essential step in genome annotation, and information entropy measures are useful for describing the signals in a genome sequence. However, the accuracies of previous...

Bibliographic Details
Main Authors: Deng Suping, Shi Yixiang, Yuan Liyun, Li Yixue, Ding Guohui
Format: Article
Language: English
Published: BMC 2012-12-01
Series: BMC Genomics
author Deng Suping
Shi Yixiang
Yuan Liyun
Li Yixue
Ding Guohui
collection DOAJ
description Abstract
Background: Detecting the borders between coding and non-coding regions is an essential step in genome annotation, and information entropy measures are useful for describing the signals in a genome sequence. However, the accuracy of previous border-finding methods based on entropic segmentation still needs to be improved.
Methods: In this study, we first applied a new recursive entropic segmentation method to DNA sequences to obtain preliminary significant cuts. A 22-symbol alphabet is used to capture the differential composition of nucleotide doublets and stop codon patterns along the three phases in both DNA strands. This process requires no prior training datasets.
Results: Compared with previous segmentation methods, the experimental results on three bacterial genomes, Rickettsia prowazekii, Borrelia burgdorferi and E. coli, show that our approach improves the accuracy of finding the borders between coding and non-coding regions in DNA sequences.
Conclusions: This paper presents a new segmentation method for prokaryotes based on Jensen-Rényi divergence with a 22-symbol alphabet. For the three bacterial genomes, compared with the A12_JR method, our method raised the accuracy of finding the borders between protein-coding and non-coding regions in DNA sequences.
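As a rough illustration of the recursive entropic segmentation described in the Methods above, the following Python sketch splits a sequence at the position that maximizes the Jensen-Rényi divergence between the two sides, then recurses into each side. It is not the authors' implementation: it uses the plain 4-letter nucleotide alphabet instead of the 22-symbol alphabet, and the function names (`segment`, `best_cut`, `jensen_renyi`), the order alpha = 0.5, the fixed divergence threshold, and the minimum segment length are all assumptions made for illustration. The significance criterion actually used in the paper is not given in this record.

```python
import math
from collections import Counter


def renyi_entropy(counts, total, alpha=0.5):
    """Renyi entropy of order alpha (0 < alpha < 1) from symbol counts."""
    s = sum((c / total) ** alpha for c in counts.values() if c > 0)
    return math.log(s) / (1.0 - alpha)


def jensen_renyi(left, right, alpha=0.5):
    """Jensen-Renyi divergence between two count tables, weighted by length."""
    n_l, n_r = sum(left.values()), sum(right.values())
    n = n_l + n_r
    merged = left + right  # Counter addition keeps only positive counts
    return (renyi_entropy(merged, n, alpha)
            - (n_l / n) * renyi_entropy(left, n_l, alpha)
            - (n_r / n) * renyi_entropy(right, n_r, alpha))


def best_cut(seq, start, end, min_len, alpha):
    """Return (position, divergence) of the best cut in seq[start:end]."""
    left = Counter()
    right = Counter(seq[start:end])
    best_pos, best_div = None, 0.0
    for i in range(start, end - 1):
        # Move one symbol from the right segment to the left segment.
        left[seq[i]] += 1
        right[seq[i]] -= 1
        cut = i + 1
        if cut - start < min_len or end - cut < min_len:
            continue
        d = jensen_renyi(left, right, alpha)
        if d > best_div:
            best_pos, best_div = cut, d
    return best_pos, best_div


def segment(seq, threshold=0.05, min_len=20, alpha=0.5):
    """Recursively cut seq; threshold is an assumed stand-in for a
    significance test, not the criterion used in the paper."""
    cuts = []

    def recurse(start, end):
        pos, div = best_cut(seq, start, end, min_len, alpha)
        if pos is None or div < threshold:
            return
        recurse(start, pos)
        cuts.append(pos)
        recurse(pos, end)

    recurse(0, len(seq))
    return cuts


if __name__ == "__main__":
    # Toy sequence with one compositional border at position 200.
    print(segment("A" * 200 + "G" * 200))
```

On the toy input the only reported cut is at position 200, the border between the two homogeneous halves. A real application would replace the 4-letter symbols with the doublet and stop-codon encoding the abstract mentions and would calibrate the stopping rule statistically.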
first_indexed 2024-04-13T11:26:26Z
format Article
id doaj.art-728bb0ef07b44d578fd27308ea2e55c2
institution Directory Open Access Journal
issn 1471-2164
language English
last_indexed 2024-04-13T11:26:26Z
publishDate 2012-12-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-728bb0ef07b44d578fd27308ea2e55c2 | 2022-12-22T02:48:41Z | eng | BMC | BMC Genomics | 1471-2164 | 2012-12-01 | 13 | Suppl 8 | S19 | 10.1186/1471-2164-13-S8-S19
title Detecting the borders between coding and non-coding DNA regions in prokaryotes based on recursive segmentation and nucleotide doublets statistics