<it>WordCluster</it>: detecting clusters of DNA words and genomic elements
<p>Abstract</p> <p>Background</p> <p>Many <it>k-</it>mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding...
Main Authors: | Oliver José L, Alganza Ángel M, Bernaola-Galván Pedro, Barturen Guillermo, Carpena Pedro, Hackenberg Michael |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2011-01-01 |
Series: | Algorithms for Molecular Biology |
Online Access: | http://www.almob.org/content/6/1/2 |
_version_ | 1818210106240663552 |
---|---|
author | Oliver José L, Alganza Ángel M, Bernaola-Galván Pedro, Barturen Guillermo, Carpena Pedro, Hackenberg Michael |
author_sort | Oliver José L |
collection | DOAJ |
description | <p>Abstract</p> <p>Background</p> <p>Many <it>k-</it>mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds.</p> <p>Results</p> <p>We introduce here an algorithm to detect clusters of DNA words (<it>k-</it>mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method as a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and the outside of the clusters. As another example, we used <it>WordCluster </it>to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome.</p> <p>Conclusions</p> <p><it>WordCluster </it>seems to predict biologically meaningful clusters of DNA words (<it>k-</it>mers) and genomic entities. The web server implementation of the method is available at <url>http://bioinfo2.ugr.es/wordCluster/wordCluster.php</url> and includes additional features such as the detection of co-localization with gene regions and an annotation enrichment tool for functional analysis of overlapped genes.</p> |
first_indexed | 2024-12-12T05:11:20Z |
format | Article |
id | doaj.art-f60c5491426c4bf2b5d70fedadae2138 |
institution | Directory Open Access Journal |
issn | 1748-7188 |
language | English |
last_indexed | 2024-12-12T05:11:20Z |
publishDate | 2011-01-01 |
publisher | BMC |
record_format | Article |
series | Algorithms for Molecular Biology |
title | <it>WordCluster</it>: detecting clusters of DNA words and genomic elements |
url | http://www.almob.org/content/6/1/2 |
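
The abstract describes clustering based on the distance between consecutive copies of a word plus an assigned statistical significance, but this record does not give the exact procedure. The following is only a minimal Python sketch of that general idea under stated assumptions: the function names, the fixed `max_gap` distance threshold, and the binomial tail used as a p-value are all illustrative choices made here, not the published WordCluster method or its web server API.

```python
from math import comb


def find_positions(seq, word):
    """Return start positions of every (possibly overlapping) copy of word in seq."""
    positions = []
    start = seq.find(word)
    while start != -1:
        positions.append(start)
        start = seq.find(word, start + 1)
    return positions


def chain_clusters(positions, max_gap, min_copies=2):
    """Group consecutive copies whose start-to-start distance is <= max_gap.

    max_gap is a hypothetical, user-supplied threshold (an assumption of this sketch).
    """
    clusters = []
    current = [positions[0]] if positions else []
    for prev, pos in zip(positions, positions[1:]):
        if pos - prev <= max_gap:
            current.append(pos)
        else:
            if len(current) >= min_copies:
                clusters.append(current)
            current = [pos]
    if len(current) >= min_copies:
        clusters.append(current)
    return clusters


def binomial_tail(n_trials, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n_trials, p)."""
    return sum(
        comb(n_trials, k) * p**k * (1 - p) ** (n_trials - k)
        for k in range(k_min, n_trials + 1)
    )


def score_clusters(seq, word, max_gap):
    """Return (first_start, last_start, copies, p_value) for each cluster.

    Assumed null model: each start site independently carries the word with
    probability p, estimated from the word's overall frequency in seq.
    """
    positions = find_positions(seq, word)
    p = len(positions) / max(1, len(seq) - len(word) + 1)
    results = []
    for cluster in chain_clusters(positions, max_gap):
        span = cluster[-1] - cluster[0] + 1  # candidate start sites inside the cluster
        results.append(
            (cluster[0], cluster[-1], len(cluster), binomial_tail(span, len(cluster), p))
        )
    return results


if __name__ == "__main__":
    toy = "CAGCAGCAG" + "T" * 50 + "CAG"
    for first, last, copies, p_value in score_clusters(toy, "CAG", max_gap=5):
        print(first, last, copies, p_value)
```

On the toy sequence, the three adjacent CAG copies are chained into one cluster, while the isolated copy at the end is left out because it has no neighbour within `max_gap`. A real analysis would also need to handle both strands (CAG/CTG) and a principled choice of the distance threshold, neither of which is attempted here.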