A novel hierarchical clustering algorithm for gene sequences
Abstract. Background: Clustering DNA sequences into functional groups is an important problem in bioinformatics. We propose a new alignment-free algorithm, mBKM, based on a new distance measure, DMk, for clustering gene sequences. This method transforms D...
Main Authors: | Wei Dan, Jiang Qingshan, Wei Yanjie, Wang Shengrui |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2012-07-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/13/174 |
_version_ | 1819115181168394240 |
---|---|
author | Wei Dan; Jiang Qingshan; Wei Yanjie; Wang Shengrui |
author_facet | Wei Dan; Jiang Qingshan; Wei Yanjie; Wang Shengrui |
author_sort | Wei Dan |
collection | DOAJ |
description | Abstract. Background: Clustering DNA sequences into functional groups is an important problem in bioinformatics. We propose a new alignment-free algorithm, mBKM, based on a new distance measure, DMk, for clustering gene sequences. This method transforms DNA sequences into the feature vectors which contain the occurrence, location and order relation of k-tuples in DNA sequence. Afterwards, a hierarchical procedure is applied to clustering DNA sequences based on the feature vectors. Results: The proposed distance measure and clustering method are evaluated by clustering functionally related genes and by phylogenetic analysis. This method is also compared with BlastClust, CD-HIT-EST and some others. The experimental results show our method is effective in classifying DNA sequences with similar biological characteristics and in discovering the underlying relationship among the sequences. Conclusions: We introduced a novel clustering algorithm which is based on a new sequence similarity measure. It is effective in classifying DNA sequences with similar biological characteristics and in discovering the relationship among the sequences. |
first_indexed | 2024-12-22T04:57:06Z |
format | Article |
id | doaj.art-51778ac00ffd4d6bb0fb4c31178c2f86 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-12-22T04:57:06Z |
publishDate | 2012-07-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-51778ac00ffd4d6bb0fb4c31178c2f86; 2022-12-21T18:38:20Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2012-07-01; 13; 1; 174; 10.1186/1471-2105-13-174; A novel hierarchical clustering algorithm for gene sequences; Wei Dan; Jiang Qingshan; Wei Yanjie; Wang Shengrui; http://www.biomedcentral.com/1471-2105/13/174 |
spellingShingle | Wei Dan; Jiang Qingshan; Wei Yanjie; Wang Shengrui; A novel hierarchical clustering algorithm for gene sequences; BMC Bioinformatics |
title | A novel hierarchical clustering algorithm for gene sequences |
title_full | A novel hierarchical clustering algorithm for gene sequences |
title_fullStr | A novel hierarchical clustering algorithm for gene sequences |
title_full_unstemmed | A novel hierarchical clustering algorithm for gene sequences |
title_short | A novel hierarchical clustering algorithm for gene sequences |
title_sort | novel hierarchical clustering algorithm for gene sequences |
url | http://www.biomedcentral.com/1471-2105/13/174 |
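
The abstract describes turning each DNA sequence into a feature vector built from the occurrence and location of k-tuples, then comparing sequences through those vectors. The sketch below illustrates that general idea only. It is not the paper's DMk measure or the mBKM algorithm, neither of which is defined in this record. The function names `ktuple_features` and `euclidean`, the use of normalized mean positions as the "location" feature, and the plain Euclidean distance are all assumptions made for illustration; the order relation mentioned in the abstract is not modeled.

```python
from collections import Counter
from itertools import product


def ktuple_features(seq, k=2):
    """Hypothetical feature vector: k-tuple frequencies followed by the
    normalized mean position of each k-tuple (0.0 if it never occurs)."""
    seq = seq.upper()
    # Fixed ordering of all 4**k tuples over the DNA alphabet.
    tuples = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter()
    pos_sum = Counter()
    for i in range(len(seq) - k + 1):
        t = seq[i:i + k]
        counts[t] += 1
        pos_sum[t] += i + 1  # 1-based start position of this occurrence
    n_windows = max(len(seq) - k + 1, 1)
    freq = [counts[t] / n_windows for t in tuples]
    mean_pos = [
        pos_sum[t] / counts[t] / len(seq) if counts[t] else 0.0
        for t in tuples
    ]
    return freq + mean_pos


def euclidean(u, v):
    """Plain Euclidean distance, used here as a stand-in for DMk."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


if __name__ == "__main__":
    a = ktuple_features("ACGTACGTACGT")
    b = ktuple_features("ACGTACGTACGA")
    c = ktuple_features("TTTTTTTTTTTT")
    print(euclidean(a, b), euclidean(a, c))
```

A pairwise distance matrix computed this way could then be passed to any hierarchical clustering routine; which procedure the authors actually use is not stated in this record.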