CLIMP: Clustering Motifs via Maximal Cliques with Parallel Computing Design.


Bibliographic Details
Main Authors: Shaoqiang Zhang, Yong Chen
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2016-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4972426?pdf=render
_version_ 1819110314867687424
author Shaoqiang Zhang
Yong Chen
collection DOAJ
description A motif is a set of conserved binding sites recognized by a transcription factor; many comparative genomics applications find motifs by identifying over-represented sequence segments. When numerous putative motifs are predicted from a collection of genome-wide data, their similarity data can be represented as a large graph in which the motifs are connected to one another. An efficient clustering algorithm is then needed to group the motifs that belong together, separate the motifs that belong to different groups, and remove spurious ones. This work proposes CLIMP, a new motif clustering algorithm that uses maximal cliques and is sped up by parallelizing its program. CLIMP is compared with two other high-performance algorithms on three datasets: a synthetic motif dataset from the JASPAR database, a set of putative motifs from a phylogenetic footprinting dataset, and a set of putative motifs from a ChIP dataset. The results show that CLIMP mostly outperforms the two algorithms on all three datasets, so it can be a useful complement to the clustering procedures in some genome-wide motif prediction pipelines. CLIMP is available at http://sqzhang.cn/climp.html.
first_indexed 2024-12-22T03:39:45Z
format Article
id doaj.art-9892193aa6bd49189924ec5138ac3adf
institution Directory of Open Access Journals
issn 1932-6203
language English
last_indexed 2024-12-22T03:39:45Z
publishDate 2016-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-9892193aa6bd49189924ec5138ac3adf; 2022-12-21T18:40:16Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2016-01-01; 11(8): e0160435; 10.1371/journal.pone.0160435; CLIMP: Clustering Motifs via Maximal Cliques with Parallel Computing Design.; Shaoqiang Zhang; Yong Chen; http://europepmc.org/articles/PMC4972426?pdf=render
title CLIMP: Clustering Motifs via Maximal Cliques with Parallel Computing Design.
url http://europepmc.org/articles/PMC4972426?pdf=render
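
The description in this record outlines clustering motifs by finding maximal cliques in a motif similarity graph. The following is a minimal Python sketch of that general idea, not the CLIMP implementation: it enumerates maximal cliques with the Bron-Kerbosch algorithm (with pivoting), then greedily takes the largest remaining clique as a cluster. The function names, the `min_size` parameter, the greedy selection rule, and the example graph are all illustrative assumptions; the record does not say how CLIMP selects cliques or how it is parallelized.

```python
from typing import Dict, List, Set, Tuple

# Adjacency sets: motif name -> names of motifs judged similar to it.
Graph = Dict[str, Set[str]]


def maximal_cliques(graph: Graph) -> List[Set[str]]:
    """Enumerate all maximal cliques using Bron-Kerbosch with pivoting."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(set(r))
            return
        # Pivot on the vertex with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            expand(r | {v}, p & graph[v], x & graph[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(graph), set())
    return cliques


def greedy_clique_clusters(
    graph: Graph, min_size: int = 2
) -> Tuple[List[Set[str]], Set[str]]:
    """Repeatedly take the largest maximal clique as a cluster (assumed rule).

    Vertices left over once no clique of at least min_size remains are
    returned separately as candidate spurious motifs.
    """
    remaining = {v: set(nb) for v, nb in graph.items()}  # work on a copy
    clusters: List[Set[str]] = []
    while remaining:
        # Largest clique first; ties broken by sorted names for determinism.
        best = min(maximal_cliques(remaining),
                   key=lambda c: (-len(c), sorted(c)))
        if len(best) < min_size:
            break
        clusters.append(best)
        for v in best:
            del remaining[v]
        for nb in remaining.values():
            nb -= best
    return clusters, set(remaining)


# Hypothetical similarity graph over six made-up motif names.
example: Graph = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E"}, "E": {"D"}, "F": set(),
}
clusters, spurious = greedy_clique_clusters(example)
print([sorted(c) for c in clusters], sorted(spurious))
```

Recomputing all maximal cliques after each removal keeps the sketch short but is costly on large graphs; a practical tool would likely avoid that, and the per-branch independence of the clique search is one place where parallel execution could plausibly be applied.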