DomHR: accurately identifying domain boundaries in proteins using a hinge region strategy.

Bibliographic Details
Main Authors: Xiao-yan Zhang, Long-jian Lu, Qi Song, Qian-qian Yang, Da-peng Li, Jiang-ming Sun, Tong-hua Li, Pei-sheng Cong
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23593247/?tool=EBI
collection DOAJ
description Motivation: The precise prediction of protein domains, which are the structural, functional and evolutionary units of proteins, has been a research focus in recent years. Although many methods have been presented for predicting protein domains and boundaries, the accuracy of predictions could be improved. Results: In this study we present a novel approach, DomHR, which is an accurate predictor of protein domain boundaries based on a creative hinge region strategy. A hinge region was defined as a segment of amino acids that covers part of a domain region and a boundary region. We developed a strategy to construct profiles of domain-hinge-boundary (DHB) features generated by sequence-domain/hinge/boundary alignment against a database of known domain structures. The DHB features had three elements: normalized domain, hinge, and boundary probabilities. The DHB features were used as input to identify domain boundaries in a sequence. DomHR used a nonredundant dataset as the training set, the DHB features and predicted shape strings as features, and a conditional random field as the classification algorithm. In predicted hinge regions, each residue was assigned to either a domain or a boundary according to a decision threshold. After the decision thresholds were optimized, DomHR was evaluated by cross-validation, large-scale prediction, an independent test and CASP (Critical Assessment of Techniques for Protein Structure Prediction) tests. All results confirmed that DomHR outperformed other well-established, publicly available domain boundary predictors in prediction accuracy. Availability: DomHR is available at http://cal.tongji.edu.cn/domain/.
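The two steps in the description that are concrete enough to illustrate are the normalization of the three DHB elements and the threshold decision inside predicted hinge regions. The following is a minimal sketch under stated assumptions, not the DomHR implementation: the function names, the use of raw alignment counts, the uniform fallback for empty counts, the per-residue boundary score, and the threshold value of 0.5 are all hypothetical, since the record does not specify them.

```python
from typing import List, Sequence, Tuple


def normalize_dhb(counts: Sequence[float]) -> Tuple[float, ...]:
    """Scale hypothetical (domain, hinge, boundary) counts so they sum to 1.

    Assumption: a residue with no alignment evidence gets a uniform profile.
    """
    total = sum(counts)
    if total == 0:
        return (1 / 3, 1 / 3, 1 / 3)
    return tuple(c / total for c in counts)


def resolve_hinge(labels: Sequence[str],
                  boundary_scores: Sequence[float],
                  threshold: float = 0.5) -> List[str]:
    """Relabel each hinge residue ('H') as boundary ('B') or domain ('D').

    Assumption: a hinge residue becomes 'B' when its boundary score reaches
    the threshold; residues already labeled 'D' or 'B' are left unchanged.
    """
    resolved = []
    for label, score in zip(labels, boundary_scores):
        if label == "H":
            resolved.append("B" if score >= threshold else "D")
        else:
            resolved.append(label)
    return resolved


if __name__ == "__main__":
    print(normalize_dhb((2, 1, 1)))
    print(resolve_hinge(list("DDHHBB"), [0.1, 0.2, 0.3, 0.7, 0.9, 0.8]))
```

In this sketch the threshold is a single tunable number; the description states only that decision thresholds were optimized before evaluation, so how they were chosen is left open here.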
first_indexed 2024-12-14T08:32:44Z
format Article
id doaj.art-738f2c10fc8d443db69807c9c2fc54b5
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-14T08:32:44Z
publishDate 2013-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-738f2c10fc8d443db69807c9c2fc54b5; 2022-12-21T23:09:29Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2013-01-01; 8; 4; e60559; 10.1371/journal.pone.0060559
title DomHR: accurately identifying domain boundaries in proteins using a hinge region strategy.
url https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23593247/?tool=EBI