Objective sequence-based subfamily classifications of mouse homeodomains reflect their in vitro DNA-binding preferences

Classifying proteins into subgroups with similar molecular function on the basis of sequence is an important step in deriving reliable functional annotations computationally. So far, however, available classification procedures have been evaluated against protein subgroups that are defined by expert...


Bibliographic Details
Main Authors: Santos, Miguel A., Turinsky, Andrei L., Ong, Serene, Tsai, Jennifer, Berger, Michael F., Badis, Gwenael, Talukder, Shaheynoor, Gehrke, Andrew R., Hughes, Timothy R., Wodak, Shoshana J., Bulyk, Martha L.
Other Authors: Harvard University--MIT Division of Health Sciences and Technology
Format: Article
Language: en_US
Published: Oxford University Press (OUP) 2012
Online Access: http://hdl.handle.net/1721.1/70985
collection MIT
description Classifying proteins into subgroups with similar molecular function on the basis of sequence is an important step in deriving reliable functional annotations computationally. So far, however, available classification procedures have been evaluated against protein subgroups that are defined by experts using mainly qualitative descriptions of molecular function. Recently, in vitro DNA-binding preferences to all possible 8-nt DNA sequences have been measured for 178 mouse homeodomains using protein-binding microarrays, offering the unprecedented opportunity of evaluating the classification methods against quantitative measures of molecular function. To this end, we automatically derive homeodomain subtypes from the DNA-binding data and independently group the same domains using sequence information alone. We test five sequence-based methods, which use different sequence-similarity measures and algorithms to group sequences. Results show that methods that optimize the classification robustness reflect well the detailed functional specificity revealed by the experimental data. In some of these classifications, 73–83% of the subfamilies exactly correspond to, or are completely contained in, the function-based subtypes. Our findings demonstrate that certain sequence-based classifications are capable of yielding very specific molecular function annotations. The availability of quantitative descriptions of molecular function, such as DNA-binding data, will be a key factor in exploiting this potential in the future.
first_indexed 2024-09-23T12:44:34Z
id mit-1721.1/70985
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T12:44:34Z
record_format dspace
spelling mit-1721.1/70985 2022-10-01T10:51:17Z
Funding: Canadian Institutes of Health Research (MOP#82940); Sickkids Foundation; Ontario Research Fund; National Science Foundation (U.S.); National Human Genome Research Institute (U.S.) (R01 HG003985)
Dates: 2012-06-01T16:35:19Z, 2012-06-01T16:35:19Z, 2010-08, 2010-07
Type: Article (http://purl.org/eprint/type/JournalArticle)
ISSN: 0305-1048, 1362-4962
Citation: Santos, M. A. et al. “Objective Sequence-based Subfamily Classifications of Mouse Homeodomains Reflect Their in Vitro DNA-binding Preferences.” Nucleic Acids Research 38.22 (2010): 7927–7942. Web. 1 June 2012.
DOI: http://dx.doi.org/10.1093/nar/gkq714
Journal: Nucleic Acids Research
License: Creative Commons Attribution Non-Commercial (http://creativecommons.org/licenses/by-nc/2.5)
File format: application/pdf