Improved performance of sequence search approaches in remote homology detection [v2; ref status: indexed, http://f1000r.es/3qo]

The protein sequence space is vast and diverse, spanning many different families. Biologically meaningful relationships exist between proteins at the superfamily level. However, it is highly challenging to establish convincing relationships at the superfamily level by means of simple sequence searches...

Bibliographic Details
Main Authors: Adwait Govind Joshi, Upadhyayula Surya Raghavender, Ramanathan Sowdhamini
Format: Article
Language: English
Published: F1000 Research Ltd 2014-07-01
Series: F1000Research
Subjects: Bioinformatics
Online Access: http://f1000research.com/articles/2-93/v2
author Adwait Govind Joshi
Upadhyayula Surya Raghavender
Ramanathan Sowdhamini
collection DOAJ
description The protein sequence space is vast and diverse, spanning many different families. Biologically meaningful relationships exist between proteins at the superfamily level. However, it is highly challenging to establish convincing relationships at the superfamily level by means of simple sequence searches. It is necessary to design a rigorous sequence search strategy to establish remote homology relationships and achieve high coverage. We have used iterative profile-based methods, along with constraints of sequence motifs, to specify search directions. We address the importance of multiple start points (queries) to achieve high coverage at the protein superfamily level. We have devised strategies to employ a structural regime to search sequence space with good specificity and sensitivity. We employ two well-known sequence search methods, PSI-BLAST and PHI-BLAST, with multiple queries and multiple patterns to enhance homologue identification at the structural superfamily level. The study suggests that multiple queries improve sensitivity, while a pattern-constrained iterative sequence search is stringent in the initial stages, thereby driving the search in a specific direction while still achieving high coverage. This data-mining approach has been applied to the entire structural superfamily database.
first_indexed 2024-04-13T17:47:17Z
format Article
id doaj.art-bd36fe6561e9499db9a8452386a62996
institution Directory Open Access Journal
issn 2046-1402
language English
last_indexed 2024-04-13T17:47:17Z
publishDate 2014-07-01
publisher F1000 Research Ltd
record_format Article
series F1000Research
spelling doaj.art-bd36fe6561e9499db9a8452386a62996
2022-12-22T02:36:53Z
eng
F1000 Research Ltd
F1000Research
2046-1402
2014-07-01
2
10.12688/f1000research.2-93.v2
4848
Improved performance of sequence search approaches in remote homology detection [v2; ref status: indexed, http://f1000r.es/3qo]
Adwait Govind Joshi (0)
Upadhyayula Surya Raghavender (1)
Ramanathan Sowdhamini (2)
(0) Manipal University, Manipal, Karnataka, 576104, India
(1) National Centre for Biological Sciences (Tata Institute of Fundamental Research), Gandhi Krishi Vignyan Kendra Campus, Bangalore, 560065, India
(2) National Centre for Biological Sciences (Tata Institute of Fundamental Research), Gandhi Krishi Vignyan Kendra Campus, Bangalore, 560065, India
http://f1000research.com/articles/2-93/v2
Bioinformatics
title Improved performance of sequence search approaches in remote homology detection [v2; ref status: indexed, http://f1000r.es/3qo]
topic Bioinformatics
url http://f1000research.com/articles/2-93/v2