Overlapping Clusters and Support Vector Machines Based Interval Type-2 Fuzzy System for the Prediction of Peptide Binding Affinity

In the post-genome era, it is becoming more complex to process high-dimensional, nonlinear biological datasets with few available instances. This paper aims to address these characteristics, as they adversely affect the performance of predictive models in bioinformatics. In this paper, an inter...

Bibliographic Details
Main Authors:	Volkan Uslan, Huseyin Seker, Robert John
Format:	Article
Language:	English
Published:	IEEE 2019-01-01
Series:	IEEE Access
Subjects:	Interval type-2 fuzzy systems; support vector regression; overlapping clusters; peptide binding affinity; clustering; high-dimensionality
Online Access:	https://ieeexplore.ieee.org/document/8685099/

_version_	1818914473578070016
author	Volkan Uslan; Huseyin Seker; Robert John
collection	DOAJ
description	In the post-genome era, it is becoming more complex to process high-dimensional, nonlinear biological datasets with few available instances. This paper aims to address these characteristics, as they adversely affect the performance of predictive models in bioinformatics. An interval type-2 Takagi-Sugeno fuzzy predictive model is proposed to manage the high dimensionality and nonlinearity that are common in such datasets. For this purpose, a new clustering framework, based on overlapping regions between the clusters, is proposed to simplify antecedent operations for an interval type-2 fuzzy system. Cluster analysis of the partitions, together with statistical information derived from them, identifies the upper and lower membership functions that form the premise part. The model is further enhanced by adapting the regression version of support vector machines in the consequent part. The proposed method is used in experiments to quantitatively predict the binding affinities of peptides to biomolecules. This case study poses a challenge in post-genome studies and remains an open problem due to the complexity of the biological system, the diversity of peptides, and the curse of dimensionality of the amino acid index representation characterizing the peptides. On four different peptide binding affinity datasets, the proposed method showed better generalization ability on all of them, yielding an improved prediction accuracy of up to 58.2% on unseen peptides in comparison with the predictive methods presented in the literature. Source code of the algorithm is available at https://github.com/sekerbigdatalab.
first_indexed	2024-12-19T23:46:57Z
format	Article
id	doaj.art-cf5fffda85c44c1e90a89bcbe082472f
institution	Directory of Open Access Journals
issn	2169-3536
language	English
last_indexed	2024-12-19T23:46:57Z
publishDate	2019-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-cf5fffda85c44c1e90a89bcbe082472f; 2022-12-21T20:01:16Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7, pp. 49756-49764; DOI 10.1109/ACCESS.2019.2910078; document 8685099; Volkan Uslan (https://orcid.org/0000-0001-6252-8853; School of Computer Science and Informatics, De Montfort University, Leicester, U.K.); Huseyin Seker (Department of Computer Science and Digital Technologies, University of Northumbria at Newcastle, Newcastle upon Tyne, U.K.); Robert John (School of Computer Science, University of Nottingham, Nottingham, U.K.)
title	Overlapping Clusters and Support Vector Machines Based Interval Type-2 Fuzzy System for the Prediction of Peptide Binding Affinity
topic	Interval type-2 fuzzy systems; support vector regression; overlapping clusters; peptide binding affinity; clustering; high-dimensionality
url	https://ieeexplore.ieee.org/document/8685099/
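
The description in this record mentions upper and lower membership functions in the premise part and a regression model in the consequent part of each rule. As a rough illustration of what those terms mean, the sketch below builds an interval type-2 membership grade from a Gaussian with a fixed mean and an uncertain width, then combines rule outputs in Takagi-Sugeno fashion. It is not the authors' method: the function names, the uncertain-width construction, the averaging of the lower and upper firing strengths, and the plain callables standing in for trained support vector regressors are all assumptions made for this example.

```python
import math


def gaussian(x, mean, sigma):
    """Ordinary (type-1) Gaussian membership grade of x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def interval_membership(x, mean, sigma_lower, sigma_upper):
    """Return (lower, upper) membership grades.

    Assumption: the uncertainty is modelled as a Gaussian with a fixed
    mean and a width somewhere in [sigma_lower, sigma_upper]. The narrow
    Gaussian gives the lower bound and the wide one gives the upper bound.
    """
    return gaussian(x, mean, sigma_lower), gaussian(x, mean, sigma_upper)


def predict(x, rules):
    """Takagi-Sugeno style output for a single scalar input x.

    rules is a list of (mean, sigma_lower, sigma_upper, consequent) tuples,
    where consequent is any callable mapping x to a number. Here it is a
    hypothetical stand-in for a trained regression model.
    """
    numerator = 0.0
    denominator = 0.0
    for mean, sigma_lower, sigma_upper, consequent in rules:
        lower, upper = interval_membership(x, mean, sigma_lower, sigma_upper)
        # Simplification chosen for this sketch: collapse the firing
        # interval to its midpoint instead of iterative type reduction.
        weight = 0.5 * (lower + upper)
        numerator += weight * consequent(x)
        denominator += weight
    return numerator / denominator if denominator > 0 else 0.0


# Two illustrative rules centred at 0 and 10 with constant consequents.
example_rules = [
    (0.0, 1.0, 2.0, lambda x: 1.0),
    (10.0, 1.0, 2.0, lambda x: 3.0),
]
```

Calling `predict(5.0, example_rules)` weights both rules equally, because the input lies midway between the two centres, so the result falls midway between the two consequent values. In a real system each consequent would be a fitted regressor and the centres and widths would come from the cluster analysis.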

Similar Items