SOMSpec as a General Purpose Validated Self-Organising Map Tool for Rapid Protein Secondary Structure Prediction From Infrared Absorbance Data
A protein’s structure is the key to its function. As protein structure can vary with environment, it is important to be able to determine it over a wide range of concentrations, temperatures, formulation vehicles, and states. Robust reproducible validated methods are required for applications includ...
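Main Authors: | Marco Pinto Corujo, Adewale Olamoyesan, Anastasiia Tukova, Dale Ang, Erik Goormaghtigh, Jason Peterson, Victor Sharov, Nikola Chmel, Alison Rodger |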
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-01-01 |
Series: | Frontiers in Chemistry |
Subjects: | protein; secondary structure; infrared absorbance; validation; self-organising map |
Online Access: | https://www.frontiersin.org/articles/10.3389/fchem.2021.784625/full |
_version_ | 1818333597135798272 |
---|---|
author | Marco Pinto Corujo; Adewale Olamoyesan; Anastasiia Tukova; Dale Ang; Erik Goormaghtigh; Jason Peterson; Victor Sharov; Nikola Chmel; Alison Rodger |
author_sort | Marco Pinto Corujo |
collection | DOAJ |
description | A protein’s structure is the key to its function. As protein structure can vary with environment, it is important to be able to determine it over a wide range of concentrations, temperatures, formulation vehicles, and states. Robust, reproducible, validated methods are required for applications including batch-to-batch comparisons of biopharmaceutical products. Circular dichroism is widely used for this purpose, but an alternative is required for concentrations above 10 mg/mL or for solutions with chiral buffer components that absorb far-UV light. Infrared (IR) protein absorbance spectra of the Amide I region (1600–1700 cm⁻¹) contain information about secondary structure; they require higher concentrations than circular dichroism and often offer complementary spectral windows. In this paper, we consider a number of approaches to extracting structural information from a protein infrared spectrum and determine their reliability for regulatory and research purposes. In particular, we compare direct and second-derivative band-fitting with a self-organising map (SOM) approach applied to a number of different reference sets. The SOM approach proved significantly more accurate than the band-fitting approaches for solution spectra. As there is no validated benchmark method available for infrared structure fitting, SOMSpec was implemented in a leave-one-out validation (LOOV) approach for solid-state transmission and thin-film attenuated total reflectance (ATR) reference sets. We then tested SOMSpec and the thin-film ATR reference set against 68 solution spectra and found the average prediction error for helix (α + 3₁₀) and β-sheet was less than 6% for proteins with less than 40% helix. This is quantitatively better than other available approaches. The visual output format of SOMSpec aids identification of poor predictions. We also demonstrated how to convert aqueous ATR spectra to and from transmission spectra for structure fitting. Fourier self-deconvolution did not improve the average structure predictions. |
first_indexed | 2024-12-13T13:54:10Z |
format | Article |
id | doaj.art-6e1e161fcb4342b49452ec125a165d93 |
institution | Directory Open Access Journal |
issn | 2296-2646 |
language | English |
last_indexed | 2024-12-13T13:54:10Z |
publishDate | 2022-01-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Chemistry |
spelling | doaj.art-6e1e161fcb4342b49452ec125a165d93; 2022-12-21T23:42:58Z; eng; Frontiers Media S.A.; Frontiers in Chemistry; 2296-2646; 2022-01-01; 9; 10.3389/fchem.2021.784625; 784625; SOMSpec as a General Purpose Validated Self-Organising Map Tool for Rapid Protein Secondary Structure Prediction From Infrared Absorbance Data; Marco Pinto Corujo [0]; Adewale Olamoyesan [1]; Anastasiia Tukova [2]; Dale Ang [3]; Erik Goormaghtigh [4]; Jason Peterson [5]; Victor Sharov [6]; Nikola Chmel [7]; Alison Rodger [8]; Alison Rodger [9]; [0] Department of Chemistry, University of Warwick, Coventry, United Kingdom; [1] Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia; [2] Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia; [3] Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia; [4] Center for Structural Biology and Bioinformatics, Laboratory for the Structure and Function of Biological Membranes, Campus Plaine, Université Libre de Bruxelles, Brussels, Belgium; [5] BioPharmaSpec Inc., Malvern, PA, United States; [6] BioPharmaSpec Inc., Malvern, PA, United States; [7] Department of Chemistry, University of Warwick, Coventry, United Kingdom; [8] Department of Chemistry, University of Warwick, Coventry, United Kingdom; [9] Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia; https://www.frontiersin.org/articles/10.3389/fchem.2021.784625/full; protein; secondary structure; infrared absorbance; validation; self-organising map |
title | SOMSpec as a General Purpose Validated Self-Organising Map Tool for Rapid Protein Secondary Structure Prediction From Infrared Absorbance Data |
topic | protein; secondary structure; infrared absorbance; validation; self-organising map |
url | https://www.frontiersin.org/articles/10.3389/fchem.2021.784625/full |