SOMSpec as a General Purpose Validated Self-Organising Map Tool for Rapid Protein Secondary Structure Prediction From Infrared Absorbance Data

A protein’s structure is the key to its function. As protein structure can vary with environment, it is important to be able to determine it over a wide range of concentrations, temperatures, formulation vehicles, and states. Robust reproducible validated methods are required for applications includ...

Bibliographic Details
Main Authors: Marco Pinto Corujo, Adewale Olamoyesan, Anastasiia Tukova, Dale Ang, Erik Goormaghtigh, Jason Peterson, Victor Sharov, Nikola Chmel, Alison Rodger
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-01-01
Series: Frontiers in Chemistry
Subjects: protein; secondary structure; infrared absorbance; validation; self-organising map
Online Access: https://www.frontiersin.org/articles/10.3389/fchem.2021.784625/full
collection DOAJ
description A protein’s structure is the key to its function. As protein structure can vary with environment, it is important to be able to determine it over a wide range of concentrations, temperatures, formulation vehicles, and states. Robust, reproducible, validated methods are required for applications including batch-to-batch comparisons of biopharmaceutical products. Circular dichroism is widely used for this purpose, but an alternative is required for concentrations above 10 mg/mL or for solutions with chiral buffer components that absorb far UV light. Infrared (IR) protein absorbance spectra of the Amide I region (1,600–1,700 cm⁻¹) contain information about secondary structure and require higher concentrations than circular dichroism, often with complementary spectral windows. In this paper, we consider a number of approaches to extract structural information from a protein infrared spectrum and determine their reliability for regulatory and research purposes. In particular, we compare direct and second derivative band-fitting with a self-organising map (SOM) approach applied to a number of different reference sets. The SOM approach proved significantly more accurate than the band-fitting approaches for solution spectra. As there is no validated benchmark method available for infrared structure fitting, SOMSpec was implemented in a leave-one-out validation (LOOV) approach for solid-state transmission and thin-film attenuated total reflectance (ATR) reference sets. We then tested SOMSpec and the thin-film ATR reference set against 68 solution spectra and found the average prediction error for helix (α + 3₁₀) and β-sheet was less than 6% for proteins with less than 40% helix. This is quantitatively better than other available approaches. The visual output format of SOMSpec aids identification of poor predictions. We also demonstrated how to convert aqueous ATR spectra to and from transmission spectra for structure fitting. Fourier self-deconvolution did not improve the average structure predictions.
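The description above names a self-organising map used with leave-one-out validation. As an illustration only, the following Python sketch shows the general shape of that idea on synthetic data. It is not the SOMSpec implementation: the toy band centres (1655, 1630, 1645 cm⁻¹), band width, map size, training schedule, and the neighbour-weighted prediction rule are all assumptions made for this example, and every function name is hypothetical.

```python
import math
import random

WAVENUMBERS = list(range(1600, 1701, 4))  # Amide I window, cm^-1


def synthetic_spectrum(helix, sheet):
    """Toy Amide I spectrum built from three Gaussian bands.

    Band centres and the 12 cm^-1 width are illustrative assumptions,
    not values taken from the article.
    """
    other = max(0.0, 1.0 - helix - sheet)
    bands = ((1655.0, helix), (1630.0, sheet), (1645.0, other))
    raw = [sum(w * math.exp(-((x - c) / 12.0) ** 2) for c, w in bands)
           for x in WAVENUMBERS]
    peak = max(raw)
    return [v / peak for v in raw]  # normalise to unit maximum


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(nodes, spectrum):
    """Grid coordinate of the node whose weights are closest to the spectrum."""
    return min(nodes, key=lambda rc: sq_dist(nodes[rc], spectrum))


def train_som(spectra, size=4, epochs=200, seed=0):
    """Train a size x size map with a shrinking Gaussian neighbourhood."""
    rng = random.Random(seed)
    nodes = {(r, c): list(rng.choice(spectra))
             for r in range(size) for c in range(size)}
    for t in range(epochs):
        frac = 1.0 - t / epochs
        rate = 0.5 * frac + 0.01              # learning rate decays over time
        radius = (size / 2.0) * frac + 0.5    # neighbourhood radius decays too
        sample = rng.choice(spectra)
        br, bc = best_matching_unit(nodes, sample)
        for (r, c), weights in nodes.items():
            h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * radius ** 2))
            for i in range(len(weights)):
                weights[i] += rate * h * (sample[i] - weights[i])
    return nodes


def predict(nodes, ref_spectra, ref_structs, spectrum):
    """Average the reference structures, weighted by map distance to the test BMU."""
    tr, tc = best_matching_unit(nodes, spectrum)
    total = 0.0
    acc = [0.0] * len(ref_structs[0])
    for ref, struct in zip(ref_spectra, ref_structs):
        r, c = best_matching_unit(nodes, ref)
        w = math.exp(-((r - tr) ** 2 + (c - tc) ** 2) / 2.0)
        total += w
        acc = [a + w * v for a, v in zip(acc, struct)]
    return [a / total for a in acc]


def leave_one_out(ref_spectra, ref_structs):
    """Hold out each reference in turn, retrain, and record absolute errors."""
    errors = []
    for i in range(len(ref_spectra)):
        train_x = ref_spectra[:i] + ref_spectra[i + 1:]
        train_y = ref_structs[:i] + ref_structs[i + 1:]
        nodes = train_som(train_x)
        pred = predict(nodes, train_x, train_y, ref_spectra[i])
        errors.append([abs(p - t) for p, t in zip(pred, ref_structs[i])])
    return errors


# Synthetic reference set: (helix fraction, sheet fraction) pairs.
REFERENCE_STRUCTS = [(h / 10, s / 10)
                     for h in range(0, 9, 2) for s in range(0, 9, 2)
                     if h + s <= 10]
REFERENCE_SPECTRA = [synthetic_spectrum(h, s) for h, s in REFERENCE_STRUCTS]

if __name__ == "__main__":
    errs = leave_one_out(REFERENCE_SPECTRA, REFERENCE_STRUCTS)
    mean_helix = sum(e[0] for e in errs) / len(errs)
    mean_sheet = sum(e[1] for e in errs) / len(errs)
    print(f"mean |error| helix: {mean_helix:.3f}, sheet: {mean_sheet:.3f}")
```

Retraining the map for every held-out spectrum keeps the test spectrum out of the training data, which is the point of a leave-one-out estimate. Real reference sets would replace the synthetic spectra with measured ones and structure fractions from an independent source.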
id doaj.art-6e1e161fcb4342b49452ec125a165d93
institution Directory Open Access Journal
issn 2296-2646
spelling doaj.art-6e1e161fcb4342b49452ec125a165d93 2022-12-21T23:42:58Z
Frontiers in Chemistry, volume 9, 2022-01-01, article 784625, doi:10.3389/fchem.2021.784625
author affiliations
Marco Pinto Corujo: Department of Chemistry, University of Warwick, Coventry, United Kingdom
Adewale Olamoyesan: Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia
Anastasiia Tukova: Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia
Dale Ang: Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia
Erik Goormaghtigh: Center for Structural Biology and Bioinformatics, Laboratory for the Structure and Function of Biological Membranes, Campus Plaine, Université Libre de Bruxelles, Brussels, Belgium
Jason Peterson: BioPharmaSpec Inc., Malvern, PA, United States
Victor Sharov: BioPharmaSpec Inc., Malvern, PA, United States
Nikola Chmel: Department of Chemistry, University of Warwick, Coventry, United Kingdom
Alison Rodger: Department of Chemistry, University of Warwick, Coventry, United Kingdom; Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia
topic protein
secondary structure
infrared absorbance
validation
self-organising map