Synthetic Data Generation for the Development of 2D Gel Electrophoresis Protein Spot Models

Two-dimensional electrophoresis gels (2DE, 2DEG) are the result of the procedure of separating, based on two molecular properties, a protein mixture on gel. Separated similar proteins concentrate in groups, and these groups appear as dark spots in the captured gel image. Gel images are analyzed to d...

Bibliographic Details
Main Author: Dalius Matuzevičius
Format: Article
Language: English
Published: MDPI AG 2022-04-01
Series: Applied Sciences
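Subjects: two-dimensional gel electrophoresis; 2DEG; gel image analysis; bioinformatics; protein spot model; spot detection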
Online Access: https://www.mdpi.com/2076-3417/12/9/4393
author Dalius Matuzevičius
collection DOAJ
description Two-dimensional electrophoresis gels (2DE, 2DEG) are the result of the procedure of separating, based on two molecular properties, a protein mixture on gel. Separated similar proteins concentrate in groups, and these groups appear as dark spots in the captured gel image. Gel images are analyzed to detect distinct spots and determine their peak intensity, background, integrated intensity, and other attributes of interest. One of the approaches to parameterizing the protein spots is spot modeling. Spot parameters of interest are obtained after the spot is approximated by a mathematical model. The development of the modeling algorithm requires a rich, diverse, representative dataset. The primary goal of this research is to develop a method for generating a synthetic protein spot dataset that can be used to develop 2DEG image analysis algorithms. The secondary objective is to evaluate the usefulness of the created dataset by developing a neural-network-based protein spot reconstruction algorithm that provides parameterization and denoising functionalities. In this research, a spot modeling algorithm based on autoencoders is developed using only the created synthetic dataset. The algorithm is evaluated on real and synthetic data. Evaluation results show that the created synthetic dataset is effective for the development of protein spot models. The developed algorithm outperformed all baseline algorithms in all experimental cases.
first_indexed 2024-03-10T04:22:07Z
format Article
id doaj.art-9ef951671cae4c4e9af495d9d76fa3f5
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T04:22:07Z
publishDate 2022-04-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-9ef951671cae4c4e9af495d9d76fa3f5; 2023-11-23T07:48:24Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-04-01; 12; 9; 4393; 10.3390/app12094393; Dalius Matuzevičius, Department of Electronic Systems, Vilnius Gediminas Technical University (VILNIUS TECH), 03227 Vilnius, Lithuania
title Synthetic Data Generation for the Development of 2D Gel Electrophoresis Protein Spot Models
topic two-dimensional gel electrophoresis
2DEG
gel image analysis
bioinformatics
protein spot model
spot detection
url https://www.mdpi.com/2076-3417/12/9/4393