Synthetic Data Generation for the Development of 2D Gel Electrophoresis Protein Spot Models
Two-dimensional electrophoresis gels (2DE, 2DEG) are the result of the procedure of separating, based on two molecular properties, a protein mixture on gel. Separated similar proteins concentrate in groups, and these groups appear as dark spots in the captured gel image. Gel images are analyzed to d...
Main Author: | Dalius Matuzevičius |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-04-01 |
Series: | Applied Sciences |
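Subjects: | two-dimensional gel electrophoresis; 2DEG; gel image analysis; bioinformatics; protein spot model; spot detection |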
Online Access: | https://www.mdpi.com/2076-3417/12/9/4393 |
_version_ | 1797505700364025856 |
---|---|
author | Dalius Matuzevičius |
collection | DOAJ |
description | Two-dimensional electrophoresis gels (2DE, 2DEG) are the result of the procedure of separating, based on two molecular properties, a protein mixture on gel. Separated similar proteins concentrate in groups, and these groups appear as dark spots in the captured gel image. Gel images are analyzed to detect distinct spots and determine their peak intensity, background, integrated intensity, and other attributes of interest. One of the approaches to parameterizing the protein spots is spot modeling. Spot parameters of interest are obtained after the spot is approximated by a mathematical model. The development of the modeling algorithm requires a rich, diverse, representative dataset. The primary goal of this research is to develop a method for generating a synthetic protein spot dataset that can be used to develop 2DEG image analysis algorithms. The secondary objective is to evaluate the usefulness of the created dataset by developing a neural-network-based protein spot reconstruction algorithm that provides parameterization and denoising functionalities. In this research, a spot modeling algorithm based on autoencoders is developed using only the created synthetic dataset. The algorithm is evaluated on real and synthetic data. Evaluation results show that the created synthetic dataset is effective for the development of protein spot models. The developed algorithm outperformed all baseline algorithms in all experimental cases. |
first_indexed | 2024-03-10T04:22:07Z |
format | Article |
id | doaj.art-9ef951671cae4c4e9af495d9d76fa3f5 |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T04:22:07Z |
publishDate | 2022-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-9ef951671cae4c4e9af495d9d76fa3f5; 2023-11-23T07:48:24Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-04-01; 12; 9; 4393; 10.3390/app12094393; Synthetic Data Generation for the Development of 2D Gel Electrophoresis Protein Spot Models; Dalius Matuzevičius; 0; Department of Electronic Systems, Vilnius Gediminas Technical University (VILNIUS TECH), 03227 Vilnius, Lithuania; https://www.mdpi.com/2076-3417/12/9/4393; two-dimensional gel electrophoresis; 2DEG; gel image analysis; bioinformatics; protein spot model; spot detection |
title | Synthetic Data Generation for the Development of 2D Gel Electrophoresis Protein Spot Models |
topic | two-dimensional gel electrophoresis; 2DEG; gel image analysis; bioinformatics; protein spot model; spot detection |
url | https://www.mdpi.com/2076-3417/12/9/4393 |