Optimization of designing multiple genes encoding the same protein based on NSGA-II for efficient execution on GPUs

In synthetic biology, it is a challenge to increase the production of target proteins by maximizing their expression levels. In order to augment expression levels, we need to focus on both homologous recombination and codon adaptation, which are estimated by three objective functions, namely HD (Ham...
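The full abstract in this record names the three objectives as HD (Hamming distance), LRCS (length of repeated or common substring) and CAI (codon adaptation index). The sketch below illustrates how such objectives could be evaluated on a small set of CDSs in plain Python. It is not the authors' GPU implementation. The function names, the normalization by sequence length, the restriction of LRCS to substrings shared between two sequences, and the codon weight table are all assumptions made for illustration.

```python
from itertools import combinations
from math import exp, log


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_hd(cds_set: list[str]) -> float:
    """Smallest length-normalized Hamming distance over all CDS pairs.

    Assumption: the worst (closest) pair is used as the set-level score.
    """
    return min(hamming_distance(a, b) / len(a)
               for a, b in combinations(cds_set, 2))


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest substring shared by a and b (row-wise DP).

    Assumption: only substrings common to two sequences are considered,
    not repeats inside a single sequence.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def cai(cds: str, weights: dict[str, float]) -> float:
    """Geometric mean of per-codon relative adaptiveness weights."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return exp(sum(log(weights[c]) for c in codons) / len(codons))


def dominates(p: tuple, q: tuple) -> bool:
    """Pareto dominance when every objective is to be maximized."""
    return (all(a >= b for a, b in zip(p, q))
            and any(a > b for a, b in zip(p, q)))


# Two hypothetical CDSs that both encode Met-Ala-Lys.
cds_set = ["ATGGCTAAA", "ATGGCCAAG"]
# Hypothetical weights; real values come from a host codon usage table.
weights = {"ATG": 1.0, "GCT": 1.0, "GCC": 0.5, "AAA": 1.0, "AAG": 0.25}

hd_score = min_pairwise_hd(cds_set)
lcs_score = longest_common_substring(*cds_set)
cai_scores = [cai(c, weights) for c in cds_set]
```

A multi-objective search such as NSGA-II would call functions like these once per candidate per cycle, which is why the evaluation cost grows with both the number of cycles and the number of solutions.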


Bibliographic Details
Main Authors: Donghyeon Kim, Jinsung Kim
Format: Article
Language: English
Published: AIMS Press 2023-07-01
Series: Electronic Research Archive
Subjects: protein encoding, multi-objective optimization, bioengineering, gpu computing, nsga-ii
Online Access: https://www.aimspress.com/article/doi/10.3934/era.2023270?viewType=HTML
author Donghyeon Kim
Jinsung Kim
collection DOAJ
description In synthetic biology, it is a challenge to increase the production of target proteins by maximizing their expression levels. To raise expression levels, we need to consider both homologous recombination and codon adaptation, which are estimated by three objective functions: HD (Hamming distance), LRCS (length of repeated or common substring) and CAI (codon adaptation index). Optimizing these objective functions simultaneously is a multi-objective optimization problem. The aim is to find satisfactory solutions that have high codon adaptation and a low incidence of homologous recombination. However, obtaining such solutions requires evaluating the objective functions many times, over many cycles and many solutions. In this paper, we propose an approach to accelerate the design of a set of CDSs (CoDing sequences) based on NSGA-II (non-dominated sorting genetic algorithm II) on NVIDIA GPUs. The GPU-accelerated implementation improves overall performance by 187.5× using 100 cycles and 128 solutions. Our implementation allows us to use larger solutions and more cycles, leading to outstanding solution quality. In a similar amount of time, the improved implementation yields a 1.22× improvement in hypervolume over other available methods. Furthermore, our approach on GPUs also suggests how to efficiently utilize the latest computational resources in bioinformatics. Finally, we discuss the impact of the number of cycles and the number of solutions on designing a set of CDSs.
first_indexed 2024-03-11T17:51:33Z
format Article
id doaj.art-0014fb2a7db94b538129ee98350868ff
institution Directory Open Access Journal
issn 2688-1594
language English
last_indexed 2024-03-11T17:51:33Z
publishDate 2023-07-01
publisher AIMS Press
record_format Article
series Electronic Research Archive
spelling doaj.art-0014fb2a7db94b538129ee98350868ff; 2023-10-18T01:14:43Z; eng; AIMS Press; Electronic Research Archive; 2688-1594; 2023-07-01; 31(9): 5313-5339; 10.3934/era.2023270; Optimization of designing multiple genes encoding the same protein based on NSGA-II for efficient execution on GPUs; Donghyeon Kim (School of Computer Science and Engineering, Chung-Ang University, Seoul, South Korea); Jinsung Kim (School of Computer Science and Engineering, Chung-Ang University, Seoul, South Korea); https://www.aimspress.com/article/doi/10.3934/era.2023270?viewType=HTML; protein encoding, multi-objective optimization, bioengineering, gpu computing, nsga-ii
title Optimization of designing multiple genes encoding the same protein based on NSGA-II for efficient execution on GPUs
topic protein encoding
multi-objective optimization
bioengineering
gpu computing
nsga-ii
url https://www.aimspress.com/article/doi/10.3934/era.2023270?viewType=HTML