The principles of building a machine-learning-based service for converting sequential code into parallel code

Bibliographic Details
Main Authors: Viktorov Ivan, Gibadullin Ruslan
Format: Article
Language: English
Published: EDP Sciences 2023-01-01
Series: E3S Web of Conferences
Online Access: https://www.e3s-conferences.org/articles/e3sconf/pdf/2023/68/e3sconf_itse2023_05012.pdf
Description
Summary: This article presents a novel approach to automating the parallelization of program code using machine learning. The approach centers on a two-phase algorithm consisting of a training phase and a transformation phase. In the training phase, a neural network is trained on data in the form of Abstract Syntax Trees, with Word2Vec employed as the primary model for converting the syntax tree into numerical arrays. The authors attribute the choice of Word2Vec to its efficacy in encoding words with less reliance on context than other natural language processing models such as GloVe and FastText. In the transformation phase, the trained model is applied to new sequential code, transforming it into parallel code. The article discusses in detail the mechanisms behind the algorithm, the rationale for selecting Word2Vec, and the subsequent processing of code data. The methodology introduces an intelligent, automated system intended to understand and optimize the syntactic and semantic structures of code for parallel computing environments. The article is relevant to researchers and practitioners seeking to enhance code optimization techniques by integrating machine learning models.
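
As an illustration of the preprocessing idea in the summary, the following minimal sketch flattens an Abstract Syntax Tree into a sequence of node-type tokens, the kind of "sentence" that a Word2Vec-style model could be trained on. It is not the authors' implementation: the summary does not state which programming language, traversal order, or tokenization the article uses, so Python's standard `ast` module, the pre-order traversal, the helper name `ast_tokens`, and the sample loop are all assumptions made for illustration. The final integer mapping is only a stand-in for the embedding step, not Word2Vec itself.

```python
import ast


def ast_tokens(source: str) -> list[str]:
    """Return AST node-type names in pre-order (hypothetical helper)."""
    tokens: list[str] = []

    def visit(node: ast.AST) -> None:
        # Record the node type, then descend into its children in field order.
        tokens.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


# Assumed sample of sequential code; any parseable Python source works.
sequential = """
total = 0
for x in data:
    total += x * x
"""

tokens = ast_tokens(sequential)

# Stand-in for the embedding step: map each distinct token to an integer id.
# A real pipeline would instead pass many such token lists to a Word2Vec
# implementation to learn dense vectors.
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
ids = [vocab[tok] for tok in tokens]

print(tokens)
print(ids)
```

Each source file yields one token sequence. A corpus of such sequences, taken from both sequential and parallel programs, is the sort of input the training phase described above would need before any model can map one form to the other.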
ISSN: 2267-1242