Study on the Use of Artificially Generated Objects in the Process of Training MLP Neural Networks Based on Dispersed Data

Bibliographic Details
Main Authors: Kwabena Frimpong Marfo, Małgorzata Przybyła-Kasperek
Format: Article
Language: English
Published: MDPI AG 2023-04-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/25/5/703
Description
Summary: This study concerns dispersed data stored in independent local tables with different sets of attributes. It proposes a new method for training a single neural network, a multilayer perceptron, on such data. The idea is to train local models with identical structures on the local tables; because the tables contain different sets of conditional attributes, artificial objects must be generated to train the local models. The paper studies how varying parameter values in the proposed method of creating artificial objects affects the results, and presents an exhaustive comparison in terms of the number of artificial objects generated from a single original object, the degree of data dispersion, data balancing, and the network structure (the number of neurons in the hidden layer). For data sets with a large number of objects, a smaller number of artificial objects was found to be optimal. For smaller data sets, a greater number of artificial objects (three or four) produces better results. For large data sets, data balancing and the degree of dispersion have no significant impact on classification quality; instead, a greater number of neurons in the hidden layer (three to five times the number of neurons in the input layer) produces better results.
ISSN: 1099-4300
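
Illustration: The summary describes generating several artificial objects per original object so that local tables with different attribute sets can train models of identical structure. The Python sketch below is one possible reading of that idea, not the authors' procedure. The function names (`expand_local_table`, `hidden_layer_size`), the toy attribute names, and in particular the rule for filling missing attributes (uniform random draws) are assumptions made for illustration; the record does not state how the paper fills them.

```python
import random


def expand_local_table(rows, labels, local_attrs, all_attrs, k, seed=0):
    """Build k artificial objects per original object over the full attribute set.

    Attributes present in the local table are copied unchanged. Attributes
    missing from the local table are filled with uniform random draws in
    [0, 1); this fill rule is an ASSUMPTION for illustration only.
    """
    rng = random.Random(seed)
    column = {name: i for i, name in enumerate(local_attrs)}
    artificial_rows, artificial_labels = [], []
    for row, label in zip(rows, labels):
        for _ in range(k):
            artificial_rows.append(
                [row[column[a]] if a in column else rng.random() for a in all_attrs]
            )
            artificial_labels.append(label)  # the decision class is kept as-is
    return artificial_rows, artificial_labels


def hidden_layer_size(n_inputs, factor):
    """Hidden-layer width as a multiple of the input-layer width."""
    return factor * n_inputs


# Hypothetical toy data: the local table holds only two of four attributes.
all_attrs = ["a1", "a2", "a3", "a4"]
local_attrs = ["a1", "a3"]
rows = [[0.2, 0.9], [0.5, 0.1]]
labels = [0, 1]

art_rows, art_labels = expand_local_table(rows, labels, local_attrs, all_attrs, k=3)
n_hidden = hidden_layer_size(len(all_attrs), factor=3)
```

Here `k=3` mirrors the "three or four" artificial objects the summary reports as working better for smaller data sets, and `factor=3` sits at the low end of the reported three-to-five range for hidden-layer width. Every expanded table has the same number of columns, so each local model can share the same input layer.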