Selecting Informative Data Samples for Model Learning Through Symbolic Regression

Continual model learning for nonlinear dynamic systems, such as autonomous robots, presents several challenges. First, it tends to be computationally expensive as the amount of data collected by the robot quickly grows in time. Second, the model accuracy is impaired when data from repetitive motions...

Bibliographic Details
Main Authors:	Erik Derner, Jiri Kubalik, Robert Babuska
Format:	Article
Language:	English
Published:	IEEE 2021-01-01
Series:	IEEE Access
Subjects:	Machine learning; system identification; robot control; genetic algorithms; symbolic regression
Online Access:	https://ieeexplore.ieee.org/document/9326312/

author	Erik Derner, Jiri Kubalik, Robert Babuska
collection	DOAJ
description	Continual model learning for nonlinear dynamic systems, such as autonomous robots, presents several challenges. First, it tends to be computationally expensive as the amount of data collected by the robot quickly grows in time. Second, the model accuracy is impaired when data from repetitive motions prevail in the training set and outweigh scarcer samples that also capture interesting properties of the system. It is not known in advance which samples will be useful for model learning. Therefore, effective methods need to be employed to select informative training samples from the continuous data stream collected by the robot. Existing literature does not give any guidelines as to which of the available sample-selection methods are suitable for such a task. In this paper, we compare five sample-selection methods, including a novel method using the model prediction error. We integrate these methods into a model learning framework based on symbolic regression, which allows for learning accurate models in the form of analytic equations. Unlike the currently popular data-hungry deep learning methods, symbolic regression is able to build models even from very small training data sets. We demonstrate the approach on two real robots: the TurtleBot mobile robot and the Parrot Bebop drone. The results show that an accurate model can be constructed even from training sets as small as 24 samples. Informed sample-selection techniques based on prediction error and model variance clearly outperform uninformed methods, such as sequential or random selection.
first_indexed	2024-12-19T14:37:58Z
format	Article
id	doaj.art-b18fcb26c2a74111a00272b8727a4578
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-12-19T14:37:58Z
publishDate	2021-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-b18fcb26c2a74111a00272b8727a4578; 2022-12-21T20:17:12Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2021-01-01; vol. 9, pp. 14148-14158; DOI 10.1109/ACCESS.2021.3052130; IEEE document 9326312; Selecting Informative Data Samples for Model Learning Through Symbolic Regression; Erik Derner (https://orcid.org/0000-0002-7588-7668), Jiri Kubalik, Robert Babuska (https://orcid.org/0000-0001-9578-8598), all at Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Prague, Czech Republic; https://ieeexplore.ieee.org/document/9326312/; Machine learning; system identification; robot control; genetic algorithms; symbolic regression
title	Selecting Informative Data Samples for Model Learning Through Symbolic Regression
topic	Machine learning; system identification; robot control; genetic algorithms; symbolic regression
url	https://ieeexplore.ieee.org/document/9326312/

Similar Items