Deep Reinforcement Factorization Machines: A Deep Reinforcement Learning Model with Random Exploration Strategy and High Deployment Efficiency
Recommender systems and robot learning have recently been two of the most popular application fields of machine learning; their core algorithms are, respectively, deep learning, which is based on perception, and reinforcement learning, which is based on exploration. How to combine thes...
Main Authors: | Huaidong Yu, Jian Yin |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-05-01 |
Series: | Applied Sciences |
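Subjects: | deep reinforcement learning; deployment efficiency; perception; factorization machine; random exploration; self-learning |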
Online Access: | https://www.mdpi.com/2076-3417/12/11/5314 |
_version_ | 1827665782578872320 |
---|---|
author | Huaidong Yu, Jian Yin |
collection | DOAJ |
description | Recommender systems and robot learning have recently been two of the most popular application fields of machine learning; their core algorithms are, respectively, deep learning, which is based on perception, and reinforcement learning, which is based on exploration. Combining the two to advance machine learning as a whole is a goal of many researchers. The Deep Reinforcement Network (DRN) model successfully embedded reinforcement learning into a recommendation system and gave later researchers a useful starting point. Its drawback is equally clear: DRN was built for news recommendation and does not transfer to other domains, a defect shared by many current recommendation models. The agent learning method that DRN adopts is also primitive and inefficient. Using the newly proposed measure of deployment efficiency to assess the overall quality of the models and algorithms that have emerged in recent years, we found that few of them address both efficiency and performance. To fill this gap in deployment efficiency, which many researchers have neglected, and to build a reinforcement learning agent with stronger performance, we worked on the Gate Attentional Factorization Machines (GAFM) model and finally integrated it with reinforcement learning. The resulting Deep Reinforcement Factorization Machines (DRFM) model, proposed in this paper, combines the strong perception ability of deep learning with the strong exploration ability of reinforcement learning and centers on improving deployment efficiency and learning performance. The GAFM model is modified and upgraded using multidisciplinary techniques, and a new model-based random exploration strategy is proposed to update and optimize the recommendation list efficiently. Parallel comparison experiments on several datasets show that DRFM surpasses traditional recommendation models in all aspects: it is far superior in performance and robustness and significantly better in deployment efficiency. We also carry out a comparative analysis against the latest deep reinforcement learning algorithm, which demonstrates the unique advantages of the DRFM model. |
first_indexed | 2024-03-10T01:32:55Z |
format | Article |
id | doaj.art-bd34d61b82a245e0b42611f7d66ef62f |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T01:32:55Z |
publishDate | 2022-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-bd34d61b82a245e0b42611f7d66ef62f; 2023-11-23T13:39:33Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2022-05-01; vol. 12, no. 11, art. 5314; DOI 10.3390/app12115314; Huaidong Yu (School of Mechanical and Information Engineering, Shandong University, Weihai 264209, China); Jian Yin (School of Mechanical and Information Engineering, Shandong University, Weihai 264209, China); https://www.mdpi.com/2076-3417/12/11/5314 |
title | Deep Reinforcement Factorization Machines: A Deep Reinforcement Learning Model with Random Exploration Strategy and High Deployment Efficiency |
topic | deep reinforcement learning; deployment efficiency; perception; factorization machine; random exploration; self-learning |
url | https://www.mdpi.com/2076-3417/12/11/5314 |