Title: A Study for Dynamically Adjustmentation for Exploitation Rate using Evaluation of Task Achievement

Issue Number: Vol. 8, No. 2
Year of Publication: Jun - 2018
Page Numbers: 53-60
Authors: Masashi SUGIMOTO
Journal Name: International Journal of New Computer Architectures and their Applications (IJNCAA)
- Hong Kong
DOI:  http://dx.doi.org/10.17781/P002402


In reinforcement learning, the ratio of random actions, known as the exploration ratio, has often not been adjusted dynamically, even though this ratio is an indicator of learning performance. In this study, we propose a method in which an agent learns using information from the evaluation of another agent’s task achievement. With the proposed method, the exploration ratio is adjusted dynamically according to the behavior of other agents. In human life, “atmosphere” exists as a means of communication. For example, people are empirically influenced by a “serious atmosphere,” such as when working or taking an examination. In this study, we define this atmosphere as the agent’s motivation for task achievement. We focus on the agent’s action decisions once another agent has solved the task: in other words, an agent tries to find an optimal solution if other agents have already found one. We propose action decision-making based on other agents’ behavior and discuss its effectiveness using the maze problem as an example. In particular, we focus on the “number of task achievements,” the “influence on task achievement,” and a quantitative account of how the task is achieved. As a result, we confirmed that the proposed method is appropriately influenced by other agents’ behavior.
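
The idea of lowering the random-action ratio as other agents achieve the task can be illustrated with a minimal sketch. The abstract does not give the actual update rule, so everything here is an assumption made for illustration: the function names `adjusted_epsilon` and `choose_action`, the parameters `influence` and `floor`, and the inverse-scaling formula itself are hypothetical and are not taken from the paper.

```python
import random


def adjusted_epsilon(base_epsilon, peer_goal_count, influence=0.5, floor=0.01):
    """Return an exploration ratio that shrinks as peers achieve the task.

    Assumed rule (not from the paper): divide the base ratio by
    1 + influence * peer_goal_count, and never go below `floor`.
    """
    eps = base_epsilon / (1.0 + influence * peer_goal_count)
    return max(floor, eps)


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy selection over a list of action values."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical usage: a peer has reached the maze goal twice.
epsilon = adjusted_epsilon(base_epsilon=0.3, peer_goal_count=2)
action = choose_action([0.1, 0.9, 0.3], epsilon)
```

In this sketch the "atmosphere" is reduced to a single count of peer successes; a richer signal, such as how recently or how efficiently the peer reached the goal, would be an equally plausible reading of the abstract.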