关键词:机器人;手臂系统;模拟
摘 要:This chapter describes a novel methodology for behavior learning of an agent, called Coaching. The proposed method is an interactive and iterative learning method which allows a human trainer to give a subjective evaluation to the robotic agent in real time, and the agent can update the reward function dynamically based on this evaluation simultaneously. We demonstrated that the agent is capable of learning the desired behavior by receiving simple and subjective instructions such as positive and negative. The proposed approach is also effective when it is difficult to determine a suitable reward function for the learning situation in advance. We have conducted several experiments with a simulated and a real robot arm system, and the advantage of the proposed method is verified throughout those experiments.