Learning to Make Decisions in Robotic Manipulation

Bibliographic Details
Main Author: Dai, Siyu
Other Authors: Williams, Brian C.
Format: Thesis
Published: Massachusetts Institute of Technology, 2022
Online Access: https://hdl.handle.net/1721.1/144824
Description
Summary: In order for human-assisting robots to be deployed in real-world settings such as household environments, challenges in two major scenarios remain to be solved. First, for the common tasks that a robot conducts day-to-day, the execution of motion plans needs to ensure the safety of surrounding objects and humans. Second, to handle new tasks that some customers might occasionally demand, robots need to be able to learn novel tasks efficiently with a minimal amount of human supervision. In this thesis, we show that machine learning methods can be applied to solve the challenges in both scenarios.

In the first scenario, we propose learning-based p-Chekov, a chance-constrained motion planning approach that uses data-driven methods to obtain safe motion plans in real time. By pre-training a collision risk estimation model offline instead of conducting online sampling-based risk estimation, learning-based p-Chekov significantly improves planning speed while maintaining chance-constraint satisfaction.

In the second scenario of learning new tasks, we first propose empowerment-based intrinsic motivation, a reinforcement learning (RL) approach that allows robots to learn novel tasks from only sparse or binary reward functions. By maximizing the mutual dependence between robot actions and environment states, namely the empowerment, this intrinsic motivation helps the agent focus its exploration on states where it can effectively "control" the environment rather than on regions where its actions have random, unpredictable consequences. Empirical evaluations in robotic manipulation environments with target objects of different shapes demonstrate that this empowerment-based intrinsic motivation obtains higher extrinsic task rewards faster than other state-of-the-art solutions to sparse-reward RL tasks.

The second approach we propose in this scenario is automatic curricula via expert demonstrations (ACED), an imitation learning method that leverages the idea of curriculum learning and allows robots to learn long-horizon tasks from only a handful of demonstration trajectories. By moving the reset states from the end to the beginning of the demonstrations as the learning agent improves, ACED not only learns challenging manipulation tasks with unseen initializations and goals, but also discovers novel solutions that are distinct from the demonstrations. In addition, ACED combines naturally with other imitation learning methods, using expert demonstrations more efficiently and allowing robotic manipulators to learn novel tasks that other state-of-the-art automatic curriculum learning methods cannot. In the experiments presented in this thesis, we show that combining ACED with behavior cloning allows pick-and-place tasks to be learned from as few as one demonstration and block-stacking tasks from twenty demonstrations.
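To make the first contribution concrete, here is a minimal sketch of the offline-then-online split that learning-based p-Chekov exploits: label sampled configurations with Monte Carlo collision-risk estimates once, fit a regressor, and at planning time replace each fresh sampling round with a single forward pass. The 7-DOF toy obstacle check, the feature layout, the additive per-waypoint risk allocation, and the function names below are illustrative assumptions, not the model used in the thesis.

```python
# Hypothetical sketch of the learning-based p-Chekov idea; the obstacle
# check, feature layout, and regressor choice are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def sample_collision_risk(q, sigma, n_samples=500):
    """Stand-in Monte Carlo oracle: perturb a 7-DOF configuration q by
    execution noise sigma and count collisions against a toy obstacle."""
    noisy = q + rng.normal(0.0, sigma, size=(n_samples, q.size))
    return float((np.linalg.norm(noisy, axis=1) < 1.0).mean())

# Offline phase: label random (configuration, noise scale) pairs with
# sampled risks, then fit the risk model once.
X = rng.uniform(-2.0, 2.0, size=(2000, 8))       # 7 joint angles + noise scale
X[:, 7] = rng.uniform(0.01, 0.2, size=2000)
y = np.array([sample_collision_risk(x[:7], x[7]) for x in X])
risk_model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500).fit(X, y)

# Online phase: a waypoint's risk is one forward pass rather than a fresh
# sampling round, which is where the planning-time savings come from.
def waypoint_risk(q, sigma):
    return float(risk_model.predict(np.concatenate([q, [sigma]])[None])[0])

plan = rng.uniform(-2.0, 2.0, size=(10, 7))      # toy 10-waypoint trajectory
total_risk = sum(waypoint_risk(q, 0.05) for q in plan)
print("chance constraint satisfied:", total_risk <= 0.05)
```

Summing per-waypoint risks is a simple union-bound style allocation used here only to keep the example short; the thesis's actual risk allocation is more careful, but the speed argument is the same.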
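The empowerment bonus described above can be illustrated in a tabular toy. The sketch below estimates, from rollout counts, the mutual information between a uniformly sampled action and the observed next state: states where each action has a distinct, predictable effect score near log(3) nats, while states whose outcomes ignore the action score near zero, which is exactly the signal that steers exploration toward "controllable" regions. The five-state dynamics are invented for illustration; the thesis works with continuous manipulation environments and a learned estimator rather than this count-based one.

```python
# Toy, count-based stand-in for an empowerment-style intrinsic reward:
# mutual information I(action; next state | state). All dynamics are invented.
import numpy as np

rng = np.random.default_rng(1)
N_STATES, N_ACTIONS = 5, 3

def step(s, a):
    """Toy tabular dynamics: in 'controllable' states each action leads to a
    distinct next state; in 'random' states the outcome ignores the action."""
    if s < 3:                          # controllable region
        return (s + a) % N_STATES
    return rng.integers(N_STATES)      # actions have no effect here

# Gather rollout counts with uniformly sampled actions.
counts = np.zeros((N_STATES, N_ACTIONS, N_STATES))
for _ in range(20000):
    s = rng.integers(N_STATES)
    a = rng.integers(N_ACTIONS)
    counts[s, a, step(s, a)] += 1

def empowerment(s):
    """Plug-in estimate of I(A; S') at state s, in nats."""
    p_as = counts[s] / counts[s].sum()               # joint p(a, s')
    p_a = p_as.sum(axis=1, keepdims=True)
    p_s = p_as.sum(axis=0, keepdims=True)
    nz = p_as > 0
    return float((p_as[nz] * np.log(p_as[nz] / (p_a @ p_s)[nz])).sum())

# Controllable states score near log(3) ~= 1.1; random states near zero,
# so an agent rewarded with this bonus explores where its actions matter.
print([round(empowerment(s), 3) for s in range(N_STATES)])
```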
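Finally, ACED's reverse-curriculum mechanism can be sketched in a few lines: reset each episode at a state drawn from a section of the demonstration, starting with the section nearest the goal, and move the reset section toward the demonstration's beginning whenever the recent success rate crosses a threshold. The section count, threshold, window size, and the simulated "skill" term standing in for real policy training are all hypothetical.

```python
# Hypothetical sketch of ACED's reverse curriculum over demonstration states.
# The thresholds and the simulated skill term replace real RL training.
import numpy as np

rng = np.random.default_rng(2)
demo = [np.array([t, 0.0]) for t in range(100)]   # one demo: 100 states
N_SECTIONS, SUCCESS_THRESHOLD, WINDOW = 5, 0.8, 50

recent, skill, section = [], 0.0, N_SECTIONS - 1  # start nearest the goal
while section >= 0:
    lo = section * 20
    reset = demo[int(rng.integers(lo, lo + 20))]  # reset inside this section
    # Stand-in for one RL episode from `reset`: success is more likely near
    # the goal (end of the demo) and as the simulated policy improves.
    p = min(1.0, 0.3 + 0.5 * (reset[0] / 100.0) + 0.7 * skill)
    recent.append(rng.random() < p)
    skill = min(1.0, skill + 0.0005)              # pretend training progresses
    recent = recent[-WINDOW:]
    if len(recent) == WINDOW and np.mean(recent) >= SUCCESS_THRESHOLD:
        print(f"section {section} mastered; moving resets toward the start")
        section, recent = section - 1, []
print("curriculum reached the demonstration's initial states")
```

The design point the sketch tries to convey is that the agent always trains on a problem it can nearly solve: early on it only has to finish the last stretch of the demonstrated task, and by the time resets reach the demonstration's initial states it has already mastered everything downstream.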