Summary: | While the rapid advancement of deep learning and grasp affordance prediction has enabled fast planning of grasping poses directly from visual inputs, such methods still commonly adopt an open-loop architecture that makes them slow to react and prone to failure, limiting their use in more complicated manipulation problems that require online adaptation and dexterous interactions. Although designing behavior trees may serve as a short-term fix for simple manipulation tasks, such an approach does not scale as the complexity of the desired skill increases. Modern deep reinforcement learning techniques have shown good performance in learning closed-loop policies for a selected set of atomic manipulation skills, but they often require a prohibitive amount of training data and lack interpretability in the learned policy models. To reduce the cost of learning closed-loop manipulation controllers and facilitate more transparency, we propose a model-based reinforcement learning algorithm. Our algorithm learns deep probabilistic hybrid automata (DPHA), a novel graphical model that predicts both low-level state evolution and high-level transitions among distinct modes. We also show that the discrete modes that naturally arise when learning DPHA models can provide promising insights that reveal semantically meaningful intentions and uncover potentially generalizable skills. We present a sampling-based model predictive control algorithm that leverages the DPHA model to plan actions over spatially and temporally extended horizons. Our benchmarks show that these algorithms achieve comparable asymptotic performance with up to 10 times less training data than standard baseline algorithms on pushing and grasping problems.