Learning to Plan via Deep Optimistic Value Exploration