Text this: Generative inverse reinforcement learning for learning 2-opt heuristics without extrinsic rewards in routing problems