Evolving heterotic gauge backgrounds: genetic algorithms versus reinforcement learning