Learning to Maximize Speech Quality Directly Using MOS Prediction for Neural Text-to-Speech