M2ER: Multimodal Emotion Recognition Based on Multi-Party Dialogue Scenarios