Feature Fusion Based on Main-Auxiliary Network for Speech Emotion Recognition