A Speech Recognition Model Building Method Combined Dynamic Convolution and Multi-Head Self-Attention Mechanism
The Conformer enhances the Transformer by serially connecting a convolution module to multi-head self-attention (MHSA). This strengthens local attention computation and yields better results in automatic speech recognition. This paper proposes a hybrid attention mechanism which combines the dynam...
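The abstract describes combining a convolution branch with multi-head self-attention. The record is truncated, so the paper's exact fusion is not given here; the following is a minimal numpy sketch of one plausible hybrid, where a single-head self-attention output is summed with a dynamic (input-conditioned) depthwise convolution output. All weight names (`Wq`, `Wk`, `Wv`, `Wk_proj`) and the summation fusion are illustrative assumptions, not the paper's method.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention over a (T, d) sequence.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

def dynamic_conv(X, Wk_proj, kernel_size=3):
    # Dynamic convolution: the kernel weights are predicted from each
    # position's own features (softmax-normalized over the kernel taps),
    # then applied as a position-wise weighted sum over a local window.
    T, d = X.shape
    kernels = softmax(X @ Wk_proj, axis=-1)   # (T, kernel_size)
    pad = kernel_size // 2
    Xp = np.pad(X, ((pad, pad), (0, 0)))      # zero-pad the time axis
    out = np.zeros_like(X)
    for t in range(T):
        window = Xp[t:t + kernel_size]        # (kernel_size, d)
        out[t] = kernels[t] @ window          # weighted sum of neighbors
    return out

rng = np.random.default_rng(0)
T, d, k = 5, 8, 3
X = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
Wk_proj = rng.standard_normal((d, k)) * 0.1

# Hybrid: global (attention) and local (dynamic conv) branches, fused by sum.
Y = self_attention(X, Wq, Wk, Wv) + dynamic_conv(X, Wk_proj)
print(Y.shape)  # (5, 8)
```

The sketch keeps both branches shape-preserving, so they can be fused element-wise; the truncated abstract does not say whether the paper fuses by summation, gating, or concatenation.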
Main Authors: Wei Liu, Jiaming Sun, Yiming Sun, Chunyi Chen
Format: Article
Language: English
Published: MDPI AG, 2022-05-01
Series: Electronics
Online Access: https://www.mdpi.com/2079-9292/11/10/1656
Similar Items
- Speech Emotion Recognition Using Convolution Neural Networks and Multi-Head Convolutional Transformer
  by: Rizwan Ullah, et al.
  Published: (2023-07-01)
- Head Fusion: A Method to Improve Accuracy and Robustness of Speech Emotion Recognition
  by: XU Ming-ke, ZHANG Fan
  Published: (2022-07-01)
- NSE-CATNet: Deep Neural Speech Enhancement Using Convolutional Attention Transformer Network
  by: Nasir Saleem, et al.
  Published: (2023-01-01)
- LAS-Transformer: An Enhanced Transformer Based on the Local Attention Mechanism for Speech Recognition
  by: Pengbin Fu, et al.
  Published: (2022-05-01)
- Speech Emotion Recognition Based on Self-Attention Weight Correction for Acoustic and Text Features
  by: Jennifer Santoso, et al.
  Published: (2022-01-01)