End-to-End Sentence-Level Multi-View Lipreading Architecture with Spatial Attention Module Integrated Multiple CNNs and Cascaded Local Self-Attention-CTC

Concomitant with the recent advances in deep learning, automatic speech recognition and visual speech recognition (VSR) have received considerable attention. However, although VSR systems must identify speech from both frontal and profile faces in real-world scenarios, most VSR studies have focused...

Bibliographic Details
Main Authors:	Sanghun Jeon, Mun Sang Kim
Format:	Article
Language:	English
Published:	MDPI AG 2022-05-01
Series:	Sensors
Subjects:	lipreading; visual speech recognition; multi-view VSR; deep learning; attention mechanism; spatial attention module
Online Access:	https://www.mdpi.com/1424-8220/22/9/3597

Similar Items

Efficient End-to-End Sentence-Level Lipreading with Temporal Convolutional Networks
by: Tao Zhang, et al.
Published: (2021-07-01)

Improving Hybrid CTC/Attention Architecture with Time-Restricted Self-Attention CTC for End-to-End Speech Recognition
by: Long Wu, et al.
Published: (2019-10-01)

Lipreading Architecture Based on Multiple Convolutional Neural Networks for Sentence-Level Visual Speech Recognition
by: Sanghun Jeon, et al.
Published: (2021-12-01)

A Survey of Research on Lipreading Technology
by: Mingfeng Hao, et al.
Published: (2020-01-01)

A representation of abstract linguistic categories in the visual system underlies successful lipreading
by: Aaron R Nidiffer, et al.
Published: (2023-11-01)

Designing Sara Lipreading Test (No.2) and Implementing it in Hearing Adults
by: Gita Movallali, et al.
Published: (2003-07-01)

Lipreading a naturalistic narrative in a female population: Neural characteristics shared with listening and reading
by: Satu Saalasti, et al.
Published: (2023-02-01)

Learning the Relative Dynamic Features for Word-Level Lipreading
by: Hao Li, et al.
Published: (2022-05-01)

Lipreading Using Liquid State Machine with STDP-Tuning
by: Xuhu Yu, et al.
Published: (2022-10-01)

Multi-Angle Lipreading with Angle Classification-Based Feature Extraction and Its Application to Audio-Visual Speech Recognition
by: Shinnosuke Isobe, et al.
Published: (2021-07-01)

Alternative Visual Units for an Optimized Phoneme-Based Lipreading System
by: Helen L. Bear, et al.
Published: (2019-09-01)

Modality-Specific Perceptual Learning of Vocoded Auditory versus Lipread Speech: Different Effects of Prior Information
by: Lynne E. Bernstein, et al.
Published: (2023-06-01)

Multi-Attention-Guided Cascading Network for End-to-End Person Search
by: Jianxi Yang, et al.
Published: (2023-04-01)

Read my lips: Artificial intelligence word-level Arabic lipreading system
by: Waleed Dweik, et al.
Published: (2022-12-01)

Dual-attention Network for View-invariant Action Recognition
by: Gedamu Alemu Kumie, et al.
Published: (2023-07-01)

Digital platform attention and international sales: an attention-based view
by: Li, Jingyu, et al.
Published: (2023)

Sub-convolutional U-Net with transformer attention network for end-to-end single-channel speech enhancement
by: Sivaramakrishna Yecchuri, et al.
Published: (2024-02-01)

Noise-Robust Multimodal Audio-Visual Speech Recognition System for Speech-Based Interaction Applications
by: Sanghun Jeon, et al.
Published: (2022-10-01)

End-to-End Automatic Pronunciation Error Detection Based on Improved Hybrid CTC/Attention Architecture
by: Long Zhang, et al.
Published: (2020-03-01)

Recurrent 3D attentional networks for end-to-end active object recognition
by: Min Liu, et al.
Published: (2019-04-01)

Artificial visual speech synchronized with a speech synthesis system
by: Bothe, H. H., et al.

Improving Hybrid CTC/Attention Architecture for Agglutinative Language Speech Recognition
by: Zeyu Ren, et al.
Published: (2022-09-01)

Pre-Alignment Guided Attention for Improving Training Efficiency and Model Stability in End-to-End Speech Synthesis
by: Xiaolian Zhu, et al.
Published: (2019-01-01)

An Attention-Based Architecture for Hierarchical Classification With CNNs
by: Ivan Pizarro, et al.
Published: (2023-01-01)

AMCFCN: attentive multi-view contrastive fusion clustering net
by: Huarun Xiao, et al.
Published: (2024-03-01)

LCAM: Low-Complexity Attention Module for Lightweight Face Recognition Networks
by: Seng Chun Hoo, et al.
Published: (2023-04-01)

Video object segmentation via attention‐modulating networks
by: Runfa Tang, et al.
Published: (2019-04-01)

Neurofeedback Training of Auditory Selective Attention Enhances Speech-In-Noise Perception
by: Subong Kim, et al.
Published: (2021-06-01)

Abstractive Sentence Compression with Event Attention
by: Su Jeong Choi, et al.
Published: (2019-09-01)

Multibranch Attention Mechanism Based on Channel and Spatial Attention Fusion
by: Guojun Mao, et al.
Published: (2022-11-01)

An Improved End-to-End Multi-Target Tracking Method Based on Transformer Self-Attention
by: Yong Hong, et al.
Published: (2022-12-01)

Deep CNNs With Self-Attention for Speaker Identification
by: Nguyen Nang An, et al.
Published: (2019-01-01)

Cross-View Gait Recognition Model Combining Multi-Scale Feature Residual Structure and Self-Attention Mechanism
by: Jingxue Wang, et al.
Published: (2023-01-01)

An End to End Framework With Adaptive Spatio-Temporal Attention Module for Human Action Recognition
by: Shaocan Liu, et al.
Published: (2020-01-01)

A-SATMVSNet: An attention-aware multi-view stereo matching network based on satellite imagery
by: Li Lin, et al.
Published: (2023-04-01)

MS-ALN: Multiscale Attention Learning Network for Pest Recognition
by: Fuxiang Feng, et al.
Published: (2022-01-01)

Multiscale Cascaded Attention Network for Saliency Detection Based on ResNet
by: Muwei Jian, et al.
Published: (2022-12-01)

Exploiting a Spatial Attention Mechanism for Improved Depth Completion and Feature Fusion in Novel View Synthesis
by: Anh Minh Truong, et al.
Published: (2024-01-01)

Attention-based latent features for jointly trained end-to-end automatic speech recognition with modified speech enhancement
by: Da-Hee Yang, et al.
Published: (2023-03-01)

Fully Attentional Network for Skeleton-Based Action Recognition
by: Caifeng Liu, et al.
Published: (2023-01-01)