Continuous lipreading based on acoustic temporal alignments
Abstract Visual speech recognition (VSR) is a challenging task that has received increasing interest during the last few decades. Current state of the art employs powerful end-to-end architectures based on deep learning which depend on large amounts of data and high computational resources for their...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: |
SpringerOpen
2024-05-01
|
Series: | EURASIP Journal on Audio, Speech, and Music Processing |
Subjects: | |
Online Access: | https://doi.org/10.1186/s13636-024-00345-7 |