Integrated visual transformer and flash attention for lip-to-speech generation GAN

Abstract: Lip-to-Speech (LTS) generation is an emerging technology that is highly visible, widely supported, and rapidly evolving. LTS has a wide range of promising applications, including assisting people with speech impairments and improving speech interaction in virtual assistants and robots. However, the tech...

Bibliographic details
Main authors: Qiong Yang, Yuxuan Bai, Feng Liu, Wei Zhang
Format: Article
Language: English
Published: Nature Portfolio 2024-02-01
Series: Scientific Reports
Links: https://doi.org/10.1038/s41598-024-55248-6