Integrated visual transformer and flash attention for lip-to-speech generation GAN
Abstract Lip-to-Speech (LTS) generation is an emerging technology that is highly visible, widely supported, and rapidly evolving. LTS has a wide range of promising applications, including assisting speech impairment and improving speech interaction in virtual assistants and robots. However, the tech...
Main authors: | , , , |
---|---|
Material type: | Article |
Language: | English |
Published: | Nature Portfolio, 2024-02-01 |
Series: | Scientific Reports |
Links: | https://doi.org/10.1038/s41598-024-55248-6 |