On‐device audio‐visual multi‐person wake word spotting
Abstract Audio‐visual wake word spotting is a challenging multi‐modal task that exploits visual information of lip motion patterns to supplement acoustic speech to improve overall detection performance. However, most audio‐visual wake word spotting models are only suitable for simple single‐speaker...
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
Wiley
2023-12-01
|
Series: | CAAI Transactions on Intelligence Technology |
Subjects: | |
Online Access: | https://doi.org/10.1049/cit2.12189 |