Front-End of Vehicle-Embedded Speech Recognition for Voice-Driven Multi-UAVs Control
For reliable speech recognition, it is necessary to handle the usage environments. In this study, we target voice-driven multi-unmanned aerial vehicles (UAVs) control. Although many studies have introduced several systems for voice-driven UAV control, most have focused on a general speech recognitio...
Main Authors: | Jeong-Sik Park, Hyeong-Ju Na |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-09-01 |
Series: | Applied Sciences |
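Subjects: | speech recognition; voice-driven control; noise reduction; voice trigger; unmanned aerial vehicle (UAV); multi-UAVs control |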
Online Access: | https://www.mdpi.com/2076-3417/10/19/6876 |
_version_ | 1797552110401748992 |
---|---|
author | Jeong-Sik Park; Hyeong-Ju Na |
author_sort | Jeong-Sik Park |
collection | DOAJ |
description | For reliable speech recognition, the usage environment must be taken into account. In this study, we target voice-driven control of multiple unmanned aerial vehicles (UAVs). Although many studies have introduced systems for voice-driven UAV control, most have focused on a general speech recognition architecture to control a single UAV. However, for stable voice-controlled driving, it is essential to handle the environmental conditions of UAVs carefully, including environmental noise that deteriorates recognition accuracy, and the operating scheme, e.g., how to direct a target vehicle among multiple UAVs and switch targets using speech commands. To handle these issues, we propose an efficient vehicle-embedded speech recognition front-end for multi-UAV control via voice. First, we propose a noise reduction approach that considers non-stationary noise in outdoor environments. The proposed method improves on the conventional minimum mean squared error (MMSE) approach to handle non-stationary noise, e.g., babble and vehicle noise. In addition, we propose a multi-channel voice trigger method that can control multiple UAVs while efficiently directing and switching the target vehicle via speech commands. We evaluated the proposed methods on speech corpora, and the experimental results demonstrate that the proposed methods outperform the conventional approaches. In trigger word detection experiments, our approach yielded approximately 7%, 12%, and 3% relative improvements over spectral subtraction, adaptive comb filtering, and the conventional MMSE, respectively. In addition, the proposed multi-channel voice trigger approach achieved approximately 51% relative improvement over the conventional approach based on a single trigger word. |
first_indexed | 2024-03-10T15:55:18Z |
format | Article |
id | doaj.art-c982879b292e4eb78767fb0b237775cc |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T15:55:18Z |
publishDate | 2020-09-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-c982879b292e4eb78767fb0b237775cc; 2023-11-20T15:43:05Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2020-09-01; 10; 19; 6876; 10.3390/app10196876; Front-End of Vehicle-Embedded Speech Recognition for Voice-Driven Multi-UAVs Control; Jeong-Sik Park (0); Hyeong-Ju Na (1); Department of English Linguistics and Language Technology, Hankuk University of Foreign Studies, Seoul 02450, Korea; Department of English Linguistics, Hankuk University of Foreign Studies, Seoul 02450, Korea; https://www.mdpi.com/2076-3417/10/19/6876; speech recognition; voice-driven control; noise reduction; voice trigger; unmanned aerial vehicle (UAV); multi-UAVs control |
title | Front-End of Vehicle-Embedded Speech Recognition for Voice-Driven Multi-UAVs Control |
topic | speech recognition; voice-driven control; noise reduction; voice trigger; unmanned aerial vehicle (UAV); multi-UAVs control |
url | https://www.mdpi.com/2076-3417/10/19/6876 |
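
The description above mentions directing a target vehicle among multiple UAVs and switching targets using speech commands. The record does not say how the multi-channel voice trigger works, so the following is only a minimal sketch of one possible reading: it assumes each UAV is assigned its own trigger word and that commands without a trigger word go to the most recently selected UAV. The class `TriggerRouter`, the trigger words, and the UAV ids are all hypothetical and do not come from the article.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TriggerRouter:
    """Hypothetical router: one trigger word per UAV (an assumption, not the paper's method)."""
    triggers: Dict[str, str]                  # trigger word -> UAV id
    target: Optional[str] = None              # currently selected UAV, if any
    log: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, utterance: str) -> None:
        """Select a UAV if the utterance starts with a trigger word, then route the command."""
        words = utterance.lower().split()
        if words and words[0] in self.triggers:
            self.target = self.triggers[words[0]]   # direct or switch the target
            words = words[1:]
        if self.target is not None and words:
            self.log.append((self.target, " ".join(words)))
        # Utterances heard before any trigger word are ignored.


router = TriggerRouter({"alpha": "uav-1", "bravo": "uav-2"})
for text in ["take off", "Alpha take off", "move left", "bravo land"]:
    router.handle(text)
print(router.log)
```

In this sketch the first utterance is dropped because no UAV has been selected yet, and "bravo land" switches the target from `uav-1` to `uav-2`. A real front-end would operate on audio and detection scores rather than on already-transcribed text.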