Front-End of Vehicle-Embedded Speech Recognition for Voice-Driven Multi-UAVs Control

Bibliographic Details
Main Authors: Jeong-Sik Park, Hyeong-Ju Na
Format: Article
Language: English
Published: MDPI AG 2020-09-01
Series: Applied Sciences
Subjects:
Online Access: https://www.mdpi.com/2076-3417/10/19/6876
collection DOAJ
description For reliable speech recognition, it is necessary to handle the usage environments. In this study, we target voice-driven multi-unmanned aerial vehicles (UAVs) control. Although many studies have introduced several systems for voice-driven UAV control, most have focused on a general speech recognition architecture to control a single UAV. However, for stable voice-controlled driving, it is essential to handle the environmental conditions of UAVs carefully, including environmental noise that deteriorates recognition accuracy, and the operating scheme, e.g., how to direct a target vehicle among multiple UAVs and switch targets using speech commands. To handle these issues, we propose an efficient vehicle-embedded speech recognition front-end for multi-UAV control via voice. First, we propose a noise reduction approach that considers non-stationary noise in outdoor environments. The proposed method improves the conventional minimum mean squared error (MMSE) approach to handle non-stationary noises, e.g., babble and vehicle noises. In addition, we propose a multi-channel voice trigger method that can control multiple UAVs while efficiently directing and switching the target vehicle via speech commands. We evaluated the proposed methods on speech corpora, and the experimental results demonstrate that the proposed methods outperform the conventional approaches. In trigger word detection experiments, our approach yielded approximately 7%, 12%, and 3% relative improvements over spectral subtraction, adaptive comb filtering, and the conventional MMSE, respectively. In addition, the proposed multi-channel voice trigger approach achieved approximately 51% relative improvement over the conventional approach based on a single trigger word.
first_indexed 2024-03-10T15:55:18Z
id doaj.art-c982879b292e4eb78767fb0b237775cc
institution Directory Open Access Journal
issn 2076-3417
citation Applied Sciences, vol. 10, no. 19, article 6876 (2020-09-01)
doi 10.3390/app10196876
affiliation Jeong-Sik Park: Department of English Linguistics and Language Technology, Hankuk University of Foreign Studies, Seoul 02450, Korea
Hyeong-Ju Na: Department of English Linguistics, Hankuk University of Foreign Studies, Seoul 02450, Korea
topic speech recognition
voice-driven control
noise reduction
voice trigger
unmanned aerial vehicle (UAV)
multi-UAVs control