Human robot interaction : speech recognition

The aim of this project is to develop a speech recognition system that can be used for human-robot interaction. The system receives speech input from users, analyzes it by extracting features from the speech signal, matches those features against the pre-recorded speech features stored in a trained database/codebook, and returns the best matching result to the user. The system is meant to provide an alternative, natural, and social way of interacting with a robot.


Bibliographic Details
Main Author: Tan, Roland Rustan.
Other Authors: Lau Wai Shing, Michael
Format: Final Year Project (FYP)
Language: English
Published: 2011
Subjects:
Online Access:http://hdl.handle.net/10356/42865
_version_ 1826114233724567552
author Tan, Roland Rustan.
author2 Lau Wai Shing, Michael
author_facet Lau Wai Shing, Michael
Tan, Roland Rustan.
author_sort Tan, Roland Rustan.
collection NTU
description The aim of this project is to develop a speech recognition system that can be used for human-robot interaction. The system receives speech input from users, analyzes it by extracting features from the speech signal, matches those features against the pre-recorded speech features stored in a trained database/codebook, and returns the best matching result to the user. The system is meant to provide an alternative, natural, and social way of interacting with a robot. Verbal interaction is popular in robotics, especially in personal assistive robots used to help the elderly and in entertainment robots. This project is limited to soccer-related commands and some entertainment functions, such as playing music. A speech recognition system needs both acoustic models and language models. The acoustic model is a collection of features extracted from pre-recorded speech; the Mel-Frequency Cepstral Coefficients (MFCC) algorithm was applied to extract these features from the speech signals. The language model is a large list of words and their probabilities of occurrence in a given sequence. For the purposes of this project, grammars, a special type of language model that defines constraints on the words expected as input, are used. Julius, an open-source speech recognition engine, was used in this project to enable human-robot verbal interaction. It was chosen after an experiment showed that Julius was more accurate than CMU-Sphinx 4, with an average accuracy of 84.865% versus 79.855% for CMU-Sphinx 4.
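The MFCC extraction pipeline mentioned in the abstract can be sketched as follows. This is an illustrative NumPy implementation of the standard steps (pre-emphasis, framing, windowing, power spectrum, mel filterbank, DCT); the parameter values are common defaults (16 kHz audio, 26 mel filters, 13 coefficients), not necessarily the settings used in the thesis.

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, frame_len=400, hop=160,
         n_mels=26, n_ceps=13):
    """Simplified MFCC sketch; assumed defaults, not the thesis's code."""
    # Pre-emphasis boosts high frequencies before analysis.
    emph = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Split into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emph) - frame_len) // hop
    frames = np.stack([emph[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    frames *= np.hamming(frame_len)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel filterbank, spaced evenly on the mel scale.
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = mel_to_hz(np.linspace(0, hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_energy = np.log(power @ fbank.T + 1e-10)
    # DCT-II decorrelates the log energies; keep the first n_ceps terms.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1)
                 / (2 * n_mels))
    return log_energy @ dct.T

feats = mfcc(np.random.randn(16000))  # one second of noise at 16 kHz
print(feats.shape)  # → (98, 13): 98 frames, 13 cepstral coefficients
```

In a recognizer such as Julius, frame-level feature vectors like these are what the acoustic model scores against, while the grammar constrains which word sequences are considered.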
first_indexed 2024-10-01T03:36:08Z
format Final Year Project (FYP)
id ntu-10356/42865
institution Nanyang Technological University
language English
last_indexed 2024-10-01T03:36:08Z
publishDate 2011
record_format dspace
spelling ntu-10356/428652023-03-04T19:01:57Z Human robot interaction : speech recognition Tan, Roland Rustan. Lau Wai Shing, Michael School of Mechanical and Aerospace Engineering Robotics Research Centre DRNTU::Engineering::Mechanical engineering::Robots The aim of this project is to develop a speech recognition system that can be used for human-robot interaction. The system receives speech input from users, analyzes it by extracting features from the speech signal, matches those features against the pre-recorded speech features stored in a trained database/codebook, and returns the best matching result to the user. The system is meant to provide an alternative, natural, and social way of interacting with a robot. Verbal interaction is popular in robotics, especially in personal assistive robots used to help the elderly and in entertainment robots. This project is limited to soccer-related commands and some entertainment functions, such as playing music. A speech recognition system needs both acoustic models and language models. The acoustic model is a collection of features extracted from pre-recorded speech; the Mel-Frequency Cepstral Coefficients (MFCC) algorithm was applied to extract these features from the speech signals. The language model is a large list of words and their probabilities of occurrence in a given sequence. For the purposes of this project, grammars, a special type of language model that defines constraints on the words expected as input, are used. Julius, an open-source speech recognition engine, was used in this project to enable human-robot verbal interaction. It was chosen after an experiment showed that Julius was more accurate than CMU-Sphinx 4, with an average accuracy of 84.865% versus 79.855% for CMU-Sphinx 4.
Bachelor of Engineering (Mechanical Engineering) 2011-01-25T05:00:48Z 2011-01-25T05:00:48Z 2011 2011 Final Year Project (FYP) http://hdl.handle.net/10356/42865 en Nanyang Technological University 114 p. application/pdf
spellingShingle DRNTU::Engineering::Mechanical engineering::Robots
Tan, Roland Rustan.
Human robot interaction : speech recognition
title Human robot interaction : speech recognition
title_full Human robot interaction : speech recognition
title_fullStr Human robot interaction : speech recognition
title_full_unstemmed Human robot interaction : speech recognition
title_short Human robot interaction : speech recognition
title_sort human robot interaction speech recognition
topic DRNTU::Engineering::Mechanical engineering::Robots
url http://hdl.handle.net/10356/42865
work_keys_str_mv AT tanrolandrustan humanrobotinteractionspeechrecognition