Human robot interaction : speech recognition

The aim of this project is to develop a speech recognition system that can be used for human-robot interaction. The system receives speech input from users, analyzes it by extracting features from the speech signal, matches those features against the pre-recorded speech features stored in a trained database/codebook, and returns the best matching result to the user. The system is meant to provide an alternative, natural, and social way of interacting with a robot.


Bibliographic Details
Main Author: Tan, Roland Rustan.
Other Authors: Lau Wai Shing, Michael
Format: Final Year Project (FYP)
Language: English
Published: 2011
Subjects:
Online Access:http://hdl.handle.net/10356/42865
_version_ 1826114233724567552
author Tan, Roland Rustan.
author2 Lau Wai Shing, Michael
author_facet Lau Wai Shing, Michael
Tan, Roland Rustan.
author_sort Tan, Roland Rustan.
collection NTU
description The aim of this project is to develop a speech recognition system that can be used for human-robot interaction. The system receives speech input from users, analyzes it by extracting features from the speech signal, matches those features against the pre-recorded speech features stored in a trained database/codebook, and returns the best matching result to the user. The system is meant to provide an alternative, natural, and social way of interacting with a robot. Verbal interaction is popular in robotics, especially in personal assistive robots used to help the elderly and in entertainment robots. This project is limited to soccer-related commands and some entertainment functions, such as playing music. A speech recognition system needs both acoustic models and language models. The acoustic model is a collection of features extracted from pre-recorded speech; the Mel-Frequency Cepstral Coefficients (MFCC) algorithm was applied to extract these features from the speech signals. The language model is a large list of words and their probabilities of occurrence in a given sequence. For the purposes of this project, grammars, a special type of language model that defines constraints on the words expected as input, are used. Julius, an open-source speech recognition engine, was used in this project to enable human-robot verbal interaction. It was chosen after an experiment showed that Julius was more accurate than CMU-Sphinx 4, with an average accuracy of 84.865% versus 79.855% for CMU-Sphinx 4.
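The MFCC extraction pipeline mentioned in the abstract can be sketched as follows. This is an illustrative NumPy implementation of the standard steps (pre-emphasis, framing, windowing, power spectrum, mel filterbank, DCT); the parameter values are common defaults (16 kHz audio, 26 mel filters, 13 coefficients), not necessarily the settings used in the thesis.

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, frame_len=400, hop=160,
         n_mels=26, n_ceps=13):
    """Simplified MFCC sketch; assumed defaults, not the thesis's code."""
    # Pre-emphasis boosts high frequencies before analysis.
    emph = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Split into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emph) - frame_len) // hop
    frames = np.stack([emph[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    frames *= np.hamming(frame_len)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel filterbank, spaced evenly on the mel scale.
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = mel_to_hz(np.linspace(0, hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_energy = np.log(power @ fbank.T + 1e-10)
    # DCT-II decorrelates the log energies; keep the first n_ceps terms.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1)
                 / (2 * n_mels))
    return log_energy @ dct.T

feats = mfcc(np.random.randn(16000))  # one second of noise at 16 kHz
print(feats.shape)  # → (98, 13): 98 frames, 13 cepstral coefficients
```

In a recognizer such as Julius, frame-level feature vectors like these are what the acoustic model scores against, while the grammar constrains which word sequences are considered.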
first_indexed 2024-10-01T03:36:08Z
format Final Year Project (FYP)
id ntu-10356/42865
institution Nanyang Technological University
language English
last_indexed 2024-10-01T03:36:08Z
publishDate 2011
record_format dspace
spelling ntu-10356/428652023-03-04T19:01:57Z Human robot interaction : speech recognition Tan, Roland Rustan. Lau Wai Shing, Michael School of Mechanical and Aerospace Engineering Robotics Research Centre DRNTU::Engineering::Mechanical engineering::Robots The aim of this project is to develop a speech recognition system that can be used for human-robot interaction. The system receives speech input from users, analyzes it by extracting features from the speech signal, matches those features against the pre-recorded speech features stored in a trained database/codebook, and returns the best matching result to the user. The system is meant to provide an alternative, natural, and social way of interacting with a robot. Verbal interaction is popular in robotics, especially in personal assistive robots used to help the elderly and in entertainment robots. This project is limited to soccer-related commands and some entertainment functions, such as playing music. A speech recognition system needs both acoustic models and language models. The acoustic model is a collection of features extracted from pre-recorded speech; the Mel-Frequency Cepstral Coefficients (MFCC) algorithm was applied to extract these features from the speech signals. The language model is a large list of words and their probabilities of occurrence in a given sequence. For the purposes of this project, grammars, a special type of language model that defines constraints on the words expected as input, are used. Julius, an open-source speech recognition engine, was used in this project to enable human-robot verbal interaction. It was chosen after an experiment showed that Julius was more accurate than CMU-Sphinx 4, with an average accuracy of 84.865% versus 79.855% for CMU-Sphinx 4.
Bachelor of Engineering (Mechanical Engineering) 2011-01-25T05:00:48Z 2011-01-25T05:00:48Z 2011 2011 Final Year Project (FYP) http://hdl.handle.net/10356/42865 en Nanyang Technological University 114 p. application/pdf
spellingShingle DRNTU::Engineering::Mechanical engineering::Robots
Tan, Roland Rustan.
Human robot interaction : speech recognition
title Human robot interaction : speech recognition
title_full Human robot interaction : speech recognition
title_fullStr Human robot interaction : speech recognition
title_full_unstemmed Human robot interaction : speech recognition
title_short Human robot interaction : speech recognition
title_sort human robot interaction speech recognition
topic DRNTU::Engineering::Mechanical engineering::Robots
url http://hdl.handle.net/10356/42865
work_keys_str_mv AT tanrolandrustan humanrobotinteractionspeechrecognition