Summary: | Human communication relies on more than words. To communicate smoothly, people use nonverbal cues such as nodding during embodied interaction. Nodding is an affirmative gesture commonly used worldwide. Accurate nodding detection is expected to support communication, enable the analysis of conversation scenes, and facilitate embodied interaction. Nodding is known to be related not only to head motion but also to the rhythm of the voice, so by using both head motion and voice rhythm, nodding can be estimated more accurately than with methods that use only one of them. Therefore, in this study, we develop a nodding detection system based on head motion and voice rhythm. In this system, the listener's head motion is measured using face tracking, and a neural network estimates the nodding motion from that head motion. Neural networks are also used to estimate, from the speaker's voice, the timings at which the listener nods. Nodding is then estimated by combining the outputs of the head-motion and voice-rhythm neural networks with a logical OR and a logical AND. In addition, a neural network that integrates head motion and voice rhythm is developed. The effectiveness of the developed methods is demonstrated through evaluation experiments.
|
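The logical OR and logical AND combination mentioned in the summary can be illustrated with a minimal sketch. It assumes each network outputs one nodding score per time frame and that a score at or above 0.5 counts as a detection. The function names, the threshold, and the sample scores are hypothetical and are not taken from the study.

```python
from typing import List


def binarize(scores: List[float], threshold: float = 0.5) -> List[bool]:
    """Turn per-frame scores into per-frame detections (threshold is assumed)."""
    return [s >= threshold for s in scores]


def fuse_or(head: List[bool], voice: List[bool]) -> List[bool]:
    """A frame counts as a nod if either detector fires."""
    return [h or v for h, v in zip(head, voice)]


def fuse_and(head: List[bool], voice: List[bool]) -> List[bool]:
    """A frame counts as a nod only if both detectors fire."""
    return [h and v for h, v in zip(head, voice)]


# Hypothetical per-frame scores from the two networks (illustrative only).
head_scores = [0.9, 0.2, 0.7, 0.1]
voice_scores = [0.8, 0.6, 0.3, 0.2]

head_nod = binarize(head_scores)
voice_nod = binarize(voice_scores)

or_result = fuse_or(head_nod, voice_nod)
and_result = fuse_and(head_nod, voice_nod)

print("OR :", or_result)
print("AND:", and_result)
```

In this sketch the OR rule accepts a frame when either source indicates a nod, while the AND rule accepts it only when both agree, so OR tends to favor sensitivity and AND tends to favor precision.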