Improved CNN-based mouth position and status detection

Bibliographic Details
Main Author: Chok, Yong Sheng
Format: Thesis
Language: English
Published: 2022
Subjects:
Online Access: http://eprints.utm.my/99383/1/ChokYongShengMSKE2022.pdf
Description
Summary: A mouth position and status detection system plays an important role in auto-feeding systems for paralyzed people. By identifying whether the mouth is open or closed and locating the open mouth, the system can choose the correct moment to feed patients with robotic arms. Two major problems motivate this project. First, existing mouth status recognition networks are built and executed on high-end, costly hardware. Second, existing CNN-based mouth status detection systems are less accurate: the highest accuracy in the reviewed work is only 86.8% for three-state mouth status detection. To address these problems, the project has two research objectives. The first is to develop a high-accuracy, lightweight CNN-based model for mouth status detection on the Python platform. The second is to shorten the inference time of the CNN-based model by resizing some of the convolution layers. In the methodology, the primary task is to train a mouth status detection CNN model with high accuracy. The face image datasets fed to the model during training are diverse, covering different human races and shooting angles. YOLOv5 is chosen as the pre-trained network because of its outstanding performance. The YOLOv5 backbone convolution layers are resized to shorten the inference time and reduce the model size. The developed CNN-based model achieved the targeted performance of 96.8% and reduced inference time by 21.90% and model size by 13.20% compared with the original model before enhancement.
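
The summary does not say how the backbone convolution layers were resized. As a rough illustration only, the sketch below assumes the resizing narrows the output channels of each layer by a width multiplier (YOLOv5 model configs expose a similar `width_multiple` setting) and counts the resulting convolution weights. The layer list, the 0.75 multiplier, and all function names are hypothetical and are not taken from the thesis.

```python
import math


def conv_params(c_in, c_out, k):
    """Number of weights in a standard k x k 2D convolution (no bias)."""
    return c_in * c_out * k * k


def scale_channels(channels, width_multiple, divisor=8):
    """Scale a channel count and round up to a multiple of `divisor`."""
    return max(divisor, math.ceil(channels * width_multiple / divisor) * divisor)


def resize_backbone(layers, width_multiple):
    """Narrow every layer's output channels; the image input channels stay fixed."""
    resized = []
    prev_out = layers[0][0]
    for _, c_out, k in layers:
        new_out = scale_channels(c_out, width_multiple)
        resized.append((prev_out, new_out, k))
        prev_out = new_out
    return resized


def total_params(layers):
    """Sum the convolution weights over all layers."""
    return sum(conv_params(c_in, c_out, k) for c_in, c_out, k in layers)


def percent_reduction(before, after):
    """Relative reduction in percent, e.g. for model size or inference time."""
    return 100.0 * (before - after) / before


# Hypothetical backbone: (in_channels, out_channels, kernel_size) per layer.
backbone = [(3, 64, 6), (64, 128, 3), (128, 256, 3), (256, 512, 3)]
narrow = resize_backbone(backbone, width_multiple=0.75)

if __name__ == "__main__":
    before, after = total_params(backbone), total_params(narrow)
    print(f"weights before: {before}, after: {after}")
    print(f"reduction: {percent_reduction(before, after):.2f}%")
```

The same `percent_reduction` formula is the usual way to express figures such as the reported 21.90% and 13.20% improvements: the difference between the original and the resized measurement, divided by the original.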