MTL-FFDET: A Multi-Task Learning-Based Model for Forest Fire Detection

Bibliographic Details
Main Authors: Kangjie Lu, Jingwen Huang, Junhui Li, Jiashun Zhou, Xianliang Chen, Yunfei Liu
Format: Article
Language: English
Published: MDPI AG 2022-09-01
Series: Forests
Online Access: https://www.mdpi.com/1999-4907/13/9/1448
Description
Summary: Deep learning-based forest fire vision monitoring methods have developed rapidly and are becoming mainstream. Existing methods, however, depend on enormous amounts of data and suffer from weak feature extraction, poor small-target recognition, and many missed and false detections in complex forest scenes. To solve these problems, we propose a multi-task learning-based forest fire detection model (MTL-FFDet), which comprises three tasks (detection, segmentation and classification) that share a feature extraction module. In addition, to improve detection accuracy and reduce missed and false detections, we propose a joint multi-task non-maximum suppression (NMS) processing algorithm that fully utilizes the advantages of each task. Furthermore, because the pieces of a divided flame target in an image are still flame targets, our proposed data augmentation strategy, a diagonal swap of random origin, is a good remedy for the poor detection performance caused by small fire targets. Experiments showed that our model outperforms YOLOv5-s in mAP (mean average precision) by 3.2%, AP_S (average precision for small objects) by 4.8%, AR_S (average recall for small objects) by 4.0%, and other metrics by 1% to 2%. Finally, the visualization analysis showed that our multi-task model can focus on the target region better than the single-task model during feature extraction, with superior extraction ability.
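The summary names a "diagonal swap of random origin" augmentation without giving its details. The sketch below is one plausible reading, not the authors' implementation: a random interior point splits the image into four quadrants, and each quadrant trades places with its diagonal opposite, so a flame that straddles the split is cut into smaller flame pieces. The function name `diagonal_swap`, the `origin` and `rng` parameters, and the list-of-rows image representation are all assumptions made for illustration.

```python
import random


def diagonal_swap(image, origin=None, rng=random):
    """Swap diagonally opposite quadrants of an image around an origin.

    Hypothetical helper, not taken from the paper.
    `image` is a list of rows, each row a list of pixel values,
    and must be at least 2 x 2 when `origin` is not given.
    `origin` is an optional (row, col) split point; when omitted,
    a random interior point is drawn from `rng`.
    """
    height, width = len(image), len(image[0])
    if origin is None:
        # Stay strictly inside the image so all four quadrants are non-empty.
        y = rng.randint(1, height - 1)
        x = rng.randint(1, width - 1)
    else:
        y, x = origin

    top, bottom = image[:y], image[y:]

    # Within each row, move the right part in front of the left part;
    # then place the bottom rows above the top rows. Together this puts
    # the bottom-right quadrant in the top-left corner, and so on.
    def swap_columns(row):
        return row[x:] + row[:x]

    return [swap_columns(r) for r in bottom] + [swap_columns(r) for r in top]


if __name__ == "__main__":
    sample = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]
    for row in diagonal_swap(sample, origin=(1, 2)):
        print(row)
```

In a detection pipeline the bounding boxes would also have to be shifted, and any box crossing the split would have to be divided into separate boxes; that bookkeeping is omitted here.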
ISSN: 1999-4907