High-Performance Visual Tracking Based on High-Order Pooling Network

Bibliographic Details
Main Authors: Xinxi Feng, Lei Pu
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9899441/
Description
Summary: Convolutional Neural Network (CNN) features have been widely used in visual tracking due to their powerful representation ability. The pooling layer is an important component of a CNN, but the max/average/min operations capture only first-order information, which limits the discriminative ability of CNN features in some complex situations. In this paper, a high-order pooling layer is integrated into the VGG16 network for visual tracking. Specifically, a high-order covariance pooling layer replaces the last max-pooling layer to learn discriminative features, and the network is trained on the ImageNet and CUB200-2011 data sets. In the tracking stage, feature maps from multiple levels are extracted as the appearance representation of the target. The extracted CNN features are then integrated into the correlation filter framework for online tracking. The experimental results show that the proposed algorithm achieves excellent performance in both success rate and tracking accuracy.
ISSN: 2169-3536
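
Illustrative note: the summary contrasts first-order pooling (max/average/min) with covariance pooling. The following is a minimal NumPy sketch of plain second-order pooling over one feature map, not the authors' implementation. The function name `covariance_pooling`, the `eps` ridge term, and the division by the number of spatial positions are assumptions made for illustration; the record does not say which normalization the paper applies.

```python
import numpy as np


def covariance_pooling(feature_map, eps=1e-5):
    """Pool a (C, H, W) feature map into a (C, C) covariance matrix.

    Hypothetical helper for illustration only; the paper's layer may add
    further normalization that this record does not describe.
    """
    c, h, w = feature_map.shape
    # Treat each spatial position as one C-dimensional sample.
    x = feature_map.reshape(c, h * w)
    # Centre every channel so the product below is a covariance.
    x = x - x.mean(axis=1, keepdims=True)
    # Channel-by-channel second-order statistics (biased estimate, divide by N).
    cov = x @ x.T / (h * w)
    # Small ridge (assumption) keeps the matrix positive definite.
    return cov + eps * np.eye(c)


rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 7, 7))   # toy map: 8 channels, 7 x 7 positions
pooled = covariance_pooling(fmap)
print(pooled.shape)                      # (8, 8)
```

Unlike max or average pooling, which return one value per channel, the output here records how every pair of channels co-varies across spatial positions, which is the "high-order" information the summary refers to.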