Summary: | Existing Siamese-network-based trackers are easily disturbed by large deformation, occlusion, and distractor objects in the background. Comparing these trackers, we observe that their monotonous positive pairs usually contain few challenging factors (occlusion, deformation, etc.), which may make the learned features less robust. In addition, the foreground information in the large amount of training data is used directly, without deeper exploration, so the trackers cannot effectively discriminate the foreground from semantic backgrounds. In this paper, we improve the Siamese tracker by enriching the positive pairs and making fuller use of the foreground information. During the offline training phase, a simple sampling strategy enriches the challenging factors in the positive pairs, which enhances the robustness of the tracker. At the same time, we highlight the foreground information by padding the background, and use this information to define a novel padding loss that guides the tracker to pay less attention to distractors in the background. Moreover, an improved feature fusion scheme is adopted to update the template, so that the tracker can adapt to drastic appearance changes. Comprehensive experiments on the OTB and VOT benchmarks demonstrate that the proposed tracker achieves strong performance in both accuracy and robustness.