Learning attentional policies for tracking and recognition in video with deep networks