Send this in a short message: Learning attentional policies for tracking and recognition in video with deep networks