Oxford/IIIT TRECVID 2008 – notebook paper

The Oxford/IIIT team participated in the high-level feature extraction and interactive search tasks. A vision-only approach was used for both tasks, with no use of the text or audio information.

Full description

The Oxford/IIIT team participated in the high-level feature extraction and interactive search tasks. A vision-only approach was used for both tasks, with no use of the text or audio information.

For the high-level feature extraction task, we used two different approaches, both based on a combination of visual features. One used an SVM classifier with a linear combination of kernels; the other used a random forest classifier. For both methods, we trained all high-level features using publicly available annotations [3]. The advantage of the random forest classifier is its speed of training and testing.

In addition, for the people feature, we took a more targeted approach: we used a real-time face detector and an upper-body detector, both running on every frame. Our best-performing submission, C_OXVGG_1_1, which used a rank fusion of our random forest and SVM approaches, achieved an mAP of 0.101 and was above the median for all but one feature.

In the interactive search task, our team came third overall with an mAP of 0.158. The system was identical to last year's, with the only change being a source of accurate upper-body detections.
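
The abstract names an SVM over a linear combination of kernels but gives no implementation detail. The sketch below is only an illustration of that idea, assuming precomputed per-channel kernel matrices (here chi-squared kernels over toy histograms) and hand-picked combination weights; none of the feature channels, kernel choices, or weights come from the paper.

```python
import numpy as np
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC

def combined_kernel(feats_a, feats_b, weights):
    # One chi-squared kernel matrix per visual-feature channel,
    # summed with fixed linear weights (weights are assumptions).
    return sum(w * chi2_kernel(a, b)
               for w, (a, b) in zip(weights, zip(feats_a, feats_b)))

rng = np.random.default_rng(0)
n_train, n_test, dims = 120, 40, (300, 500)        # two hypothetical channels
train = [rng.random((n_train, d)) for d in dims]   # stand-ins for per-shot histograms
test = [rng.random((n_test, d)) for d in dims]
y = rng.integers(0, 2, n_train)                    # stand-in shot-level labels
weights = [0.6, 0.4]                               # assumed channel weights

K_train = combined_kernel(train, train, weights)   # (n_train, n_train)
K_test = combined_kernel(test, train, weights)     # (n_test, n_train)

clf = SVC(kernel="precomputed").fit(K_train, y)
shot_scores = clf.decision_function(K_test)        # ranking scores per test shot
```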
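
The per-frame face and upper-body detection for the people feature is likewise only named, not described. The following sketch swaps in OpenCV's stock Haar-cascade face detector purely to show what running a detector on every frame looks like; the submitted system used its own real-time face and upper-body detectors.

```python
import cv2

# OpenCV's bundled frontal-face Haar cascade, used here only as a stand-in
# for the real-time face and upper-body detectors mentioned above.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_per_frame(video_path):
    """Yield (frame_index, face_boxes) for every frame of the video."""
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        yield idx, list(boxes)
        idx += 1
    cap.release()
```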
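
Finally, the best run fuses the random-forest and SVM outputs at the rank level. The fusion rule is not spelled out in the abstract, so the sketch below simply averages the two per-shot ranks and scores the fused list with average precision; the rule and the toy data are assumptions.

```python
import numpy as np
from sklearn.metrics import average_precision_score

def scores_to_ranks(scores):
    """Rank shots so the highest-scoring shot gets rank 1."""
    order = np.argsort(-scores)
    ranks = np.empty(len(scores), dtype=int)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks

def rank_fusion(score_lists):
    """Average the per-classifier ranks; negate so higher means better."""
    ranks = np.stack([scores_to_ranks(s) for s in score_lists])
    return -ranks.mean(axis=0)

rng = np.random.default_rng(1)
svm_scores = rng.random(200)       # hypothetical per-shot SVM scores
forest_scores = rng.random(200)    # hypothetical per-shot random-forest scores
labels = rng.integers(0, 2, 200)   # hypothetical relevance judgements

fused = rank_fusion([svm_scores, forest_scores])
print("AP of the fused run:", average_precision_score(labels, fused))
```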

Bibliographic Details
Main Authors: Philbin, J; Marin-Jimenez, M; Srinivasan, S; Zisserman, A; Jain, M; Vempati, S; Sankar, P; Jawahar, CV
Format: Conference item
Language: English
Published: National Institute of Standards and Technology, 2008