Oxford/IIIT TRECVID 2008 – notebook paper
The Oxford/IIIT team participated in the high-level feature extraction and interactive search tasks. A vision-only approach was used for both tasks, with no use of the text or audio information.
Main Authors: | Philbin, J, Marin-Jimenez, M, Srinivasan, S, Zisserman, A, Jain, M, Vempati, S, Sankar, P, Jawahar, CV |
---|---|
Format: | Conference item |
Language: | English |
Published: | National Institute of Standards and Technology, 2008 |
author | Philbin, J Marin-Jimenez, M Srinivasan, S Zisserman, A Jain, M Vempati, S Sankar, P Jawahar, CV |
author_sort | Philbin, J |
collection | OXFORD |
description | The Oxford/IIIT team participated in the high-level feature extraction and interactive search tasks. A vision-only approach was used for both tasks, with no use of the text or audio information.
For the high-level feature extraction task, we used two different approaches, both based on a combination of visual features. One used an SVM classifier with a linear combination of kernels; the other used a random forest classifier (both routes are sketched below). For both methods, we trained all high-level features using publicly available annotations [3]. The advantage of the random forest classifier is its speed of training and testing.
In addition, for the people feature, we took a more targeted approach, using a real-time face detector and an upper-body detector, both running on every frame. Our best-performing submission, C_OXVGG_1_1, which used a rank fusion of our random forest and SVM approaches (one possible fusion rule is sketched below), achieved an mAP of 0.101 and was above the median for all but one feature.
In the interactive search task, our team came third overall with an mAP of 0.158. The system was identical to last year's, the only change being a source of accurate upper-body detections. |
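As an illustration of the two classification routes described in the abstract, the minimal Python/scikit-learn sketch below pairs an SVM trained on a linear combination of precomputed kernels with a random forest trained on the raw features. The descriptors are random placeholders, and the kernel types (chi-squared and RBF) and equal weights are assumptions not stated in this record.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics.pairwise import chi2_kernel, rbf_kernel

# Placeholder per-shot visual descriptors and binary concept labels
# (random data; the actual features and annotation source are not
# given in this record).
rng = np.random.default_rng(0)
X_train = rng.random((200, 128))
y_train = rng.integers(0, 2, 200)
X_test = rng.random((50, 128))

# SVM route: linearly combine precomputed kernels (assumed chi-squared
# and RBF, equal weights) and train on the combined Gram matrix.
weights = [0.5, 0.5]
K_train = (weights[0] * chi2_kernel(X_train, X_train)
           + weights[1] * rbf_kernel(X_train, X_train))
svm = SVC(kernel="precomputed", probability=True).fit(K_train, y_train)

K_test = (weights[0] * chi2_kernel(X_test, X_train)
          + weights[1] * rbf_kernel(X_test, X_train))
svm_scores = svm.predict_proba(K_test)[:, 1]

# Random forest route: trained directly on the raw features, which is
# much faster to train and test than the kernel SVM.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf_scores = rf.fit(X_train, y_train).predict_proba(X_test)[:, 1]
```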
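The exact rank-fusion rule used for run C_OXVGG_1_1 is not detailed in this record; one simple assumed form, which fuses two per-shot score lists by averaging their ranks, would look like this:

```python
import numpy as np
from scipy.stats import rankdata

def rank_fusion(score_lists):
    """Average per-shot ranks across classifiers and return shot indices
    ordered from best to worst (lowest mean rank first)."""
    # Higher score = better, so rank the negated scores (rank 1 is best).
    ranks = [rankdata(-np.asarray(s)) for s in score_lists]
    return np.argsort(np.mean(ranks, axis=0))

# Fuse the hypothetical SVM and random-forest scores for the same shots.
order = rank_fusion([[0.9, 0.2, 0.7], [0.6, 0.1, 0.8]])
print(order)  # -> [0 2 1]
```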
format | Conference item |
id | oxford-uuid:c2d78de7-b571-4f8d-8351-05b18b9b4809 |
institution | University of Oxford |
language | English |
publishDate | 2008 |
publisher | National Institute of Standards and Technology |
record_format | dspace |
title | Oxford/IIIT TRECVID 2008 – notebook paper |