Summary: | Semantic image segmentation is a problem of simultaneous segmentation and recognition of an input image into regions and their associated categorical labels, such as person, car or cow. A popular way to achieve this goal is to assign a label to every pixel in the input image and impose simple structural constraints on the output label space. Efficient approximation algorithms for solving this labelling problem such as a-expansion have, at best, linear runtime complexity with respect to the number of labels, making them practical only when working in a specific domain that has few classes-of-interest. However when working in a more general setting where the number of classes could easily reach tens of thousands, sub-linear complexity is desired. In this paper we propose meeting this requirement by performing cascaded inference that wraps around the a-expansion algorithm. The cascade both divides the large label set into smaller more manageable ones by way of a hierarchy, and dynamically subdivides the image into smaller and smaller regions during inference. We test our method on the SUN09 dataset with 107 accurately hand labelled classes.
|