Summary: | Video summarization (VS) aims to identify the important content in a given video so that users can quickly comprehend it. Recently, sparse dictionary selection (SDS) has been demonstrated to be an effective solution for VS problems; it generally assumes a linear relationship between keyframes and non-keyframes. However, this assumption does not always hold for video frames, which possess intrinsic nonlinear structures and properties. In this paper, by exploiting the nonlinearity between video frames, we formulate a nonlinear SDS model for VS, in which the nonlinearity is transformed into linearity by projecting a video into a high-dimensional feature space induced by a kernel function. We also propose two greedy optimization algorithms to solve the resulting model, namely the standard kernel SDS (KSDS) greedy algorithm and the robust KSDS greedy algorithm with a backtracking strategy. To allow an intuitive and flexible configuration of the VS process, an adaptive criterion, namely the energy ratio, is devised to produce video summaries of different lengths for different video contents. Experimental results on two benchmark video datasets demonstrate that the proposed algorithm outperforms several state-of-the-art VS algorithms.
|