Summary: | Novel View Synthesis is the task of synthesizing an image of an object from pose angles other than the input pose. Humans can visualize objects in new poses through imagination. This paper proposes a deep learning based solution named SynthNet for this task. It identifies a research gap: none of the existing deep learning models for Novel View Synthesis employ depthwise separable convolution in their design. SynthNet uses depthwise separable convolution to reduce the trainable parameter count and make the model computationally efficient. Identity gated skip connections, attention gated skip connections and flow attention gated skip connections are designed and employed to mitigate the vanishing gradient problem. A total of eight variants of SynthNet are designed, built, trained and tested. SynthNet is trained on the car and chair object classes of the ShapeNet dataset. The trained model successfully generates a novel view of an object at the desired target view angle from a given input image with an unspecified view angle. The performance metrics used to evaluate SynthNet are MAE and SSIM; across its variants, performance is on par with or better than the baseline papers. A notable achievement of the proposed design is the reduction in trainable parameter count, which indicates a reduction in computational steps: SynthNet achieves a 40% reduction in trainable parameters compared with existing models due to its use of depthwise separable convolution. In short, SynthNet achieves better performance than its peers while also achieving computational efficiency.
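The parameter savings from depthwise separable convolution can be sketched with a simple count. The channel and kernel sizes below are illustrative assumptions, not the actual SynthNet layer configuration, and bias terms are omitted for clarity:

```python
# Sketch: parameter-count comparison between a standard convolution and a
# depthwise separable convolution (depthwise + pointwise), biases omitted.

def standard_conv_params(c_in, c_out, k):
    # A standard conv learns one k x k filter per (input, output) channel pair.
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise step: one k x k spatial filter per input channel.
    # Pointwise step: a 1 x 1 convolution that mixes channels.
    return c_in * k * k + c_in * c_out

if __name__ == "__main__":
    c_in, c_out, k = 128, 256, 3  # illustrative layer sizes
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, saving: {1 - sep / std:.1%}")
    # For these sizes the separable layer uses roughly a tenth of the
    # parameters; the overall 40% model-level reduction reported in the
    # paper depends on how many layers use the separable form.
```

The per-layer saving grows with the kernel size and output channel count, which is why replacing standard convolutions throughout an encoder-decoder network can shrink the trainable parameter count substantially.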
|