Summary: | Computer vision (CV) is an emerging area with diverse promising applications. This communication covers the past development, recent trends and future directions of CV in the context of deep learning (DL)-based object detection and localization techniques. Machine learning and DL techniques have evolved so that a computer program can identify the location of an object inside an image and recognize it as fast as the human brain does. However, the main limitation of the machine is the long time it needs to handle the vast amount of data required to perform the same task as the human brain. To overcome this shortcoming, deep neural networks (NNs) based on convolutional NNs (CNNs) have been developed, which detect and classify objects with high precision. Training deep NNs requires a massive amount of data (in the form of images and videos) and time, making the computational cost of CV very high. Thus, transfer learning techniques have been proposed, wherein a model trained on one task is reused on another related task, reducing the training data and time needed while still producing excellent outcomes. In this spirit, diverse DL-based algorithms have been introduced to detect and classify objects. These algorithms include the region-based convolutional NN (R-CNN), Fast R-CNN, Faster R-CNN, Mask R-CNN and You Only Look Once (YOLO). A comparative evaluation of these techniques is made to reveal their merits and demerits in CV.
|