Lifelong Machine Learning for Regional-Based Image Classification in Open Datasets

Deep Learning algorithms are becoming common in solving different supervised and unsupervised learning problems. Different deep learning algorithms were developed in last decade to solve different learning problems in different domains such as computer vision, speech recognition, machine translation...

Full description

Bibliographic Details
Main Authors: Hashem Alyami, Abdullah Alharbi, Irfan Uddin
Format: Article
Language:English
Published: MDPI AG 2020-12-01
Series:Symmetry
Subjects:
Online Access:https://www.mdpi.com/2073-8994/12/12/2094
Description
Summary:Deep Learning algorithms are becoming common in solving different supervised and unsupervised learning problems. Different deep learning algorithms were developed in last decade to solve different learning problems in different domains such as computer vision, speech recognition, machine translation, etc. In the research field of computer vision, it is observed that deep learning has become overwhelmingly popular. In solving computer vision related problems, we first take a CNN (Convolutional Neural Network) which is trained from scratch or some times a pre-trained model is taken and further fine-tuned based on the dataset that is available. The problem of training the model from scratch on new datasets suffers from <i>catastrophic forgetting</i>. Which means that when a new dataset is used to train the model, it forgets the knowledge it has obtained from an existing dataset. In other words different datasets does not help the model to increase its knowledge. The problem with the pre-trained models is that mostly CNN models are trained on open datasets, where the data set contains instances from specific regions. This results into predicting disturbing labels when the same model is used for instances of datasets collected in a different region. Therefore, there is a need to find a solution on how to reduce the gap of Geo-diversity in different computer vision problems in developing world. In this paper, we explore the problems of models that were trained from scratch along with models which are pre-trained on a large dataset, using a dataset specifically developed to understand the geo-diversity issues in open datasets. The dataset contains images of different wedding scenarios in South Asian countries. We developed a Lifelong CNN that can incrementally increase knowledge i.e., the CNN learns labels from the new dataset but includes the existing knowledge of open data sets. The proposed model demonstrates highest accuracy compared to models trained from scratch or pre-trained model.
ISSN:2073-8994