Summary: | Detecting bias in data is crucial because biased data can cause serious problems when developing an AI algorithm. This research proposes a novel study design for detecting bias in image classification data by using pretrained Convolutional Neural Network (CNN) layers as a feature extractor. Three datasets with varying degrees of complexity (low, medium, and high) are used: Modified National Institute of Standards and Technology (MNIST) Digits, batik collections (Parang, Megamendung, and Kawung), and the Canadian Institute for Advanced Research (CIFAR-10) dataset. The researchers build a baseline workflow and then replace its feature-extraction step with a convolution that uses the first pretrained CNN layer and each of its kernels individually. The effect of each substitution is evaluated using accuracy. Observing the effect of individual kernels helps make sense of what happens inside a CNN layer. The research finds that image color is an essential factor when working with CNNs. Furthermore, the proposed study design can detect bias in image classification data when that bias is related to image color. Detecting this bias early helps developers improve AI algorithms.
|
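The per-kernel experiment described in the summary can be illustrated with a minimal sketch. It is not the researchers' implementation, and several parts are assumptions made only for illustration: the two hand-written 3x3x3 kernels stand in for kernels that would be taken from the first layer of a pretrained CNN, the synthetic two-class images (which differ only in color, not in brightness) stand in for a real dataset, and a nearest-centroid classifier stands in for whatever classifier the baseline workflow uses. All function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def apply_kernel(images, kernel):
    """Valid cross-correlation of images (N, H, W, C) with one kernel (kh, kw, C).

    Returns one flattened feature map per image, shape (N, H' * W').
    """
    # windows has shape (N, H', W', 1, kh, kw, C)
    windows = sliding_window_view(images, kernel.shape, axis=(1, 2, 3))
    maps = (windows * kernel).sum(axis=(-3, -2, -1))[..., 0]
    return maps.reshape(len(images), -1)


def nearest_centroid_accuracy(train_x, train_y, test_x, test_y):
    """Stand-in classifier (assumption): assign each sample to the closest class mean."""
    classes = np.unique(train_y)
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in classes])
    dists = ((test_x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    predictions = classes[dists.argmin(axis=1)]
    return float((predictions == test_y).mean())


def make_color_data(n_per_class, rng, size=8):
    """Synthetic images (assumption): the classes differ in color, not brightness."""
    base = np.array([[0.8, 0.2, 0.5],   # class 0: reddish
                     [0.2, 0.8, 0.5]])  # class 1: greenish, same channel sum
    labels = np.repeat([0, 1], n_per_class)
    noise = rng.normal(0.0, 0.05, (2 * n_per_class, size, size, 3))
    images = base[labels][:, None, None, :] + noise
    return images, labels


def kernel_accuracies(seed=0):
    """Evaluate each kernel on its own and report the accuracy per kernel."""
    rng = np.random.default_rng(seed)
    train_x, train_y = make_color_data(100, rng)
    test_x, test_y = make_color_data(50, rng)

    # Hypothetical kernels; in the study these would come from a pretrained layer.
    color_blind = np.ones((3, 3, 3)) / 27.0        # averages all channels equally
    color_sensitive = np.zeros((3, 3, 3))
    color_sensitive[..., 0] = 1.0                  # responds positively to red
    color_sensitive[..., 1] = -1.0                 # responds negatively to green
    kernels = {"color_blind": color_blind, "color_sensitive": color_sensitive}

    return {
        name: nearest_centroid_accuracy(
            apply_kernel(train_x, k), train_y, apply_kernel(test_x, k), test_y
        )
        for name, k in kernels.items()
    }


accuracies = kernel_accuracies()
for name, acc in accuracies.items():
    print(f"{name}: accuracy = {acc:.2f}")
```

In this toy setting, a large accuracy gap between a color-sensitive kernel and a color-blind one indicates that the labels can be predicted from color alone, which is the kind of color-related bias the study design is meant to surface.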