A predicted-loss based active learning approach for robust cancer pathology image analysis in the workplace

Abstract Background Convolutional neural network-based image processing research is actively being conducted for pathology image analysis. As a convolutional neural network model requires a large amount of image data for training, active learning (AL) has been developed to produce efficient learning...


Bibliographic Details
Main Authors:	Mujin Kim, Willmer Rafell Quiñones Robles, Young Sin Ko, Bryan Wong, Sol Lee, Mun Yong Yi
Format:	Article
Language:	English
Published:	BMC 2024-01-01
Series:	BMC Medical Imaging
Subjects:	Active learning strategy; Noisy data; Cancer pathology images; Convolutional neural networks; Deep learning; Histopathology image analysis
Online Access:	https://doi.org/10.1186/s12880-023-01170-8

author	Mujin Kim, Willmer Rafell Quiñones Robles, Young Sin Ko, Bryan Wong, Sol Lee, Mun Yong Yi
author_sort	Mujin Kim
collection	DOAJ
description	Abstract
Background: Convolutional neural network-based image processing research is actively being conducted for pathology image analysis. As a convolutional neural network model requires a large amount of image data for training, active learning (AL) has been developed to produce efficient learning with a small amount of training data. However, existing studies have not specifically considered the characteristics of pathological data collected from the workplace. For various reasons, noisy patches can be selected instead of clean patches during AL, thereby reducing its efficiency. This study proposes an effective AL method for cancer pathology that works robustly on noisy datasets.
Methods: Our proposed method to develop a robust AL approach for noisy histopathology datasets consists of the following three steps: 1) training a loss prediction module, 2) collecting predicted loss values, and 3) sampling data for labeling. This proposed method calculates the amount of information in unlabeled data as predicted loss values and removes noisy data based on predicted loss values to reduce the rate at which noisy data are selected from the unlabeled dataset. We identified a suitable threshold for optimizing the efficiency of AL through sensitivity analysis.
Results: We compared the results obtained with the identified threshold with those of existing representative AL methods. In the final iteration, the proposed method achieved a performance of 91.7% on the noisy dataset and 92.4% on the clean dataset, resulting in a performance reduction of less than 1%. Concomitantly, the noise selection ratio averaged only 2.93% on each iteration.
Conclusions: The proposed AL method showed robust performance on datasets containing noisy data by avoiding data selection in predictive loss intervals where noisy data are likely to be distributed. The proposed method contributes to medical image analysis by screening data and producing a robust and effective classification model tailored for cancer pathology image processing in the workplace.
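The sampling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how the threshold is applied, so the sketch assumes that unusually high predicted loss marks likely noise, that a single upper cutoff is used, and that the remaining items are ranked by predicted loss. The function name `select_for_labeling`, the parameter `noise_cutoff`, and the sample values are all hypothetical.

```python
def select_for_labeling(predicted_loss, budget, noise_cutoff):
    """Return up to `budget` pool indices to send for labeling.

    Assumption (not stated in the record): items whose predicted loss
    exceeds `noise_cutoff` are treated as likely noise and skipped.
    """
    # Keep only pool items whose predicted loss is at or below the cutoff.
    candidates = [i for i, loss in enumerate(predicted_loss) if loss <= noise_cutoff]
    # Rank the remaining items by predicted loss, most informative first.
    candidates.sort(key=lambda i: predicted_loss[i], reverse=True)
    return candidates[:budget]


# Hypothetical predicted losses for five unlabeled patches.
pool_losses = [0.1, 0.5, 0.9, 5.0, 0.7]
selected = select_for_labeling(pool_losses, budget=2, noise_cutoff=2.0)
print(selected)  # [2, 4]
```

In this toy pool, the patch with predicted loss 5.0 is excluded as likely noise, and the two highest remaining losses (0.9 and 0.7) are chosen. In the study itself the threshold was identified through sensitivity analysis; the value 2.0 here is only a placeholder.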
first_indexed	2024-03-08T16:11:12Z
format	Article
id	doaj.art-3d6ed6f9465845ff8438ece192681dde
institution	Directory Open Access Journal
issn	1471-2342
language	English
last_indexed	2024-03-08T16:11:12Z
publishDate	2024-01-01
publisher	BMC
record_format	Article
series	BMC Medical Imaging
spelling	doaj.art-3d6ed6f9465845ff8438ece192681dde; 2024-01-07T12:54:18Z; eng; BMC; BMC Medical Imaging; 1471-2342; 2024-01-01; 241118; 10.1186/s12880-023-01170-8
Author affiliations:
0. Mujin Kim: Graduate School of Data Science, Department of Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology
1. Willmer Rafell Quiñones Robles: Graduate School of Data Science, Department of Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology
2. Young Sin Ko: Pathology Center, Seegene Medical Foundation
3. Bryan Wong: Graduate School of Data Science, Department of Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology
4. Sol Lee: Graduate School of Data Science, Department of Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology
5. Mun Yong Yi: Graduate School of Data Science, Department of Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology
title	A predicted-loss based active learning approach for robust cancer pathology image analysis in the workplace
topic	Active learning strategy; Noisy data; Cancer pathology images; Convolutional neural networks; Deep learning; Histopathology image analysis
url	https://doi.org/10.1186/s12880-023-01170-8

