Bio-Imaging-Based Machine Learning Algorithm for Breast Cancer Detection
Breast cancer is one of the most widespread diseases in women worldwide. It accounts for the second-highest mortality rate in women, especially in European countries. It occurs when malignant (cancerous) lumps start to grow in the breast cells. Accurate and early diagnosis can help in increasing...
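Main Authors: | Sadia Safdar, Muhammad Rizwan, Thippa Reddy Gadekallu, Abdul Rehman Javed, Mohammad Khalid Imam Rahmani, Khurram Jawad, Surbhi Bhatia |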
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-05-01 |
Series: | Diagnostics |
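Subjects: | breast cancer, computer-aided detection (CAD), support vector machine (SVM), K-nearest neighbor (KNN), machine learning, deep learning |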
Online Access: | https://www.mdpi.com/2075-4418/12/5/1134 |
_version_ | 1827669427846381568 |
---|---|
author | Sadia Safdar, Muhammad Rizwan, Thippa Reddy Gadekallu, Abdul Rehman Javed, Mohammad Khalid Imam Rahmani, Khurram Jawad, Surbhi Bhatia |
author_sort | Sadia Safdar |
collection | DOAJ |
description | Breast cancer is one of the most widespread diseases in women worldwide. It accounts for the second-highest mortality rate in women, especially in European countries. It occurs when malignant (cancerous) lumps start to grow in the breast cells. Accurate and early diagnosis can help increase survival rates. A computer-aided detection (CAD) system is necessary for radiologists to differentiate between normal and abnormal cell growth. This research consists of two parts. The first part gives a brief overview of the different imaging modalities, such as ultrasound, histography, and mammography, drawing on a wide range of research databases to source the relevant publications. The second part evaluates different machine learning techniques used to estimate breast cancer recurrence rates. The first step is preprocessing, which includes eliminating missing values, removing data noise, and transforming the data. The dataset is divided as follows: 60% is used for training and the remaining 40% for testing. We focus on minimizing type I (false-positive rate, FPR) and type II (false-negative rate, FNR) errors to improve accuracy and sensitivity. Our proposed model uses machine learning techniques such as support vector machine (SVM), logistic regression (LR), and K-nearest neighbor (KNN) to achieve better accuracy in breast cancer classification. We attain a highest accuracy of 97.7% with 0.01 FPR, 0.03 FNR, and an area under the ROC curve (AUC) score of 0.99. The results show that our proposed model successfully classifies breast tumors while overcoming previous research limitations. Finally, we conclude the paper with future trends and challenges in classification and segmentation for breast cancer detection. |
first_indexed | 2024-03-10T03:03:07Z |
format | Article |
id | doaj.art-aa97bdd16f4547a7b50e52cefe468000 |
institution | Directory Open Access Journal |
issn | 2075-4418 |
language | English |
last_indexed | 2024-03-10T03:03:07Z |
publishDate | 2022-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Diagnostics |
spelling | doaj.art-aa97bdd16f4547a7b50e52cefe468000; 2023-11-23T10:39:51Z; eng; MDPI AG; Diagnostics; 2075-4418; 2022-05-01; 12; 5; 1134; 10.3390/diagnostics12051134; Bio-Imaging-Based Machine Learning Algorithm for Breast Cancer Detection; Sadia Safdar (Department of Computer Science, Kinnaird College for Women, Lahore 44000, Pakistan); Muhammad Rizwan (Department of Computer Science, Kinnaird College for Women, Lahore 44000, Pakistan); Thippa Reddy Gadekallu (School of Information Technology and Engineering, Vellore Institute of Technology, Vellore 632014, India); Abdul Rehman Javed (Department of Cyber Security, Air University, Islamabad 44000, Pakistan); Mohammad Khalid Imam Rahmani (College of Computing and Informatics, Saudi Electronic University, Riyadh 11673, Saudi Arabia); Khurram Jawad (College of Computing and Informatics, Saudi Electronic University, Riyadh 11673, Saudi Arabia); Surbhi Bhatia (Department of Information Systems, College of Computer Science & Information Technology, King Faisal University, Hofuf 31982, Saudi Arabia); https://www.mdpi.com/2075-4418/12/5/1134; breast cancer; computer-aided detection (CAD); support vector machine (SVM); K-nearest neighbor (KNN); machine learning; deep learning |
title | Bio-Imaging-Based Machine Learning Algorithm for Breast Cancer Detection |
title_sort | bio imaging based machine learning algorithm for breast cancer detection |
topic | breast cancer, computer-aided detection (CAD), support vector machine (SVM), K-nearest neighbor (KNN), machine learning, deep learning |
url | https://www.mdpi.com/2075-4418/12/5/1134 |
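
The description in this record mentions a 60%/40% train/test split and reports accuracy, FPR, and FNR. As a minimal sketch of how those quantities are conventionally defined (not the authors' implementation), the Python below uses only the standard library. The function names `split_60_40`, `confusion_counts`, and `error_rates` are hypothetical, the shuffle-before-split step is an assumption the record does not state, and the labels in the demo are made up for illustration rather than taken from the paper's data.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_60_40(items: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Shuffle a copy of the items, then use 60% for training and 40% for testing.

    Shuffling with a fixed seed is an assumption made here for reproducibility.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.6 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as the positive (malignant) class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1  # type I error: benign case flagged as malignant
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1  # type II error: malignant case missed
    return tp, fp, tn, fn


def error_rates(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, FPR, FNR, and sensitivity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    negatives = fp + tn
    positives = tp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "fpr": fp / negatives if negatives else 0.0,
        "fnr": fn / positives if positives else 0.0,
        "sensitivity": tp / positives if positives else 0.0,
    }


if __name__ == "__main__":
    # Made-up labels for illustration only; not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(error_rates(*confusion_counts(y_true, y_pred)))
```

Note that sensitivity equals 1 minus FNR, which is why lowering the type II error rate directly raises sensitivity.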