Machine Learning based on Probabilistic Models Applied to Medical Data: The Case of Prostate Cancer

The growth in the amount of data in companies puts analysts in difficulties when extracting hidden knowledge from data. Several models have emerged that focus on the notion of distances while ignoring the notion of conditional probability density. This research study focuses on segmentation using mi...

Full description

Bibliographic Details
Main Authors: Anaclet Tshikutu Bikengela, Remy Mutapay Tshimona, Pierre Kafunda Katalay, Simon Ntumba Badibanga, Eugène Mbuyi Mukendi
Format: Article
Language:English
Published: Pusat Penelitian dan Pengabdian Masyarakat (P3M), Politeknik Negeri Cilacap 2023-12-01
Series:Journal of Innovation Information Technology and Application
Subjects:
Online Access:https://ejournal.pnc.ac.id/index.php/jinita/article/view/1879
Description
Summary:The growth in the amount of data in companies puts analysts in difficulties when extracting hidden knowledge from data. Several models have emerged that focus on the notion of distances while ignoring the notion of conditional probability density. This research study focuses on segmentation using mixture models and Bayesian networks for medical data mining. As enterprise data becomes large, there is a way to apply data mining methods to make sense of it using classification methods. We designed different models with different architectures and then applied these models to the medical database. The algorithms were implemented for the real data. The objective is to classify individuals according to the conditional probability density of random variables, in addition to identifying causalities between traits from tests of conditional independence and a correlation measure, both based on χ2. After a quick illustration of several models (decision tree, SVM, K-means, Bayes), we applied our method to data from an epidemiological study (done at the University of Kinshasa University clinics) of case-control of prostate cancer. Thus, we found after interpretation of the results followed by discussion that our model allows us to classify a new individual with an accuracy of 96%.
ISSN:2716-0858
2715-9248