Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset

Background: Chronic kidney disease (CKD), characterized by progressive loss of renal function, is becoming a growing problem in the general population. New analytical technologies such as “omics”-based approaches, including metabolomics, provide a useful platform for biomarker discovery and improvem...


Bibliographic Details
Main Authors: Hadi Raeisi shahraki, Shiva Kalantari, Mohsen Nafar
Format: Article
Language: English
Published: Kerman University of Medical Sciences 2019-05-01
Series: Journal of Kerman University of Medical Sciences
Subjects: chronic kidney disease; classification; high dimensional data; knn; scad
Online Access: https://jkmu.kmu.ac.ir/article_89501_c8a72792051ecc279653033c790bbde7.pdf
author Hadi Raeisi shahraki
Shiva Kalantari
Mohsen Nafar
collection DOAJ
description Background: Chronic kidney disease (CKD), characterized by progressive loss of renal function, is a growing problem in the general population. New analytical technologies such as “omics”-based approaches, including metabolomics, provide a useful platform for biomarker discovery and improvement of CKD management. In metabolomics studies, prediction accuracy matters, but variable importance is also critical because the identified biomarkers reveal pathogenic metabolic processes underlying the progression of chronic kidney disease. We aimed to use k-important neighbors (KIN) to analyze a high dimensional metabolomics dataset and classify patients as having mild or advanced progression of CKD. Methods: Urine samples were collected from CKD patients (n=73). The patients were classified based on metabolite biomarkers into two groups: mild CKD (glomerular filtration rate (GFR) > 60 mL/min per 1.73 m²) and advanced CKD (GFR ≤ 60 mL/min per 1.73 m²). Accordingly, 48 and 25 patients were in the mild (class 1) and advanced (class 2) groups, respectively. KIN was recently proposed as a novel approach to high dimensional binary classification. By employing a hybrid dissimilarity measure, KIN incorporates information on variables and distances simultaneously. Results: The proposed KIN not only selected a small number of biomarkers but also reached a higher accuracy than traditional k-nearest neighbors (KNN) (61.2% versus 60.4%) and random forest (61.2% versus 58.5%), which are currently regarded as the best classifiers. Conclusion: Results on a real metabolomics dataset demonstrate the superiority of the proposed KIN over KNN in terms of both classification accuracy and variable importance.
first_indexed 2024-03-13T02:09:22Z
format Article
id doaj.art-aa3d0b0ef2ef4b32a02f39ba0d70ca7f
institution Directory Open Access Journal
issn 2008-2843
language English
last_indexed 2024-03-13T02:09:22Z
publishDate 2019-05-01
publisher Kerman University of Medical Sciences
record_format Article
series Journal of Kerman University of Medical Sciences
spelling doaj.art-aa3d0b0ef2ef4b32a02f39ba0d70ca7f
2023-07-01T05:16:19Z
eng
Kerman University of Medical Sciences
Journal of Kerman University of Medical Sciences
2008-2843
2019-05-01
26 (3): 207-213
10.22062/jkmu.2019.89501
89501
Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset
Hadi Raeisi shahraki: Assistant Professor, Department of Biostatistics and Epidemiology, Faculty of Health, Shahrekord University of Medical Sciences, Shahrekord, Iran
Shiva Kalantari: Assistant Professor, Chronic Kidney Disease Research Center, Labbafinejad Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
Mohsen Nafar: Professor, Urology and Nephrology Research Center, Labafinejad Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
https://jkmu.kmu.ac.ir/article_89501_c8a72792051ecc279653033c790bbde7.pdf
chronic kidney disease
classification
high dimensional data
knn
scad
title Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset
topic chronic kidney disease
classification
high dimensional data
knn
scad
url https://jkmu.kmu.ac.ir/article_89501_c8a72792051ecc279653033c790bbde7.pdf