Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset
Background: Chronic kidney disease (CKD), characterized by progressive loss of renal function, is becoming a growing problem in the general population. New analytical technologies such as “omics”-based approaches, including metabolomics, provide a useful platform for biomarker discovery and improvem...
Main Authors: | Hadi Raeisi shahraki, Shiva Kalantari, Mohsen Nafar |
---|---|
Format: | Article |
Language: | English |
Published: | Kerman University of Medical Sciences, 2019-05-01 |
Series: | Journal of Kerman University of Medical Sciences |
Subjects: | chronic kidney disease; classification; high dimensional data; knn; scad |
Online Access: | https://jkmu.kmu.ac.ir/article_89501_c8a72792051ecc279653033c790bbde7.pdf |
_version_ | 1797790777573638144 |
---|---|
author | Hadi Raeisi shahraki; Shiva Kalantari; Mohsen Nafar |
author_facet | Hadi Raeisi shahraki; Shiva Kalantari; Mohsen Nafar |
author_sort | Hadi Raeisi shahraki |
collection | DOAJ |
description | Background: Chronic kidney disease (CKD), characterized by progressive loss of renal function, is a growing problem in the general population. New analytical technologies such as “omics”-based approaches, including metabolomics, provide a useful platform for biomarker discovery and improvement of CKD management. In metabolomics studies, prediction accuracy matters, but variable importance is also critical because the identified biomarkers reveal pathogenic metabolic processes underlying the progression of CKD. We aimed to use k-important neighbors (KIN) to analyze a high dimensional metabolomics dataset and classify patients into mild or advanced progression of CKD. Methods: Urine samples were collected from CKD patients (n=73). The patients were classified based on metabolite biomarkers into two groups: mild CKD (glomerular filtration rate (GFR) > 60 mL/min per 1·73 m²) and advanced CKD (GFR < 60 mL/min per 1·73 m²). Accordingly, 48 and 25 patients were in the mild (class 1) and advanced (class 2) groups, respectively. Recently, KIN was proposed as a novel approach to high dimensional binary classification settings. By employing a hybrid dissimilarity measure, KIN incorporates information on variables and distances simultaneously. Results: The proposed KIN not only selected a small number of biomarkers but also reached higher accuracy than traditional k-nearest neighbors (KNN; 61.2% versus 60.4%) and random forest (61.2% versus 58.5%), which are currently known as the best classifiers. Conclusion: Results on a real metabolomics dataset demonstrate the superiority of the proposed KIN over KNN in terms of both classification accuracy and variable importance. |
first_indexed | 2024-03-13T02:09:22Z |
format | Article |
id | doaj.art-aa3d0b0ef2ef4b32a02f39ba0d70ca7f |
institution | Directory Open Access Journal |
issn | 2008-2843 |
language | English |
last_indexed | 2024-03-13T02:09:22Z |
publishDate | 2019-05-01 |
publisher | Kerman University of Medical Sciences |
record_format | Article |
series | Journal of Kerman University of Medical Sciences |
spelling | doaj.art-aa3d0b0ef2ef4b32a02f39ba0d70ca7f; 2023-07-01T05:16:19Z; eng; Kerman University of Medical Sciences; Journal of Kerman University of Medical Sciences; 2008-2843; 2019-05-01; 26; 3; 207; 213; 10.22062/jkmu.2019.89501; 89501; Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset; Hadi Raeisi shahraki (0: Assistant Professor, Department of Biostatistics and Epidemiology, Faculty of Health, Shahrekord University of Medical Sciences, Shahrekord, Iran); Shiva Kalantari (1: Assistant Professor, Chronic Kidney Disease Research Center, Labbafinejad Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran); Mohsen Nafar (2: Professor, Urology and Nephrology Research Center, Labafinejad Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran); https://jkmu.kmu.ac.ir/article_89501_c8a72792051ecc279653033c790bbde7.pdf; chronic kidney disease; classification; high dimensional data; knn; scad |
spellingShingle | Hadi Raeisi shahraki; Shiva Kalantari; Mohsen Nafar; Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset; Journal of Kerman University of Medical Sciences; chronic kidney disease; classification; high dimensional data; knn; scad |
title | Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset |
title_full | Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset |
title_fullStr | Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset |
title_full_unstemmed | Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset |
title_short | Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset |
title_sort | classification of chronic kidney disease patients via k important neighbors in high dimensional metabolomics dataset |
topic | chronic kidney disease; classification; high dimensional data; knn; scad |
url | https://jkmu.kmu.ac.ir/article_89501_c8a72792051ecc279653033c790bbde7.pdf |
work_keys_str_mv | AT hadiraeisishahraki classificationofchronickidneydiseasepatientsviakimportantneighborsinhighdimensionalmetabolomicsdataset AT shivakalantari classificationofchronickidneydiseasepatientsviakimportantneighborsinhighdimensionalmetabolomicsdataset AT mohsennafar classificationofchronickidneydiseasepatientsviakimportantneighborsinhighdimensionalmetabolomicsdataset |
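
The abstract describes KIN as using a hybrid dissimilarity measure that combines information about variables with distances. The record does not give the measure itself, so the following is only a minimal sketch of the general idea, assuming a Euclidean distance in which each variable is scaled by an importance weight. The function names, the weights, and the toy data are hypothetical illustrations and are not taken from the article.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance with each variable scaled by an importance weight.

    Assumption: this stands in for a 'hybrid' dissimilarity; it is not the
    measure defined in the article.
    """
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def weighted_knn_predict(train_X, train_y, query, weights, k=3):
    """Majority vote among the k training samples closest to `query`."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: weighted_distance(pair[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data invented for illustration: the first variable separates the two
# classes, the second is noise on a larger scale.
train_X = [[1.0, 9.0], [1.2, 1.0], [0.9, 5.0], [3.0, 8.5], [3.2, 1.5], [2.9, 5.5]]
train_y = [1, 1, 1, 2, 2, 2]
query = [1.6, 8.5]

uniform = [1.0, 1.0]      # plain k-nearest neighbors: every variable counts equally
informative = [1.0, 0.0]  # hypothetical weights that ignore the noisy variable

print(weighted_knn_predict(train_X, train_y, query, uniform))
print(weighted_knn_predict(train_X, train_y, query, informative))
```

With uniform weights the noisy second variable dominates the distance and pulls class 2 samples into the neighborhood; down-weighting it lets the informative variable decide the vote. How KIN actually derives its weights is described in the article, not in this record.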