MDR-ER: balancing functions for adjusting the ratio in risk classes and classification errors for imbalanced cases and controls using multifactor-dimensionality reduction.

Bibliographic Details
Main Authors: Cheng-Hong Yang, Yu-Da Lin, Li-Yeh Chuang, Jin-Bor Chen, Hsueh-Wei Chang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3827354?pdf=render
collection DOAJ
description BACKGROUND: Determining the complex relationship between diseases, polymorphisms in human genes and environmental factors is challenging. Multifactor dimensionality reduction (MDR) has proven capable of effectively detecting statistical patterns of epistasis. However, MDR is weak at accurately assigning multi-locus genotypes to either high-risk or low-risk groups, and generally does not provide accurate error rates when the case and control data sets are imbalanced. Consequently, classification error rates and odds ratios (OR) may take surprising values, because the true positive (TP) value is often small. METHODOLOGY/PRINCIPAL FINDINGS: To address this problem, we introduce a classifier function based on the ratio between the percentage of cases in case data and the percentage of controls in control data to improve MDR (MDR-ER), so that multi-locus genotypes are classified correctly into high-risk and low-risk groups. In this study, a real data set with different ratios of cases to controls (1:4) was obtained from the mitochondrial D-loop of chronic dialysis patients in order to test MDR-ER. The TP and true negative (TN) values were collected from all tests to analyze to what degree MDR-ER performed better than MDR. CONCLUSIONS/SIGNIFICANCE: Results showed that MDR-ER can be successfully used to detect complex associations in imbalanced data sets.
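The classifier function mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `classify_cells`, the toy counts, the tie rule (ties count as high-risk) and the plain-count comparison used as the "classic MDR" baseline are all assumptions made for illustration. The sketch only shows the idea stated above, namely comparing the share of all cases that fall in a genotype cell with the share of all controls that fall in it.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def classify_cells(
    genotypes: Sequence[Hashable],
    labels: Sequence[int],
    adjusted: bool = True,
) -> Dict[Hashable, str]:
    """Label each multi-locus genotype cell as 'high' or 'low' risk.

    labels: 1 = case, 0 = control.
    adjusted=True  compares (cases in cell / all cases) with
                   (controls in cell / all controls), as the abstract describes.
    adjusted=False compares raw counts (assumed baseline, threshold of 1).
    """
    cases = Counter(g for g, y in zip(genotypes, labels) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, labels) if y == 0)
    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")

    risk = {}
    for cell in set(cases) | set(controls):
        if adjusted:
            case_side = cases[cell] / n_cases
            control_side = controls[cell] / n_controls
        else:
            case_side = cases[cell]
            control_side = controls[cell]
        # Assumption: ties are assigned to the high-risk group.
        risk[cell] = "high" if case_side >= control_side else "low"
    return risk


# Hypothetical toy data with a 1:4 case-to-control ratio (10 cases, 40 controls).
genotypes = ["A"] * 6 + ["B"] * 4 + ["A"] * 12 + ["B"] * 28
labels = [1] * 10 + [0] * 40

raw = classify_cells(genotypes, labels, adjusted=False)
balanced = classify_cells(genotypes, labels, adjusted=True)
print(sorted(raw.items()))
print(sorted(balanced.items()))
```

In this toy data, controls outnumber cases in every cell, so the raw-count comparison marks no cell as high-risk and the TP count is zero. The percentage-based comparison marks cell "A" as high-risk, because it holds 60% of the cases but only 30% of the controls.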
first_indexed 2024-04-12T23:10:34Z
format Article
id doaj.art-72993a178a554244a12d66dfeaa9c7ad
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-04-12T23:10:34Z
spelling doaj.art-72993a178a554244a12d66dfeaa9c7ad 2022-12-22T03:12:48Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2013-01-01 8 11 e79387 10.1371/journal.pone.0079387