Enhancing SVM for survival data using local invariances and weighting

Abstract Background: The necessity to analyze medium-throughput data in epidemiological studies with small sample size, particularly when studying biomedical data, may hinder the use of classical statistical methods. Support vector machine (SVM) models can be successfully applied in this setting beca...

Full description

Bibliographic Details
Main Authors: Hector Sanz, Ferran Reverter, Clarissa Valim
Format: Article
Language: English
Published: BMC 2020-05-01
Series: BMC Bioinformatics
Subjects: Support vector machines; Survival analysis; Kernel; Classification
Online Access: http://link.springer.com/article/10.1186/s12859-020-3481-2
_version_ 1818524451408445440
author Hector Sanz
Ferran Reverter
Clarissa Valim
collection DOAJ
description Abstract Background: The necessity to analyze medium-throughput data in epidemiological studies with small sample size, particularly when studying biomedical data, may hinder the use of classical statistical methods. Support vector machine (SVM) models can be successfully applied in this setting because they are a powerful tool for analyzing data with a large number of predictors and limited sample size, especially when handling binary outcomes. However, biomedical research often involves analysis of time-to-event outcomes and has to account for censoring. Methods to handle censored data in the SVM framework can be divided into two classes: those based on support vector regression (SVR) and those based on binary classification. Methods based on SVR seem to be suboptimal for handling sparse data and yield results comparable to the Cox proportional hazards model and kernel Cox regression. The limited work dedicated to assessing SVM-based methods for binary classification has relied on SVM learning using privileged information and SVM with uncertain classes. Results: This paper proposes alternative methods and extensions within the binary classification framework, specifically, a conditional survival approach for weighting censored observations and a semi-supervised SVM with local invariances. Using simulation studies and some real datasets, we evaluate those two methods and compare them with a weighted SVM model, SVM extensions found in the literature, kernel Cox regression and the Cox model. Conclusions: Our proposed methods generally perform better under a wide variety of realistic scenarios for the structure of biomedical data. Specifically, the local invariances method using the conditional survival approach is the most robust method under different scenarios and is a good alternative to other time-to-event methods. When analysing real data, it is a method to be considered and recommended, since it outperforms other methods in proportional and non-proportional scenarios and with sparse data, which are common in biomedical data and biomarker analysis.
first_indexed 2024-12-11T05:57:15Z
format Article
id doaj.art-5644982f07cd4e299140a0714d5bc3fb
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-11T05:57:15Z
publishDate 2020-05-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-5644982f07cd4e299140a0714d5bc3fb
2022-12-22T01:18:37Z
eng
BMC
BMC Bioinformatics
1471-2105
2020-05-01
211120
10.1186/s12859-020-3481-2
Enhancing SVM for survival data using local invariances and weighting
Hector Sanz (Department of Genetics, Microbiology and Statistics, Faculty of Biology, Universitat de Barcelona)
Ferran Reverter (Department of Genetics, Microbiology and Statistics, Faculty of Biology, Universitat de Barcelona)
Clarissa Valim (Department of Global Health, Boston University)
http://link.springer.com/article/10.1186/s12859-020-3481-2
Support vector machines
Survival analysis
Kernel
Classification
title Enhancing SVM for survival data using local invariances and weighting
topic Support vector machines
Survival analysis
Kernel
Classification
url http://link.springer.com/article/10.1186/s12859-020-3481-2