Enhancing SVM for survival data using local invariances and weighting
- dc.contributor.author Sanz, Hector
- dc.contributor.author Reverter, Ferran
- dc.contributor.author Valim, Clarissa
- dc.date.accessioned 2020-10-08T06:20:59Z
- dc.date.available 2020-10-08T06:20:59Z
- dc.date.issued 2020
- dc.description.abstract Background: The need to analyze medium-throughput data in epidemiological studies with small sample sizes, particularly when studying biomedical data, may hinder the use of classical statistical methods. Support vector machine (SVM) models can be successfully applied in this setting because they are a powerful tool for analyzing data with a large number of predictors and a limited sample size, especially when handling binary outcomes. However, biomedical research often involves the analysis of time-to-event outcomes and has to account for censoring. Methods to handle censored data in the SVM framework can be divided into two classes: those based on support vector regression (SVR) and those based on binary classification. Methods based on SVR seem to be suboptimal for handling sparse data and yield results comparable to the Cox proportional hazards model and kernel Cox regression. The limited work dedicated to assessing methods based on SVM for binary classification has relied on SVM learning using privileged information and SVM with uncertain classes. Results: This paper proposes alternative methods and extensions within the binary classification framework, specifically a conditional survival approach for weighting censored observations and a semi-supervised SVM with local invariances. Using simulation studies and some real datasets, we evaluate those two methods and compare them with a weighted SVM model, SVM extensions found in the literature, kernel Cox regression and the Cox model. Conclusions: Our proposed methods generally perform better under a wide variety of realistic scenarios about the structure of biomedical data. Specifically, the local invariances method using the conditional survival approach is the most robust method under different scenarios and is a good approach to consider as an alternative to other time-to-event methods.
When analysing real data, it is a method to be considered and recommended, since it outperforms other methods in proportional and non-proportional scenarios and with sparse data, which is common in biomedical data and biomarker analysis.
- dc.format.mimetype application/pdf
- dc.identifier.citation Sanz H, Reverter F, Valim C. Enhancing SVM for survival data using local invariances and weighting. BMC Bioinformatics. 2020 May 19; 21(1): 193. DOI:10.1186/s12859-020-3481-2
- dc.identifier.doi http://dx.doi.org/10.1186/s12859-020-3481-2
- dc.identifier.issn 1471-2105
- dc.identifier.uri http://hdl.handle.net/10230/45427
- dc.language.iso eng
- dc.publisher BioMed Central
- dc.relation.ispartof BMC Bioinformatics. 2020 May 19;21(1):193
- dc.rights © Hector Sanz, Ferran Reverter, Clarissa Valim. 2020, corrected publication 2020. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri https://creativecommons.org/licenses/by/4.0/
- dc.subject.other Support vector machines
- dc.subject.other Health sciences -- Research
- dc.subject.other Biochemical markers -- Analysis
- dc.title Enhancing SVM for survival data using local invariances and weighting
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion