Interval Coded Scoring: a toolbox for interpretable scoring systems

Over the last decades, clinical decision support systems have been gaining importance. They help clinicians to make effective use of the overload of available information to obtain correct diagnoses and appropriate treatments. However, their power often comes at the cost of a black box model which c...

Bibliographic Details
Main Authors: Lieven Billiet, Sabine Van Huffel, Vanya Van Belle
Format: Article
Language: English
Published: PeerJ Inc. 2018-04-01
Series: PeerJ Computer Science
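Subjects: Decision support, Interpretability, Scoring systems, Sparse Optimization, Classification, Risk assessment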
Online Access: https://peerj.com/articles/cs-150.pdf
_version_ 1818021675491393536
author Lieven Billiet
Sabine Van Huffel
Vanya Van Belle
author_sort Lieven Billiet
collection DOAJ
description Over the last decades, clinical decision support systems have been gaining importance. They help clinicians make effective use of the overload of available information to obtain correct diagnoses and appropriate treatments. However, their power often comes at the cost of a black box model that cannot be interpreted easily. Interpretability is of paramount importance in a medical setting with regard to trust and (legal) responsibility. In contrast, existing medical scoring systems are easy to understand and use, but they are often a simplified rule-of-thumb summary of previous medical experience rather than a well-founded system based on available data. Interval Coded Scoring (ICS) connects these two approaches, exploiting the power of sparse optimization to derive scoring systems from training data. The presented toolbox interface makes this theory easily applicable to both small and large datasets. It contains two possible problem formulations, based on linear programming or elastic net. Both make it possible to construct a model for a binary classification problem and to establish risk profiles that can be used for future diagnosis. All of this requires only a few lines of code. ICS differs from standard machine learning in that its model consists of interpretable main effects and interactions. Furthermore, expert knowledge can be inserted because the training can be semi-automatic. This allows end users to make a trade-off between complexity and performance based on cross-validation results and expert knowledge. Additionally, the toolbox offers an accessible way to assess classification performance via accuracy and the ROC curve, whereas the calibration of the risk profile can be evaluated via a calibration curve. Finally, the colour-coded model visualization has particular appeal if one wants to apply ICS manually to new observations, as well as for validation by experts in the specific application domains.
The validity and applicability of the toolbox are demonstrated by comparing it to standard machine learning approaches such as Naive Bayes and Support Vector Machines on several real-life datasets. These case studies on medical problems show its applicability as a decision support system. ICS performs similarly in terms of classification and calibration. Its slightly lower performance is countered by its model simplicity, which makes it the method of choice if interpretability is a key issue.
first_indexed 2024-04-14T08:20:59Z
format Article
id doaj.art-7765a31bed934e5fbaaf9e3851fe0012
institution Directory Open Access Journal
issn 2376-5992
language English
last_indexed 2024-04-14T08:20:59Z
publishDate 2018-04-01
publisher PeerJ Inc.
record_format Article
series PeerJ Computer Science
spelling doaj.art-7765a31bed934e5fbaaf9e3851fe0012
2022-12-22T02:04:12Z
eng
PeerJ Inc.
PeerJ Computer Science
2376-5992
2018-04-01
4
e150
10.7717/peerj-cs.150
Interval Coded Scoring: a toolbox for interpretable scoring systems
Lieven Billiet
Sabine Van Huffel
Vanya Van Belle
STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
https://peerj.com/articles/cs-150.pdf
Decision support
Interpretability
Scoring systems
Sparse Optimization
Classification
Risk assessment
title Interval Coded Scoring: a toolbox for interpretable scoring systems
title_sort interval coded scoring a toolbox for interpretable scoring systems
topic Decision support
Interpretability
Scoring systems
Sparse Optimization
Classification
Risk assessment
url https://peerj.com/articles/cs-150.pdf