Detailed estimation of bioinformatics prediction reliability through the Fragmented Prediction Performance Plots

Bibliographic Details
Main Author: Carugo Oliviero
Format: Article
Language: English
Published: BMC 2007-10-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/8/380
collection DOAJ
description Abstract. Background: An important and yet rather neglected question related to bioinformatics predictions is the estimation of the amount of data that is needed to allow reliable predictions. Bioinformatics predictions are usually validated through a series of figures of merit, such as sensitivity and precision, and little attention is paid to the fact that their performance may depend on the amount of data used to make the predictions themselves. Results: Here I describe a tool, named Fragmented Prediction Performance Plot (FPPP), which monitors the relationship between the prediction reliability and the amount of information underlying the predictions themselves. Three examples of FPPPs are presented to illustrate their principal features. In one example, the reliability becomes independent, above a certain threshold, of the amount of data used to predict protein features, and the intrinsic reliability of the predictor can be estimated. In the other two cases, on the contrary, the reliability strongly depends on the amount of data used to make the predictions and, thus, the intrinsic reliability of the two predictors cannot be determined. Only in the first example is it thus possible to fully quantify the prediction performance. Conclusion: It is thus highly advisable to use FPPPs to determine the performance of any new bioinformatics prediction protocol, in order to fully quantify its prediction power and to allow comparisons between two or more predictors based on different types of data.
id doaj.art-6ca62a4a29e240018a4eee2a32649e8b
institution Directory Open Access Journal
issn 1471-2105
doi 10.1186/1471-2105-8-380