Integrative missing value estimation for microarray data

Abstract. Background: Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data,...

Bibliographic Details
Main Authors:	Zhou Xianghong, Waterman Michael S, Li Haifeng, Hu Jianjun
Format:	Article
Language:	English
Published:	BMC 2006-10-01
Series:	BMC Bioinformatics
Online Access:	http://www.biomedcentral.com/1471-2105/7/449

_version_	1811251015222034432
author	Zhou Xianghong Waterman Michael S Li Haifeng Hu Jianjun
author_facet	Zhou Xianghong Waterman Michael S Li Haifeng Hu Jianjun
author_sort	Zhou Xianghong
collection	DOAJ
description	Abstract. Background: Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples. Results: We present the integrative Missing Value Estimation method (iMISS), which incorporates information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference datasets into consideration. To determine whether the given reference datasets are sufficiently informative for integration, we use a submatrix imputation approach. Our experiments showed that iMISS can significantly and consistently improve the accuracy of the state-of-the-art Local Least Squares (LLS) imputation algorithm, by up to 15% in our benchmark tests. Conclusion: We demonstrated that order-statistics-based integrative imputation algorithms can achieve significant improvements over state-of-the-art missing value estimation approaches such as LLS, and are especially well suited to imputing microarray datasets with a limited number of samples, high rates of missing data, or very noisy measurements. With the rapid accumulation of microarray datasets, the performance of our approach can be further improved by incorporating larger and more appropriate reference datasets.
first_indexed	2024-04-12T16:13:17Z
format	Article
id	doaj.art-fea8c1962b344c0693740c3a028e44d7
institution	Directory of Open Access Journals
issn	1471-2105
language	English
last_indexed	2024-04-12T16:13:17Z
publishDate	2006-10-01
publisher	BMC
record_format	Article
series	BMC Bioinformatics
spelling	doaj.art-fea8c1962b344c0693740c3a028e44d7 | 2022-12-22T03:25:50Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2006-10-01 | 7 | 1 | 449 | 10.1186/1471-2105-7-449 | Integrative missing value estimation for microarray data | Zhou Xianghong; Waterman Michael S; Li Haifeng; Hu Jianjun | http://www.biomedcentral.com/1471-2105/7/449
spellingShingle	Zhou Xianghong Waterman Michael S Li Haifeng Hu Jianjun Integrative missing value estimation for microarray data BMC Bioinformatics
title	Integrative missing value estimation for microarray data
title_full	Integrative missing value estimation for microarray data
title_fullStr	Integrative missing value estimation for microarray data
title_full_unstemmed	Integrative missing value estimation for microarray data
title_short	Integrative missing value estimation for microarray data
title_sort	integrative missing value estimation for microarray data
url	http://www.biomedcentral.com/1471-2105/7/449
work_keys_str_mv	AT zhouxianghong integrativemissingvalueestimationformicroarraydata AT watermanmichaels integrativemissingvalueestimationformicroarraydata AT lihaifeng integrativemissingvalueestimationformicroarraydata AT hujianjun integrativemissingvalueestimationformicroarraydata
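
Illustrative sketch (not part of the catalogued record): the abstract describes deriving a consistent neighbor-gene list across a target dataset and reference datasets, then using those neighbors to estimate missing values. The Python below is a minimal sketch of that general idea under stated assumptions, not the authors' iMISS implementation. It ranks candidate genes by mean correlation rank across datasets (a simple stand-in for the order-statistics step) and fills a missing entry with the plain average of the neighbors' values (a stand-in for LLS regression). All function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation over positions where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def consistent_neighbors(gene, datasets, k):
    """Return the k genes with the best mean correlation rank across datasets.

    Assumption: mean rank is used here as a simple rank-aggregation rule;
    the paper's order-statistics method is not reproduced.
    """
    candidates = [g for g in datasets[0] if g != gene]
    mean_rank = {g: 0.0 for g in candidates}
    for data in datasets:
        # Rank 0 is the most positively correlated gene in this dataset.
        ordered = sorted(candidates, key=lambda g: -pearson(data[gene], data[g]))
        for rank, g in enumerate(ordered):
            mean_rank[g] += rank / len(datasets)
    return sorted(candidates, key=lambda g: mean_rank[g])[:k]


def impute(sample, target, neighbors):
    """Average the neighbors' values at `sample`, skipping missing entries."""
    values = [target[g][sample] for g in neighbors if target[g][sample] is not None]
    return sum(values) / len(values) if values else None


# Hypothetical toy data: rows are genes, columns are samples, None is missing.
target = {
    "g1": [1.0, 2.0, None, 4.0],
    "g2": [1.1, 2.1, 3.1, 4.1],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [0.9, 1.9, 2.9, 3.9],
}
reference = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [1.0, 2.1, 2.9, 4.2, 5.0],
    "g3": [5.0, 4.0, 3.0, 2.0, 1.0],
    "g4": [2.0, 4.0, 6.0, 8.0, 10.0],
}

neighbors = consistent_neighbors("g1", [target, reference], k=2)
estimate = impute(2, target, neighbors)
print(sorted(neighbors), estimate)
```

In this toy case the anti-correlated gene g3 ranks last in both datasets, so it is excluded from the neighbor list. The submatrix check mentioned in the abstract could be approximated with the same functions by hiding entries whose values are known, imputing them with and without the reference data, and comparing the errors.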

Similar Items