Alternative empirical Bayes models for adjusting for batch effects in genomic studies

Bibliographic Details
Main Authors:	Yuqing Zhang, David F. Jenkins, Solaiappan Manimaran, W. Evan Johnson
Format:	Article
Language:	English
Published:	BMC 2018-07-01
Series:	BMC Bioinformatics
Subjects:	Batch effects; Empirical Bayes models; Data integration; Biomarker development
Online Access:	http://link.springer.com/article/10.1186/s12859-018-2263-6

author	Yuqing Zhang, David F. Jenkins, Solaiappan Manimaran, W. Evan Johnson
collection	DOAJ
description	Background: Combining genomic data sets from multiple studies is advantageous for increasing statistical power in studies where logistical considerations restrict sample size or require the sequential generation of data. However, significant technical heterogeneity is commonly observed across multiple batches of data that are generated from different processing or reagent batches, experimenters, protocols, or profiling platforms. These so-called batch effects often confound true biological relationships in the data, reducing the power benefits of combining multiple batches, and may even lead to spurious results in some combined studies. Therefore, there is a significant need for effective methods and software tools that account for batch effects in high-throughput genomic studies. Results: Here we contribute multiple methods and software tools for improved combination and analysis of data from multiple batches. In particular, we provide batch effect solutions for cases where the severity of the batch effects is not extreme, and for cases where one high-quality batch can serve as a reference, such as the training set in a biomarker study. We illustrate our approaches and software in both simulated and real data scenarios. Conclusions: We demonstrate the value of these new contributions compared to currently established approaches in the specified batch correction situations.
first_indexed	2024-12-11T07:46:43Z
format	Article
id	doaj.art-11c13fcbc012456e85f49cae87e51a9f
institution	Directory of Open Access Journals
issn	1471-2105
language	English
last_indexed	2024-12-11T07:46:43Z
publishDate	2018-07-01
publisher	BMC
record_format	Article
series	BMC Bioinformatics
spelling	doaj.art-11c13fcbc012456e85f49cae87e51a9f; 2022-12-22T01:15:27Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2018-07-01; vol. 19, no. 1, pp. 1-15; doi:10.1186/s12859-018-2263-6; Alternative empirical Bayes models for adjusting for batch effects in genomic studies; Yuqing Zhang, David F. Jenkins, Solaiappan Manimaran, W. Evan Johnson (all: Division of Computational Biomedicine, Boston University School of Medicine); http://link.springer.com/article/10.1186/s12859-018-2263-6; Batch effects; Empirical Bayes models; Data integration; Biomarker development
title	Alternative empirical Bayes models for adjusting for batch effects in genomic studies
topic	Batch effects; Empirical Bayes models; Data integration; Biomarker development
url	http://link.springer.com/article/10.1186/s12859-018-2263-6

Similar Items