Rscreenorm: normalization of CRISPR and siRNA screen data for more reproducible hit selection

Bibliographic Details
Main Authors: Costa Bachas, Jasmina Hodzic, Johannes C. van der Mijn, Chantal Stoepker, Henk M. W. Verheul, Rob M. F. Wolthuis, Emanuela Felley-Bosco, Wessel N. van Wieringen, Victor W. van Beusechem, Ruud H. Brakenhoff, Renée X. de Menezes
Format: Article
Language: English
Published: BMC 2018-08-01
Series: BMC Bioinformatics
Subjects: Functional genomics, Reproducibility, Normalization
Online Access: http://link.springer.com/article/10.1186/s12859-018-2306-z
collection DOAJ
description Abstract. Background: Reproducibility of hits from independent CRISPR or siRNA screens is poor. This is partly because data normalization primarily addresses technical variability within each screen, not the technical differences between screens. Results: We present “rscreenorm”, a method that standardizes the functional data ranges between screens using assay controls, and subsequently performs a piecewise-linear normalization to make data distributions across all screens comparable. In simulation studies, rscreenorm reduces false positives. Using two siRNA screens, each covering multiple cell lines, rscreenorm increased reproducibility by between 27 and 62% for hits, and up to 5-fold for non-hits. Using publicly available CRISPR-Cas screen data, the commonly used median centering yields merely 34% overlapping hits, whereas rscreenorm yields 84%. Furthermore, rscreenorm yielded at most 8% discordant results, whilst median centering yielded as much as 55%. Conclusions: Rscreenorm yields more consistent results and keeps false positive rates under control, improving the reproducibility of genetic screen data analysis across multiple cell lines.
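The two steps named in the abstract (control-based range standardization, then a piecewise-linear normalization across screens) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of control medians as anchors, and the choice of quantile-based knots are all assumptions made for illustration only.

```python
from statistics import median, quantiles


def control_scale(values, neg_controls, pos_controls):
    """Rescale so the median negative control maps to 0 and the
    median positive control maps to 1 (an assumed anchoring choice)."""
    lo = median(neg_controls)
    hi = median(pos_controls)
    if hi == lo:
        raise ValueError("controls do not separate; cannot rescale")
    return [(v - lo) / (hi - lo) for v in values]


def knots(values, n_knots=5):
    """Knot positions for one screen: minimum, inner quantiles, maximum."""
    s = sorted(values)
    inner = quantiles(s, n=n_knots - 1, method="inclusive")
    return [s[0]] + inner + [s[-1]]


def piecewise_linear_map(x, src, dst):
    """Map x from the src knot segment it falls in onto the matching dst segment."""
    for i in range(len(src) - 1):
        left, right = src[i], src[i + 1]
        if x <= right:
            if right == left:  # tied knots: no slope to interpolate along
                return dst[i]
            t = (x - left) / (right - left)
            return dst[i] + t * (dst[i + 1] - dst[i])
    return dst[-1]


def normalize_screens(screens, n_knots=5):
    """screens: dict of screen name -> list of control-scaled scores.
    Each screen's knots are mapped onto the across-screen mean knots."""
    per_screen = {name: knots(vals, n_knots) for name, vals in screens.items()}
    target = [
        sum(k[i] for k in per_screen.values()) / len(per_screen)
        for i in range(n_knots)
    ]
    return {
        name: [piecewise_linear_map(v, per_screen[name], target) for v in vals]
        for name, vals in screens.items()
    }
```

In this sketch, two screens whose scores differ only by a shift and a scale end up with identical normalized values, which is the kind of between-screen comparability the abstract describes. Median centering, by contrast, would remove only the shift.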
id doaj.art-826f3204ee2b48118fd68b46cacf334f
institution Directory of Open Access Journals
issn 1471-2105
spelling doaj.art-826f3204ee2b48118fd68b46cacf334f; 2022-12-21T18:41:40Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2018-08-01; vol. 19, no. 1, pp. 1-12; doi:10.1186/s12859-018-2306-z
Author affiliations:
Costa Bachas: Department of Otolaryngology - Head and Neck Surgery, Amsterdam UMC, Vrije Universiteit Amsterdam
Jasmina Hodzic: Department of Medical Oncology, Amsterdam UMC, Vrije Universiteit Amsterdam
Johannes C. van der Mijn: Department of Medical Oncology, Amsterdam UMC, Vrije Universiteit Amsterdam
Chantal Stoepker: Division of Tumor Biology and Immunology, Netherlands Cancer Institute
Henk M. W. Verheul: Department of Medical Oncology, Amsterdam UMC, Vrije Universiteit Amsterdam
Rob M. F. Wolthuis: Section of Oncogenetics, Department of Clinical Genetics, Amsterdam UMC, Vrije Universiteit Amsterdam
Emanuela Felley-Bosco: Laboratory of Molecular Oncology, University Hospital Zürich
Wessel N. van Wieringen: Department of Epidemiology and Biostatistics, Amsterdam UMC, Vrije Universiteit Amsterdam
Victor W. van Beusechem: Department of Medical Oncology, Amsterdam UMC, Vrije Universiteit Amsterdam
Ruud H. Brakenhoff: Department of Otolaryngology - Head and Neck Surgery, Amsterdam UMC, Vrije Universiteit Amsterdam
Renée X. de Menezes: Department of Epidemiology and Biostatistics, Amsterdam UMC, Vrije Universiteit Amsterdam
topic Functional genomics
Reproducibility
Normalization