A pitfall for machine learning methods aiming to predict across cell types
Abstract: Machine learning models that predict genomic activity are most useful when they make accurate predictions across cell types. Here, we show that when the training and test sets contain the same genomic loci, the resulting model may falsely appear to perform well by effectively memorizing the...
Main Authors: | Jacob Schreiber, Ritambhara Singh, Jeffrey Bilmes, William Stafford Noble |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2020-11-01 |
Series: | Genome Biology |
Online Access: | http://link.springer.com/article/10.1186/s13059-020-02177-y |
Similar Items
- Epiphany: predicting Hi-C contact maps from 1D epigenomic signals
  by: Rui Yang, et al.
  Published: (2023-06-01)
- Completing the ENCODE3 compendium yields accurate imputations across a variety of assays and human biosamples
  by: Jacob Schreiber, et al.
  Published: (2020-03-01)
- Marginalizing the genomic architecture to identify crosstalk across cancer and neurodegeneration
  by: Amit Sharma, et al.
  Published: (2023-02-01)
- Predicting liver cancer on epigenomics data using machine learning
  by: Vishalkumar Vekariya, et al.
  Published: (2022-09-01)
- Editorial: Evolution of crop genomes and epigenomes
  by: Hai Du, et al.
  Published: (2022-09-01)