Protocol to identify functional doppelgängers and verify biomedical gene expression data using doppelgangerIdentifier
Functional doppelgängers (FDs) are independently derived sample pairs that confound machine learning model (ML) performance when assorted across training and validation sets. Here, we detail the use of doppelgangerIdentifier (DI), providing software installation, data preparation, doppelgänger ident...
Main Authors: | Wang, Li Rong; Fan, Xiuyi; Goh, Wilson Wen Bin |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Journal Article |
Language: | English |
Published: | 2023 |
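Subjects: | Engineering::Computer science and engineering; Science::Biological sciences; Gene Expression; Machine Learning |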
Online Access: | https://hdl.handle.net/10356/164598 |
_version_ | 1826112588052692992 |
---|---|
author | Wang, Li Rong; Fan, Xiuyi; Goh, Wilson Wen Bin |
author2 | School of Computer Science and Engineering |
author_facet | School of Computer Science and Engineering; Wang, Li Rong; Fan, Xiuyi; Goh, Wilson Wen Bin |
author_sort | Wang, Li Rong |
collection | NTU |
description | Functional doppelgängers (FDs) are independently derived sample pairs that confound machine learning model (ML) performance when assorted across training and validation sets. Here, we detail the use of doppelgangerIdentifier (DI), providing software installation, data preparation, doppelgänger identification, and functional testing steps. We demonstrate examples with biomedical gene expression data. We also provide guidelines for the selection of user-defined function arguments. For complete details on the use and execution of this protocol, please refer to Wang et al. (2022). |
first_indexed | 2024-10-01T03:09:31Z |
format | Journal Article |
id | ntu-10356/164598 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T03:09:31Z |
publishDate | 2023 |
record_format | dspace |
spelling | ntu-10356/164598; 2023-02-28T17:13:49Z; Protocol to identify functional doppelgängers and verify biomedical gene expression data using doppelgangerIdentifier; Wang, Li Rong; Fan, Xiuyi; Goh, Wilson Wen Bin; School of Computer Science and Engineering; Lee Kong Chian School of Medicine (LKCMedicine); School of Biological Sciences; Centre for Biomedical Informatics; Engineering::Computer science and engineering; Science::Biological sciences; Gene Expression; Machine Learning; Functional doppelgängers (FDs) are independently derived sample pairs that confound machine learning model (ML) performance when assorted across training and validation sets. Here, we detail the use of doppelgangerIdentifier (DI), providing software installation, data preparation, doppelgänger identification, and functional testing steps. We demonstrate examples with biomedical gene expression data. We also provide guidelines for the selection of user-defined function arguments. For complete details on the use and execution of this protocol, please refer to Wang et al. (2022).; Ministry of Education (MOE); National Research Foundation (NRF); Published version; This research/project is supported by the National Research Foundation, Singapore under its Industry Alignment Fund – Pre-positioning (IAF-PP) Funding Initiative. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore. W.W.B.G. also acknowledges support from a Ministry of Education (MOE), Singapore Tier 1 grant (grant no. RG35/20).; 2023-02-06T05:37:02Z; 2023-02-06T05:37:02Z; 2022; Journal Article; Wang, L. R., Fan, X. & Goh, W. W. B. (2022). Protocol to identify functional doppelgängers and verify biomedical gene expression data using doppelgangerIdentifier. STAR Protocols, 3(4), 101783-.; https://dx.doi.org/10.1016/j.xpro.2022.101783; 2666-1667; https://hdl.handle.net/10356/164598; 10.1016/j.xpro.2022.101783; 36317174; 2-s2.0-85140458047; 4; 3; 101783; en; RG35/20; STAR Protocols; © 2022 The Author(s). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).; application/pdf |
spellingShingle | Engineering::Computer science and engineering; Science::Biological sciences; Gene Expression; Machine Learning; Wang, Li Rong; Fan, Xiuyi; Goh, Wilson Wen Bin; Protocol to identify functional doppelgängers and verify biomedical gene expression data using doppelgangerIdentifier |
title | Protocol to identify functional doppelgängers and verify biomedical gene expression data using doppelgangerIdentifier |
topic | Engineering::Computer science and engineering; Science::Biological sciences; Gene Expression; Machine Learning |
url | https://hdl.handle.net/10356/164598 |