TLsub: A transfer learning based enhancement to accurately detect mutations with wide-spectrum sub-clonal proportion
Mutation detection is a routine task in sequencing data analysis, and existing tools often rely on combinations of signals from a set of overlapping sequencing reads. However, subclonal mutations, which are reported to contribute to tumor recurrence and metastasis, are sometimes...
Main Author: | Tian Zheng |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-11-01 |
Series: | Frontiers in Genetics |
Subjects: | genetics; structural variation; machine learning; next generation sequencing; mutation detection |
Online Access: | https://www.frontiersin.org/articles/10.3389/fgene.2022.981269/full |
_version_ | 1811218929945673728 |
---|---|
author | Tian Zheng |
collection | DOAJ |
description | Mutation detection is a routine task in sequencing data analysis, and existing tools often rely on combinations of signals from a set of overlapping sequencing reads. However, subclonal mutations, which are reported to contribute to tumor recurrence and metastasis, are sometimes filtered out by these signals. As the clonal proportion decreases, the signals often become ambiguous, while complicated interactions among signals violate the independent and identically distributed (IID) assumption underlying most machine learning models. Mutation callers can lower their thresholds, but doing so introduces a significant number of false positives. The main aim here was to detect subclonal mutations with high specificity when sample purity or clonal proportion is ambiguous. We proposed a novel machine learning approach for filtering false positive calls to accurately detect mutations across a wide spectrum of subclonal proportions. We carried out a series of experiments on both simulated and real datasets and compared the approach with several state-of-the-art methods, including freebayes, MuTect2, Sentieon and SiNVICT. The results demonstrated that the proposed method adapts well to differently diluted sequencing signals and can significantly reduce false positives when detecting subclonal mutations. The code is available at https://github.com/TrinaZ/TL-fpFilter for academic use only. |
first_indexed | 2024-04-12T07:17:31Z |
format | Article |
id | doaj.art-b5698f09336b4a42bff74f011018d6f9 |
institution | Directory Open Access Journal |
issn | 1664-8021 |
language | English |
last_indexed | 2024-04-12T07:17:31Z |
publishDate | 2022-11-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Genetics |
spelling | doaj.art-b5698f09336b4a42bff74f011018d6f9; 2022-12-22T03:42:26Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2022-11-01; 13; 10.3389/fgene.2022.981269; 981269; TLsub: A transfer learning based enhancement to accurately detect mutations with wide-spectrum sub-clonal proportion; Tian Zheng (0, 1); 0: Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China; 1: Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China; https://www.frontiersin.org/articles/10.3389/fgene.2022.981269/full; genetics; structural variation; machine learning; next generation sequencing; mutation detection |
title | TLsub: A transfer learning based enhancement to accurately detect mutations with wide-spectrum sub-clonal proportion |
topic | genetics; structural variation; machine learning; next generation sequencing; mutation detection |
url | https://www.frontiersin.org/articles/10.3389/fgene.2022.981269/full |