Quantification of the Immune Content in Neuroblastoma: Deep Learning and Topological Data Analysis in Digital Pathology
We introduce here a novel machine learning (ML) framework to address the issue of the quantitative assessment of the immune content in neuroblastoma (NB) specimens. First, the EUNet, a U-Net with an EfficientNet encoder, is trained to detect lymphocytes on tissue digital slides stained with the CD3...
Main Authors: | Nicole Bussola, Bruno Papa, Ombretta Melaiu, Aurora Castellano, Doriana Fruci, Giuseppe Jurman |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2021-08-01 |
Series: | International Journal of Molecular Sciences |
Subjects: | neuroblastoma, digital pathology, classification, deep learning, topological data analysis |
Online Access: | https://www.mdpi.com/1422-0067/22/16/8804 |
_version_ | 1797523570588385280 |
---|---|
author | Nicole Bussola; Bruno Papa; Ombretta Melaiu; Aurora Castellano; Doriana Fruci; Giuseppe Jurman |
author_sort | Nicole Bussola |
collection | DOAJ |
description | We introduce a novel machine learning (ML) framework for the quantitative assessment of the immune content in neuroblastoma (NB) specimens. First, the EUNet, a U-Net with an EfficientNet encoder, is trained to detect lymphocytes on digital tissue slides stained with the CD3 T-cell marker. The training set consists of 3782 images extracted from an original collection of 54 whole slide images (WSIs), manually annotated for a total of 73,751 lymphocytes. Resampling strategies, data augmentation, and transfer learning are adopted to ensure reproducibility and to reduce the risk of overfitting and selection bias. Topological data analysis (TDA) is then used to define activation maps from different layers of the neural network at different stages of the training process, described by persistence diagrams (PD) and Betti curves. TDA is further integrated with the uniform manifold approximation and projection (UMAP) dimensionality reduction and the hierarchical density-based spatial clustering of applications with noise (HDBSCAN) algorithm to cluster, on the basis of the deep features, the relevant subgroups and structures across different levels of the neural network. Finally, the recent TwoNN approach is leveraged to study the variation of the intrinsic dimensionality of the U-Net model. As the main task, the proposed pipeline is employed to evaluate the density of lymphocytes over the whole tissue area of the WSIs. The model achieves good results, with a mean absolute error of 3.1 on the test set, showing significant agreement between the densities estimated by our EUNet model and by trained pathologists, thus indicating the potential of a promising new strategy for quantifying the immune content in NB specimens. Moreover, the UMAP algorithm unveiled interesting patterns compatible with pathological characteristics, also highlighting novel insights into the dynamics of the intrinsic dataset dimensionality at different stages of the training process. All the experiments were run on the Microsoft Azure cloud platform. |
first_indexed | 2024-03-10T08:44:50Z |
format | Article |
id | doaj.art-775660b94f7b49cc8b0173c3a615bfbb |
institution | Directory of Open Access Journals |
issn | 1661-6596 1422-0067 |
language | English |
last_indexed | 2024-03-10T08:44:50Z |
publishDate | 2021-08-01 |
publisher | MDPI AG |
record_format | Article |
series | International Journal of Molecular Sciences |
spelling | doaj.art-775660b94f7b49cc8b0173c3a615bfbb; 2023-11-22T08:00:57Z; eng; MDPI AG; International Journal of Molecular Sciences; ISSN 1661-6596, 1422-0067; 2021-08-01; volume 22, issue 16, article 8804; DOI 10.3390/ijms22168804; Quantification of the Immune Content in Neuroblastoma: Deep Learning and Topological Data Analysis in Digital Pathology; Nicole Bussola (Data Science for Health, Fondazione Bruno Kessler, 38123 Trento, Italy); Bruno Papa (Data Science for Health, Fondazione Bruno Kessler, 38123 Trento, Italy); Ombretta Melaiu (Department of Paediatric Haematology/Oncology and of Cell and Gene Therapy, Ospedale Pediatrico Bambino Gesù IRCCS, 00146 Rome, Italy); Aurora Castellano (Department of Paediatric Haematology/Oncology and of Cell and Gene Therapy, Ospedale Pediatrico Bambino Gesù IRCCS, 00146 Rome, Italy); Doriana Fruci (Department of Paediatric Haematology/Oncology and of Cell and Gene Therapy, Ospedale Pediatrico Bambino Gesù IRCCS, 00146 Rome, Italy); Giuseppe Jurman (Data Science for Health, Fondazione Bruno Kessler, 38123 Trento, Italy); https://www.mdpi.com/1422-0067/22/16/8804; neuroblastoma; digital pathology; classification; deep learning; topological data analysis |
title | Quantification of the Immune Content in Neuroblastoma: Deep Learning and Topological Data Analysis in Digital Pathology |
topic | neuroblastoma; digital pathology; classification; deep learning; topological data analysis |
url | https://www.mdpi.com/1422-0067/22/16/8804 |
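
The description mentions the TwoNN approach for studying intrinsic dimensionality. As a rough illustration of that idea only, and not the authors' code, the sketch below estimates the intrinsic dimension of a point cloud from the ratio of each point's second to first nearest-neighbour distance. The function name `twonn_dimension`, the `discard_fraction` parameter, and the synthetic data are assumptions made for this example; the record does not say how the estimator was applied to the network's features.

```python
import math
import random


def twonn_dimension(points, discard_fraction=0.1):
    """Estimate intrinsic dimension from ratios of 2nd to 1st neighbour distances.

    Hypothetical helper for illustration; needs at least three distinct points.
    """
    ratios = []
    for i, p in enumerate(points):
        r1 = r2 = math.inf  # two smallest distances from p to any other point
        for j, q in enumerate(points):
            if i == j:
                continue
            dist = math.dist(p, q)
            if dist < r1:
                r1, r2 = dist, r1
            elif dist < r2:
                r2 = dist
        if r1 > 0.0:  # skip exact duplicates, whose ratio is undefined
            ratios.append(r2 / r1)

    ratios.sort()
    n = len(ratios)
    # Drop the largest ratios, which are the most sensitive to density changes;
    # always drop at least one so that 1 - rank / n stays positive below.
    keep = min(int(n * (1.0 - discard_fraction)), n - 1)

    # With empirical CDF F, the TwoNN model gives -log(1 - F(mu)) = d * log(mu),
    # so d is the slope of a least-squares line through the origin.
    sxy = sxx = 0.0
    for rank, mu in enumerate(ratios[:keep], start=1):
        x = math.log(mu)
        y = -math.log(1.0 - rank / n)
        sxy += x * y
        sxx += x * x
    return sxy / sxx


# Synthetic check (assumed data): a 2-D plane linearly embedded in 5-D space.
random.seed(0)
plane = [
    (u, v, u + v, u - v, 2 * u)
    for u, v in ((random.random(), random.random()) for _ in range(400))
]
estimate = twonn_dimension(plane)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

The brute-force neighbour search keeps the example dependency-free and is quadratic in the number of points; for feature sets the size of a real network's activations, a spatial index or an approximate nearest-neighbour library would be the more practical choice.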