Detection of early-stage lung cancer in sputum using automated flow cytometry and machine learning


Bibliographic Details
Main Authors: Madeleine E. Lemieux, Xavier T. Reveles, Jennifer Rebeles, Lydia H. Bederka, Patricia R. Araujo, Jamila R. Sanchez, Marcia Grayson, Shao-Chiang Lai, Louis R. DePalo, Sheila A. Habib, David G. Hill, Kathleen Lopez, Lara Patriquin, Robert Sussman, Roby P. Joyce, Vivienne I. Rebel
Format: Article
Language: English
Published: BMC 2023-01-01
Series: Respiratory Research
Subjects: Sputum; Automated flow cytometry; Machine learning; Porphyrin; Early-stage lung cancer
Online Access: https://doi.org/10.1186/s12931-023-02327-3
collection DOAJ
description Abstract
Background: Low-dose spiral computed tomography (LDCT) may not lead to a clear treatment path when small to intermediate-sized lung nodules are identified. We have combined flow cytometry and machine learning to develop a sputum-based test (CyPath Lung) that can assist physicians in decision-making in such cases.
Methods: Single-cell suspensions prepared from induced sputum samples collected over three consecutive days were labeled with a viability dye to exclude dead cells, antibodies to distinguish cell types, and a porphyrin to label cancer-associated cells. The labeled cell suspension was run on a flow cytometer and the data collected. An analysis pipeline combining automated flow cytometry data processing with machine learning was developed to distinguish cancer from non-cancer samples from 150 high-risk patients, of whom 28 had lung cancer. Flow data and patient features were evaluated to identify predictors of lung cancer. Random training and test sets were chosen to evaluate predictive variables iteratively until a robust model was identified. The final model was tested on a second, independent group of 32 samples, including six samples from patients diagnosed with lung cancer.
Results: Automated analysis combined with machine learning resulted in a predictive model that achieved an area under the ROC curve (AUC) of 0.89 (95% CI 0.83–0.89). The sensitivity and specificity were 82% and 88%, respectively, and the negative and positive predictive values were 96% and 61%, respectively. Importantly, the test was 92% sensitive and 87% specific in cases when nodules were < 20 mm (AUC of 0.94; 95% CI 0.89–0.99). Testing of the model on an independent second set of samples showed an AUC of 0.85 (95% CI 0.71–0.98) with 83% sensitivity, 77% specificity, 95% negative predictive value and 45% positive predictive value. The model is robust to differences in sample processing and disease state.
Conclusion: CyPath Lung correctly classifies samples as cancer or non-cancer with high accuracy, including samples from participants at different disease stages and with nodules < 20 mm in diameter. This test is intended for use after lung cancer screening to improve early-stage lung cancer diagnosis.
Trial registration: ClinicalTrials.gov ID: NCT03457415; March 7, 2018
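The sensitivity, specificity and predictive values quoted in the abstract are tied together by the cohort sizes (28 cancers among 150 samples; 6 among 32). The following is a minimal sketch of that arithmetic, not the authors' analysis pipeline. The confusion-matrix counts are an assumption: they are back-calculated here by rounding the reported rates and do not appear in this record. The names `ConfusionCounts`, `counts_from_rates` and `diagnostic_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # cancer samples called cancer
    fn: int  # cancer samples called non-cancer
    tn: int  # non-cancer samples called non-cancer
    fp: int  # non-cancer samples called cancer


def counts_from_rates(n_total, n_cancer, sensitivity, specificity):
    """Back-calculate whole-number counts from reported rates (an assumption)."""
    n_non_cancer = n_total - n_cancer
    tp = round(sensitivity * n_cancer)
    tn = round(specificity * n_non_cancer)
    return ConfusionCounts(tp=tp, fn=n_cancer - tp, tn=tn, fp=n_non_cancer - tn)


def diagnostic_metrics(c):
    """Standard definitions of the four metrics quoted in the abstract."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Cohort sizes and rates as stated in the abstract.
primary = counts_from_rates(150, 28, 0.82, 0.88)
independent = counts_from_rates(32, 6, 0.83, 0.77)

if __name__ == "__main__":
    for label, counts in (("primary", primary), ("independent", independent)):
        metrics = diagnostic_metrics(counts)
        summary = ", ".join(f"{k}={v:.1%}" for k, v in metrics.items())
        print(f"{label}: {counts} -> {summary}")
```

Under these inferred counts the predictive values come out consistent with the reported figures after rounding. The lower positive predictive value follows from the low share of cancer cases in both cohorts, since even a modest number of false positives is large relative to the number of true positives.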
first_indexed 2024-04-10T21:00:24Z
format Article
id doaj.art-d3f1133571fa41c98dbfa1a17abe84ab
institution Directory Open Access Journal
issn 1465-993X
language English
last_indexed 2024-04-10T21:00:24Z
publishDate 2023-01-01
publisher BMC
record_format Article
series Respiratory Research
spelling doaj.art-d3f1133571fa41c98dbfa1a17abe84ab
2023-01-22T12:22:44Z
eng
BMC
Respiratory Research
1465-993X
2023-01-01
Volume 24, issue 1, pages 1–16
10.1186/s12931-023-02327-3
Detection of early-stage lung cancer in sputum using automated flow cytometry and machine learning
Madeleine E. Lemieux (Bioinfo)
Xavier T. Reveles (bioAffinity Technologies)
Jennifer Rebeles (bioAffinity Technologies)
Lydia H. Bederka (bioAffinity Technologies)
Patricia R. Araujo (bioAffinity Technologies)
Jamila R. Sanchez (bioAffinity Technologies)
Marcia Grayson (bioAffinity Technologies)
Shao-Chiang Lai (bioAffinity Technologies)
Louis R. DePalo (Department of Medicine, Icahn School of Medicine at Mount Sinai)
Sheila A. Habib (South Texas Veterans Health Care System (STVHCS), Audie L. Murphy Memorial Veterans Hospital)
David G. Hill (Waterbury Pulmonary Associates LLC)
Kathleen Lopez (Radiology Associates of Albuquerque)
Lara Patriquin (Radiology Associates of Albuquerque)
Robert Sussman (Atlantic Respiratory Institute)
Roby P. Joyce (Precision Pathology Services)
Vivienne I. Rebel (bioAffinity Technologies)
https://doi.org/10.1186/s12931-023-02327-3
Sputum
Automated flow cytometry
Machine learning
Porphyrin
Early-stage lung cancer
title Detection of early-stage lung cancer in sputum using automated flow cytometry and machine learning
topic Sputum
Automated flow cytometry
Machine learning
Porphyrin
Early-stage lung cancer
url https://doi.org/10.1186/s12931-023-02327-3