Development of machine learning algorithms for screening of pulmonary disease

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.

Bibliographic Details
Main Author: Infante, Christian (Christian F.)
Other Authors: Richard R. Fletcher.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2018
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/119525
Notes: This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 131-136).
Abstract: Pulmonary diseases are a leading cause of death worldwide. Their burden falls disproportionately on the developing world. The MIT Mobile Technology Lab has developed a Mobile Kit which screens for and diagnoses COPD and asthma. In this thesis, we analyze and further develop the tools in this kit. All of the data for this thesis were collected as part of a large medical study with our partner, the Chest Research Foundation (CRF), in Pune, India. The data set comprised 325 patients (135 healthy, 76 asthma, 46 COPD, 29 allergic rhinitis, and 39 other). Among the asthma and COPD patients, 67 had allergic rhinitis. All patients were examined using a mobile diagnostic kit designed at MIT consisting of a mobile stethoscope, peak flow meter, and questionnaire. All patients were also examined using the conventional gold-standard pulmonary function testing (PFT) lab. The performance of our Mobile Kit platform was previously analyzed and presented in a prior Master's thesis.
Building on our group's prior work, in this thesis we present three main contributions: 1) we have created a classifier for a new disease category, allergic rhinitis, which accounts for roughly half of all respiratory clinic patients; 2) we have explored and analyzed the value of cough sounds as a diagnostic tool for pulmonary disease; and 3) we have analyzed data from a pulmonary function testing lab which were collected in parallel with our group's Mobile Diagnostic Kit, and have compared the performance. In the first section of this thesis, we created a classifier for allergic rhinitis diagnosis, using the same multi-layer classification structure as was used in our group's prior work. This integrated classifier demonstrated moderate performance with AUCs ranging from 0.87 to 0.90. As a second approach, a standalone classifier was also explored, which produced much better results, with an AUC of 0.96. Going forward, we plan to use an independent classifier as part of our diagnostics. In the second part of this thesis, we explored the value of cough sounds for pulmonary diagnosis. Various classifiers were created for the screening and diagnosis of pulmonary disease through the analysis of cough sounds. We first created a classifier for the detection of Wet and Dry coughs (which can indicate overall pulmonary health), which had a high classification performance but limited diagnostic value. We then explored the diagnostic value of specific physical features of the cough sounds, including kurtosis, variance, zero crossing irregularity, and rate of decay. The utility of these features was then analyzed both in isolation and integrated with other Mobile Kit tools. It was discovered that these cough sound features do have value as a simple diagnostic tool to distinguish between asthma and COPD, as well as to assess basic pulmonary health; however, it was found that cough sounds alone provide less value than other diagnostic tools for providing disease-specific diagnosis.
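As a rough illustration of the four cough-sound features named above (kurtosis, variance, zero crossing irregularity, and rate of decay), the following Python sketch computes one plausible version of each from a list of audio samples. This record does not give the thesis's exact definitions, so the formulas here are assumptions: kurtosis is taken as the fourth standardized moment, zero crossing irregularity as the coefficient of variation of the gaps between zero crossings, and rate of decay as the negated slope of a least-squares line through the log amplitude after the peak. The function names, the 5% amplitude floor, and the dictionary keys are hypothetical.

```python
import math


def _slope(points):
    """Least-squares slope of y against t for a list of (t, y) pairs."""
    n = len(points)
    t_mean = sum(t for t, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in points)
    den = sum((t - t_mean) ** 2 for t, _ in points)
    return num / den


def cough_features(samples, fs):
    """Return four illustrative descriptors of one cough recording.

    samples: sequence of amplitudes; fs: sampling rate in Hz.
    All definitions below are assumptions, not the thesis's own.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]  # remove DC offset
    variance = sum(v * v for v in x) / n
    if variance == 0:
        raise ValueError("signal is constant; features are undefined")
    # Fourth standardized moment (equals 3.0 for a Gaussian signal).
    kurtosis = (sum(v ** 4 for v in x) / n) / variance ** 2

    # Gaps (in seconds) between consecutive sign changes.
    crossings = [i for i in range(1, n) if (x[i - 1] < 0) != (x[i] < 0)]
    gaps = [(b - a) / fs for a, b in zip(crossings, crossings[1:])]
    if len(gaps) > 1:
        gap_mean = sum(gaps) / len(gaps)
        gap_sd = math.sqrt(sum((g - gap_mean) ** 2 for g in gaps) / len(gaps))
        zc_irregularity = gap_sd / gap_mean
    else:
        zc_irregularity = 0.0

    # Fit a line to log|x| after the loudest sample, ignoring samples
    # below 5% of the peak (an arbitrary floor chosen for this sketch).
    env = [abs(v) for v in x]
    peak = max(range(n), key=env.__getitem__)
    floor = 0.05 * env[peak]
    pts = [((i - peak) / fs, math.log(env[i]))
           for i in range(peak, n) if env[i] > floor]
    decay_rate = -_slope(pts) if len(pts) > 1 else 0.0

    return {
        "kurtosis": kurtosis,
        "variance": variance,
        "zc_irregularity": zc_irregularity,
        "decay_rate": decay_rate,
    }
```

A feature vector of this kind could then be passed to any standard classifier; which classifier the thesis used for the cough features is not stated in this record.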
When integrated with the Mobile Kit tools, cough sounds only improved performance on lung sounds; otherwise, coughs did not have any added benefit. Given the ease of data collection, we demonstrated that cough sounds can play a role in simple disease screening for use with community health workers. For the third major part of this thesis, we did a thorough analysis of pulmonary function testing (PFT) data, which is the gold standard for pulmonary disease diagnosis. The PFT laboratory tools included spirometry, impulse oscillometry, body plethysmography, and lung gas diffusion (DLCO). We first explored a multi-layer classification structure. Using this structure, the PFT machines produced good results on each classification layer: Healthy vs. Unhealthy [AUC=0.90 (0.04)], Obstructive (Obs.) vs. Non-obstructive [AUC=0.95 (0.05)], Obs. AR vs. Obs. Non-AR [AUC=0.72 (0.10)], COPD + AR vs. Asthma + AR [AUC=0.95 (0.15)], COPD vs. Asthma [AUC=1.00 (0.04)], Non-Obs. AR vs. Non-Obs. Non-AR [AUC=0.92 (0.12)]. These results are only moderately better than the results yielded by our Mobile Diagnostic Kit: Healthy vs. Unhealthy [AUC=0.98 (0.02)], Obstructive (Obs.) vs. Non-obstructive [AUC=0.96 (0.04)], Obs. AR vs. Obs. Non-AR [AUC=0.90 (0.06)], COPD + AR vs. Asthma + AR [AUC=0.93 (0.09)], COPD vs. Asthma [AUC=1.00 (0.00)], Non-Obs. AR vs. Non-Obs. Non-AR [AUC=0.87 (0.12)]. Although these results are moderately good, the compounded error represents an unacceptable level of misclassification. As an alternative to the multi-layer classification structure, we explored the use of individual classifiers for each disease, which yielded much better results. For the PFT data, the individual classifiers produced the following results: asthma [AUC=0.96 (0.04)], COPD [AUC=0.99 (0.03)], and allergic rhinitis [AUC=0.74 (0.08)]. 
For the Mobile Kit data, the individual classifiers produced the following results: asthma [AUC=0.90 (0.05)], COPD [AUC=0.94 (0.05)], and allergic rhinitis [AUC=0.96 (0.03)]. In summary, building on our group's prior work, in this thesis we have expanded the capability of our Mobile Diagnostic Kit to include allergic rhinitis, as well as improved the diagnostic specificity to account for co-morbidities (asthma + AR, COPD + AR). Although our multi-layer classifier design has value in providing diagnostic insight and feedback to clinicians, we recommend that future versions of our Mobile Kit also include individual classifiers for specific disease categories (asthma, COPD, allergic rhinitis, asthma + AR, COPD + AR) in order to improve performance.
by Christian Infante.
M. Eng.
2018-12-11T20:38:46Z 2018-12-11T20:38:46Z 2017 2017 Thesis http://hdl.handle.net/1721.1/119525 1066694505 eng
MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582
136 pages application/pdf Massachusetts Institute of Technology
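The abstract's remark that compounded error makes the multi-layer structure unacceptable can be illustrated with a short calculation. In a layered classifier, a patient is labeled correctly only if every layer along the decision path is correct, so per-layer errors multiply. The sketch below assumes the layers err independently and uses made-up per-layer accuracies; the abstract reports AUCs, not accuracies, so these numbers are hypothetical and are not results from the thesis. The function name path_accuracy is also an assumption.

```python
from math import prod


def path_accuracy(layer_accuracies):
    """Probability that every layer on a decision path is correct,
    assuming the layers make errors independently."""
    return prod(layer_accuracies)


# Hypothetical accuracies for three successive layers, e.g.
# Healthy vs. Unhealthy -> Obstructive vs. Non-obstructive -> COPD vs. Asthma.
hypothetical_path = [0.95, 0.93, 0.90]

print(round(path_accuracy(hypothetical_path), 3))  # prints 0.795
```

Under these illustrative numbers, three layers that are each at least 90% accurate still yield an end-to-end accuracy below 80%. A standalone classifier for each disease avoids this multiplication, which is consistent with the abstract's recommendation to add individual classifiers.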