Application of machine learning techniques to tuberculosis drug resistance analysis

Bibliographic Details
Main Authors: Kouchaki, S, Yang, Y, Walker, T, Walker, A, Wilson, D, Peto, T, Crook, D, Clifton, D, Cryptic Consortium
Format: Journal article
Language: English
Published: Oxford University Press 2018
author Kouchaki, S
Yang, Y
Walker, T
Walker, A
Wilson, D
Peto, T
Crook, D
Clifton, D
Cryptic Consortium
collection OXFORD
description <strong>Motivation</strong> Timely identification of Mycobacterium tuberculosis (MTB) resistance to existing drugs is vital to decrease mortality and prevent the amplification of existing antibiotic resistance. Machine learning methods have been widely applied to the timely prediction of MTB resistance to a specific drug and to the identification of resistance markers. However, they have not been validated on a large cohort of MTB samples from multiple centers across the world in terms of resistance prediction and resistance marker identification. <br/><br/> <strong>Summary</strong> Several machine learning classifiers and linear dimension reduction techniques were developed and compared for a cohort of 13402 isolates collected from 16 countries across six continents and tested against 11 drugs. <br/><br/> <strong>Results</strong> Compared to a conventional molecular diagnostic test, the area under the curve (AUC) of the best machine learning classifier increased for all drugs, most notably by 23.11%, 15.22%, and 10.14% for pyrazinamide (PZA), ciprofloxacin (CIP), and ofloxacin (OFX) respectively (p &lt; 0.01). Logistic regression (LR) and gradient tree boosting (GBT) were found to perform better than other techniques. Moreover, adding a sparse principal component analysis or non-negative matrix factorisation step to LR/GBT improved the best F1-score over the classifier alone by 12.54%, 4.61%, 7.45%, and 9.58% for amikacin (AK), moxifloxacin (MOX), OFX, and capreomycin (CAP) respectively, as well as increasing AUC for AK and CAP. The results provide a comprehensive comparison of various techniques and confirm the applicability of machine learning for better prediction on large, diverse TB data. Furthermore, mutation ranking showed the possibility of finding new resistance/susceptibility markers.
first_indexed 2024-03-07T02:00:19Z
format Journal article
id oxford-uuid:9d278dc8-5074-4f6e-9b5e-c218e805727e
institution University of Oxford
language English
last_indexed 2024-03-07T02:00:19Z
publishDate 2018
publisher Oxford University Press
record_format dspace
title Application of machine learning techniques to tuberculosis drug resistance analysis