Identification of recent cases of hepatitis C virus infection using physical-chemical properties of hypervariable region 1 and a radial basis function neural network classifier


Bibliographic Details
Main Authors: James Lara, Mahder Teka, Yury Khudyakov
Format: Article
Language: English
Published: BMC 2017-12-01
Series: BMC Genomics
Online Access: http://link.springer.com/article/10.1186/s12864-017-4269-2
author James Lara
Mahder Teka
Yury Khudyakov
collection DOAJ
description Abstract
Background: Identification of acute or recent hepatitis C virus (HCV) infections is important for detecting outbreaks and devising timely public health interventions to interrupt transmission. Epidemiological investigations and chemistry-based laboratory tests are the two main approaches available for identifying acute HCV infection. However, owing to their complexity, neither approach is efficient. Here, we describe a new sequence alignment-free method to discriminate between recent (R) and chronic (C) HCV infection using next-generation sequencing (NGS) data derived from the HCV hypervariable region 1 (HVR1).
Results: Using dinucleotide auto correlation (DAC), we identified physical-chemical (PhyChem) features of HVR1 variants. Significant (p < 9.58 × 10⁻⁴) differences in the means and frequency distributions of PhyChem features were found between HVR1 variants sampled from patients with recent vs chronic (R/C) infection. Moreover, the R-associated variants were found to occupy distinct and discrete PhyChem spaces. A radial basis function neural network classifier trained on the PhyChem features of intra-host HVR1 variants accurately classified R/C-HVR1 variants in 10-fold cross-validation (classification accuracy, CA = 94.85%; area under the ROC curve, AUROC = 0.979). The classifier was accurate in assigning individual HVR1 variants to R/C-classes in the testing set (CA = 84.15%; AUROC = 0.912) and in detecting infection duration (R/C-class) in patients (CA = 88.45%). Statistical tests and evaluation of the classifier on randomly-labeled datasets indicate that the classifier's CA is robust (p < 0.001) and unlikely to be due to random correlations (CA = 59.04% and AUROC = 0.50).
Conclusions: The PhyChem features of intra-host HVR1 variants are strongly associated with the duration of HCV infection. Application of the PhyChem biomarkers to models for detection of the R/C-state of HCV infection in patients offers a new opportunity for detection of outbreaks and for molecular surveillance. The method will be available at https://webappx.cdc.gov/GHOST/ to the authenticated users of Global Hepatitis Outbreak and Surveillance Technology (GHOST) for further testing and validation.
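The abstract names dinucleotide auto correlation (DAC) as the source of the PhyChem features but gives no formula. The sketch below shows one common alignment-free formulation, the dinucleotide auto-covariance at a given lag, under stated assumptions: whether the authors used exactly this form is not confirmed by this record, the function names `dac` and `dac_features` are hypothetical, and `TOY_PROPERTY` (the G/C count of each dinucleotide) is a made-up stand-in for a real physical-chemical index, not one of the indices used in the study.

```python
from itertools import product

# Toy stand-in for a physical-chemical index: number of G/C bases in each
# dinucleotide (0, 1 or 2). Assumption for illustration only; it is NOT one
# of the indices used in the paper.
TOY_PROPERTY = {
    a + b: float((a in "GC") + (b in "GC"))
    for a, b in product("ACGT", repeat=2)
}


def dac(seq, prop, lag):
    """Auto-covariance of one dinucleotide property at one lag.

    The sequence is turned into a track of property values, one per
    overlapping dinucleotide, and the mean-centred track is correlated
    with itself shifted by `lag` positions. No alignment is needed.
    Characters outside A/C/G/T raise KeyError.
    """
    seq = seq.upper()
    values = [prop[seq[i:i + 2]] for i in range(len(seq) - 1)]
    n_pairs = len(values) - lag  # equals len(seq) - lag - 1
    if lag < 1 or n_pairs < 1:
        raise ValueError("lag must be >= 1 and smaller than len(seq) - 1")
    mean = sum(values) / len(values)
    total = sum(
        (values[i] - mean) * (values[i + lag] - mean)
        for i in range(n_pairs)
    )
    return total / n_pairs


def dac_features(seq, prop, max_lag):
    """Feature vector: one auto-covariance value per lag 1..max_lag."""
    return [dac(seq, prop, lag) for lag in range(1, max_lag + 1)]


if __name__ == "__main__":
    # Hypothetical short sequence, not real HVR1 data.
    print(dac_features("ACGTACGT", TOY_PROPERTY, 3))
```

With several property tables and several lags, each variant maps to a fixed-length numeric vector regardless of sequence length, which is the kind of input a classifier such as the radial basis function network described above would take.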
first_indexed 2024-12-20T13:44:12Z
format Article
id doaj.art-3a03595efd07433c91008df43212501a
institution Directory Open Access Journal
issn 1471-2164
language English
last_indexed 2024-12-20T13:44:12Z
publishDate 2017-12-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-3a03595efd07433c91008df43212501a 2022-12-21T19:38:44Z eng BMC BMC Genomics 1471-2164 2017-12-01 18 S10 33 42 10.1186/s12864-017-4269-2
James Lara, Division of Viral Hepatitis, National Center for HIV, Hepatitis, TB and STD Prevention, Centers for Disease Control and Prevention
Mahder Teka, Division of Viral Hepatitis, National Center for HIV, Hepatitis, TB and STD Prevention, Centers for Disease Control and Prevention
Yury Khudyakov, Division of Viral Hepatitis, National Center for HIV, Hepatitis, TB and STD Prevention, Centers for Disease Control and Prevention
title Identification of recent cases of hepatitis C virus infection using physical-chemical properties of hypervariable region 1 and a radial basis function neural network classifier
url http://link.springer.com/article/10.1186/s12864-017-4269-2