Identification of recent cases of hepatitis C virus infection using physical-chemical properties of hypervariable region 1 and a radial basis function neural network classifier
Abstract Background Identification of acute or recent hepatitis C virus (HCV) infections is important for detecting outbreaks and devising timely public health interventions for interruption of transmission. Epidemiological investigations and chemistry-based laboratory tests are 2 main approaches th...
Main Authors: | James Lara, Mahder Teka, Yury Khudyakov |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2017-12-01 |
Series: | BMC Genomics |
Online Access: | http://link.springer.com/article/10.1186/s12864-017-4269-2 |
_version_ | 1818967149416284160 |
---|---|
author | James Lara, Mahder Teka, Yury Khudyakov |
author_facet | James Lara, Mahder Teka, Yury Khudyakov |
author_sort | James Lara |
collection | DOAJ |
description | Abstract. Background: Identification of acute or recent hepatitis C virus (HCV) infections is important for detecting outbreaks and devising timely public health interventions to interrupt transmission. Epidemiological investigations and chemistry-based laboratory tests are the two main approaches available for identifying acute HCV infection. However, owing to their complexity, neither approach is efficient. Here, we describe a new sequence alignment-free method to discriminate between recent (R) and chronic (C) HCV infection using next-generation sequencing (NGS) data derived from the HCV hypervariable region 1 (HVR1). Results: Using dinucleotide auto correlation (DAC), we identified physical-chemical (PhyChem) features of HVR1 variants. Significant (p < 9.58 × 10⁻⁴) differences in the means and frequency distributions of PhyChem features were found between HVR1 variants sampled from patients with recent vs chronic (R/C) infection. Moreover, the R-associated variants were found to occupy distinct and discrete PhyChem spaces. A radial basis function neural network classifier trained on the PhyChem features of intra-host HVR1 variants accurately classified R/C-HVR1 variants (classification accuracy (CA) = 94.85%; area under the ROC curve (AUROC) = 0.979 in 10-fold cross-validation). The classifier was accurate in assigning individual HVR1 variants to R/C-classes in the testing set (CA = 84.15%; AUROC = 0.912) and in detecting infection duration (R/C-class) in patients (CA = 88.45%). Statistical tests and evaluation of the classifier on randomly labeled datasets indicate that the classifier's CA is robust (p < 0.001) and unlikely to be due to random correlations (CA = 59.04% and AUROC = 0.50). Conclusions: The PhyChem features of intra-host HVR1 variants are strongly associated with the duration of HCV infection. Application of the PhyChem biomarkers to models for detection of the R/C-state of HCV infection in patients offers a new opportunity for detection of outbreaks and for molecular surveillance. The method will be available at https://webappx.cdc.gov/GHOST/ to authenticated users of Global Hepatitis Outbreak and Surveillance Technology (GHOST) for further testing and validation. |
first_indexed | 2024-12-20T13:44:12Z |
format | Article |
id | doaj.art-3a03595efd07433c91008df43212501a |
institution | Directory of Open Access Journals |
issn | 1471-2164 |
language | English |
last_indexed | 2024-12-20T13:44:12Z |
publishDate | 2017-12-01 |
publisher | BMC |
record_format | Article |
series | BMC Genomics |
spelling | doaj.art-3a03595efd07433c91008df43212501a; 2022-12-21T19:38:44Z; eng; BMC; BMC Genomics; 1471-2164; 2017-12-01; 18S103342; 10.1186/s12864-017-4269-2; Identification of recent cases of hepatitis C virus infection using physical-chemical properties of hypervariable region 1 and a radial basis function neural network classifier; James Lara (0), Mahder Teka (1), Yury Khudyakov (2); Division of Viral Hepatitis, National Center for HIV, Hepatitis, TB and STD Prevention, Centers for Disease Control and Prevention (0, 1, 2); http://link.springer.com/article/10.1186/s12864-017-4269-2 |
spellingShingle | James Lara Mahder Teka Yury Khudyakov Identification of recent cases of hepatitis C virus infection using physical-chemical properties of hypervariable region 1 and a radial basis function neural network classifier BMC Genomics |
title | Identification of recent cases of hepatitis C virus infection using physical-chemical properties of hypervariable region 1 and a radial basis function neural network classifier |
title_sort | identification of recent cases of hepatitis c virus infection using physical chemical properties of hypervariable region 1 and a radial basis function neural network classifier |
url | http://link.springer.com/article/10.1186/s12864-017-4269-2 |
work_keys_str_mv | AT jameslara identificationofrecentcasesofhepatitiscvirusinfectionusingphysicalchemicalpropertiesofhypervariableregion1andaradialbasisfunctionneuralnetworkclassifier AT mahderteka identificationofrecentcasesofhepatitiscvirusinfectionusingphysicalchemicalpropertiesofhypervariableregion1andaradialbasisfunctionneuralnetworkclassifier AT yurykhudyakov identificationofrecentcasesofhepatitiscvirusinfectionusingphysicalchemicalpropertiesofhypervariableregion1andaradialbasisfunctionneuralnetworkclassifier |
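
The description above names two computational steps: dinucleotide auto correlation (DAC) features and a radial basis function (RBF) neural network. The sketch below illustrates one common way such quantities are computed; it is not taken from the article. It assumes the widely used autocovariance form of a dinucleotide descriptor (mean-centered property values multiplied at a fixed lag and averaged) and a Gaussian hidden layer. Whether this matches the authors' exact computation is an assumption. `TOY_PROPERTY` is a made-up property table used only so the example runs, and the names `dac`, `dac_features`, `rbf_activations`, and `width` are hypothetical.

```python
from itertools import product
from math import exp

# Hypothetical toy property table: one made-up value per dinucleotide.
# A real analysis would use published physicochemical indices instead.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]
TOY_PROPERTY = {dn: float(i) for i, dn in enumerate(DINUCLEOTIDES)}


def dac(seq, prop, lag):
    """Autocovariance of one dinucleotide property at one lag (assumed form)."""
    seq = seq.upper()
    # Property value of each overlapping dinucleotide along the sequence.
    values = [prop[seq[i:i + 2]] for i in range(len(seq) - 1)]
    n_pairs = len(values) - lag
    if lag < 1 or n_pairs < 1:
        raise ValueError("lag must be >= 1 and smaller than the dinucleotide count")
    mean = sum(values) / len(values)
    # Average product of mean-centered values that are `lag` positions apart.
    total = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n_pairs))
    return total / n_pairs


def dac_features(seq, prop, max_lag):
    """Alignment-free feature vector: one value per lag from 1 to max_lag."""
    return [dac(seq, prop, lag) for lag in range(1, max_lag + 1)]


def rbf_activations(x, centers, width):
    """Gaussian hidden-unit outputs of an RBF network for feature vector x."""
    out = []
    for c in centers:
        d2 = sum((a - b) ** 2 for a, b in zip(x, c))
        out.append(exp(-d2 / (2.0 * width ** 2)))
    return out
```

In a full classifier, the hidden-unit outputs would feed a trained output layer, and the centers and width would be fitted to training data; both steps are omitted here.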