Information Entropy of Influenza A Segment 7

Information entropy (H) is a measure of uncertainty at each position within a sequence of nucleotides. H was used to characterize a set of influenza A segment 7 nucleotide sequences. Nucleotide locations of high entropy were identified near the 5’ start of all of the sequences and the sequenc...


Bibliographic Details
Main Authors: Joel K. Weltman, Shaohua Fan, William A. Thompson
Format: Article
Language: English
Published: MDPI AG 2008-11-01
Series: Entropy
Subjects: Influenza; information entropy; segment 7; subtypes; hosts; synonymous mutations
Online Access: http://www.mdpi.com/1099-4300/10/4/736/
_version_ 1798005755861794816
author Joel K. Weltman
Shaohua Fan
William A. Thompson
author_facet Joel K. Weltman
Shaohua Fan
William A. Thompson
author_sort Joel K. Weltman
collection DOAJ
description Information entropy (H) is a measure of the uncertainty at each position within a sequence of nucleotides. H was used to characterize a set of influenza A segment 7 nucleotide sequences. Nucleotide positions of high entropy were identified near the 5’ start of all of the sequences, and the sequences were assigned to subsets according to the synonymous nucleotide variants at those positions: uracil (U6) or cytosine (C6) at position 6, adenine (A12) or guanine (G12) at position 12, and adenine (A15) or cytosine (C15) at position 15. For the subset pair at each of these positions, H values were correlated (Kendall tau) along the length of the segment. However, the H values of each subset were statistically distinguishable from those of the other member of the pair (Kolmogorov-Smirnov test). The joint probability of uncorrelated distributions of U6 and C6 sequences to viral subtypes and to viral host species was 34 times greater than for the A12:G12 subset pair and 214 times greater than for the A15:C15 pair. This result indicates that the high-entropy position 6 of segment 7 is either a reporter or a sentinel location. The fact that none of the H5N1 sequences in the dataset was a member of the C6 subset, whereas all 125 H5N1 sequences were members of the U6 subset, suggests a non-random sentinel function.
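The per-position entropy and the subset assignment described in the abstract can be illustrated with a minimal sketch. It assumes equal-length, already aligned sequences and Shannon entropy in bits; the record does not state the logarithm base or the software the authors used. The function names and the toy sequences are hypothetical and are not taken from the paper's dataset.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the symbols observed at one alignment position."""
    counts = Counter(column)
    total = sum(counts.values())
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(sequences):
    """Entropy at every position of equal-length, aligned sequences."""
    return [column_entropy(col) for col in zip(*sequences)]


def split_by_variant(sequences, position, variants):
    """Group sequences by the nucleotide found at a 1-based position."""
    subsets = {v: [] for v in variants}
    for seq in sequences:
        base = seq[position - 1]
        if base in subsets:
            subsets[base].append(seq)
    return subsets


# Toy aligned RNA fragments (hypothetical, for illustration only).
toy = ["AUGAGUCUUC", "AUGAGCCUUC", "AUGAGUCUAC", "AUGAGCCUUC"]

# Position 6 is split evenly between U and C, so its entropy is 1 bit.
profile = entropy_profile(toy)

# Analogue of the U6 and C6 subsets, each with its own entropy profile.
subsets = split_by_variant(toy, 6, "UC")
u6_profile = entropy_profile(subsets["U"])
c6_profile = entropy_profile(subsets["C"])
```

Two such subset profiles could then be compared with a rank correlation and a two-sample distribution test, for example `scipy.stats.kendalltau` and `scipy.stats.ks_2samp`; whether the authors used those particular implementations is not stated in this record.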
first_indexed 2024-04-11T12:44:08Z
format Article
id doaj.art-52cb8498c3a74a4287da22c381ebf842
institution Directory Open Access Journal
issn 1099-4300
language English
last_indexed 2024-04-11T12:44:08Z
publishDate 2008-11-01
publisher MDPI AG
record_format Article
series Entropy
spelling doaj.art-52cb8498c3a74a4287da22c381ebf842; 2022-12-22T04:23:24Z; eng; MDPI AG; Entropy; 1099-4300; 2008-11-01; 10; 4; 736; 744; 10.3390/e10040736; Information Entropy of Influenza A Segment 7; Joel K. Weltman; Shaohua Fan; William A. Thompson; http://www.mdpi.com/1099-4300/10/4/736/; Influenza; information entropy; segment 7; subtypes; hosts; synonymous mutations
title Information Entropy of Influenza A Segment 7
topic Influenza
information entropy
segment 7
subtypes
hosts
synonymous mutations
url http://www.mdpi.com/1099-4300/10/4/736/
work_keys_str_mv AT joelkweltman informationentropyofinfluenzaasegment7
AT shaohuafan informationentropyofinfluenzaasegment7
AT williamathompson informationentropyofinfluenzaasegment7