On Correlation Polynomials and Subword Complexity

We consider words with letters from a $q-ary$ alphabet $\mathcal{A}$. The kth subword complexity of a word $w ∈\mathcal{A}^*$ is the number of distinct subwords of length $k$ that appear as contiguous subwords of $w$. We analyze subword complexity from both combinatorial and probabilistic viewpoints...

Full description

Bibliographic Details
Main Authors: Irina Gheorghiciuc, Mark Daniel Ward
Format: Article
Language:English
Published: Discrete Mathematics & Theoretical Computer Science 2007-01-01
Series:Discrete Mathematics & Theoretical Computer Science
Subjects:
Online Access:https://dmtcs.episciences.org/3553/pdf
_version_ 1827324073508601856
author Irina Gheorghiciuc
Mark Daniel Ward
author_facet Irina Gheorghiciuc
Mark Daniel Ward
author_sort Irina Gheorghiciuc
collection DOAJ
description We consider words with letters from a $q-ary$ alphabet $\mathcal{A}$. The kth subword complexity of a word $w ∈\mathcal{A}^*$ is the number of distinct subwords of length $k$ that appear as contiguous subwords of $w$. We analyze subword complexity from both combinatorial and probabilistic viewpoints. Our first main result is a precise analysis of the expected $kth$ subword complexity of a randomly-chosen word $w ∈\mathcal{A}^n$. Our other main result describes, for $w ∈\mathcal{A}^*$, the degree to which one understands the set of all subwords of $w$, provided that one knows only the set of all subwords of some particular length $k$. Our methods rely upon a precise characterization of overlaps between words of length $k$. We use three kinds of correlation polynomials of words of length $k$: unweighted correlation polynomials; correlation polynomials associated to a Bernoulli source; and generalized multivariate correlation polynomials. We survey previously-known results about such polynomials, and we also present some new results concerning correlation polynomials.
first_indexed 2024-04-25T02:04:02Z
format Article
id doaj.art-6ffb37e7bdc542479f4a7b318292dbca
institution Directory Open Access Journal
issn 1365-8050
language English
last_indexed 2024-04-25T02:04:02Z
publishDate 2007-01-01
publisher Discrete Mathematics & Theoretical Computer Science
record_format Article
series Discrete Mathematics & Theoretical Computer Science
spelling doaj.art-6ffb37e7bdc542479f4a7b318292dbca2024-03-07T14:34:52ZengDiscrete Mathematics & Theoretical Computer ScienceDiscrete Mathematics & Theoretical Computer Science1365-80502007-01-01DMTCS Proceedings vol. AH,...Proceedings10.46298/dmtcs.35533553On Correlation Polynomials and Subword ComplexityIrina Gheorghiciuc0Mark Daniel Ward1Department of Mathematical SciencesDepartment of Mathematics [Philadelphia]We consider words with letters from a $q-ary$ alphabet $\mathcal{A}$. The kth subword complexity of a word $w ∈\mathcal{A}^*$ is the number of distinct subwords of length $k$ that appear as contiguous subwords of $w$. We analyze subword complexity from both combinatorial and probabilistic viewpoints. Our first main result is a precise analysis of the expected $kth$ subword complexity of a randomly-chosen word $w ∈\mathcal{A}^n$. Our other main result describes, for $w ∈\mathcal{A}^*$, the degree to which one understands the set of all subwords of $w$, provided that one knows only the set of all subwords of some particular length $k$. Our methods rely upon a precise characterization of overlaps between words of length $k$. We use three kinds of correlation polynomials of words of length $k$: unweighted correlation polynomials; correlation polynomials associated to a Bernoulli source; and generalized multivariate correlation polynomials. We survey previously-known results about such polynomials, and we also present some new results concerning correlation polynomials.https://dmtcs.episciences.org/3553/pdfcombinatorics on wordsaverage-case analysisautocorrelationasymptoticsanalytic methodscorrelation polynomialde \\bruijn graphdepthsubword complexitysuffix trees.[info.info-ds] computer science [cs]/data structures and algorithms [cs.ds][info.info-dm] computer science [cs]/discrete mathematics [cs.dm][math.math-co] mathematics [math]/combinatorics [math.co][info.info-cg] computer science [cs]/computational geometry [cs.cg]
spellingShingle Irina Gheorghiciuc
Mark Daniel Ward
On Correlation Polynomials and Subword Complexity
Discrete Mathematics & Theoretical Computer Science
combinatorics on words
average-case analysis
autocorrelation
asymptotics
analytic methods
correlation polynomial
de \\bruijn graph
depth
subword complexity
suffix trees.
[info.info-ds] computer science [cs]/data structures and algorithms [cs.ds]
[info.info-dm] computer science [cs]/discrete mathematics [cs.dm]
[math.math-co] mathematics [math]/combinatorics [math.co]
[info.info-cg] computer science [cs]/computational geometry [cs.cg]
title On Correlation Polynomials and Subword Complexity
title_full On Correlation Polynomials and Subword Complexity
title_fullStr On Correlation Polynomials and Subword Complexity
title_full_unstemmed On Correlation Polynomials and Subword Complexity
title_short On Correlation Polynomials and Subword Complexity
title_sort on correlation polynomials and subword complexity
topic combinatorics on words
average-case analysis
autocorrelation
asymptotics
analytic methods
correlation polynomial
de \\bruijn graph
depth
subword complexity
suffix trees.
[info.info-ds] computer science [cs]/data structures and algorithms [cs.ds]
[info.info-dm] computer science [cs]/discrete mathematics [cs.dm]
[math.math-co] mathematics [math]/combinatorics [math.co]
[info.info-cg] computer science [cs]/computational geometry [cs.cg]
url https://dmtcs.episciences.org/3553/pdf
work_keys_str_mv AT irinagheorghiciuc oncorrelationpolynomialsandsubwordcomplexity
AT markdanielward oncorrelationpolynomialsandsubwordcomplexity