Enumeration of Binary Trees and Universal Types

Binary unlabeled ordered trees (further called binary trees) were studied at least since Euler, who enumerated them. The number of such trees with n nodes is now known as the Catalan number. Over the years various interesting questions about the statistics of such trees were investigated (e.g.,...

Full description

Bibliographic Details
Main Authors: Charles Knessl, Wojciech Szpankowski
Format: Article
Language:English
Published: Discrete Mathematics & Theoretical Computer Science 2005-12-01
Series:Discrete Mathematics & Theoretical Computer Science
Online Access:http://www.dmtcs.org/dmtcs-ojs/index.php/dmtcs/article/view/607
_version_ 1818211335003963392
author Charles Knessl
Wojciech Szpankowski
author_facet Charles Knessl
Wojciech Szpankowski
author_sort Charles Knessl
collection DOAJ
description Binary unlabeled ordered trees (further called binary trees) were studied at least since Euler, who enumerated them. The number of such trees with n nodes is now known as the Catalan number. Over the years various interesting questions about the statistics of such trees were investigated (e.g., height and path length distributions for a randomly selected tree). Binary trees find an abundance of applications in computer science. However, recently Seroussi posed a new and interesting problem motivated by information theory considerations: how many binary trees of a given path length (sum of depths) are there? This question arose in the study of universal types of sequences. Two sequences of length p have the same universal type if they generate the same set of phrases in the incremental parsing of the Lempel-Ziv'78 scheme since one proves that such sequences converge to the same empirical distribution. It turns out that the number of distinct types of sequences of length p corresponds to the number of binary (unlabeled and ordered) trees, T p, of given path length p (and also the number of distinct Lempel-Ziv'78 parsings of length p sequences). We first show that the number of binary trees with given path length p is asymptotically equal to T p 2 2p/(log 2 p)(1+O(log-2/3 p)). Then we establish various limiting distributions for the number of nodes (number of phrases in the Lempel-Ziv'78 scheme) when a tree is selected randomly among all trees of given path length p. Throughout, we use methods of analytic algorithmics such as generating functions and complex asymptotics, as well as methods of applied mathematics such as the WKB method and matched asymptotics.
first_indexed 2024-12-12T05:30:51Z
format Article
id doaj.art-73981afcceb54722a995c86505ea86ed
institution Directory Open Access Journal
issn 1462-7264
1365-8050
language English
last_indexed 2024-12-12T05:30:51Z
publishDate 2005-12-01
publisher Discrete Mathematics & Theoretical Computer Science
record_format Article
series Discrete Mathematics & Theoretical Computer Science
spelling doaj.art-73981afcceb54722a995c86505ea86ed2022-12-22T00:36:19ZengDiscrete Mathematics & Theoretical Computer ScienceDiscrete Mathematics & Theoretical Computer Science1462-72641365-80502005-12-0171Enumeration of Binary Trees and Universal TypesCharles KnesslWojciech SzpankowskiBinary unlabeled ordered trees (further called binary trees) were studied at least since Euler, who enumerated them. The number of such trees with n nodes is now known as the Catalan number. Over the years various interesting questions about the statistics of such trees were investigated (e.g., height and path length distributions for a randomly selected tree). Binary trees find an abundance of applications in computer science. However, recently Seroussi posed a new and interesting problem motivated by information theory considerations: how many binary trees of a given path length (sum of depths) are there? This question arose in the study of universal types of sequences. Two sequences of length p have the same universal type if they generate the same set of phrases in the incremental parsing of the Lempel-Ziv'78 scheme since one proves that such sequences converge to the same empirical distribution. It turns out that the number of distinct types of sequences of length p corresponds to the number of binary (unlabeled and ordered) trees, T p, of given path length p (and also the number of distinct Lempel-Ziv'78 parsings of length p sequences). We first show that the number of binary trees with given path length p is asymptotically equal to T p 2 2p/(log 2 p)(1+O(log-2/3 p)). Then we establish various limiting distributions for the number of nodes (number of phrases in the Lempel-Ziv'78 scheme) when a tree is selected randomly among all trees of given path length p. Throughout, we use methods of analytic algorithmics such as generating functions and complex asymptotics, as well as methods of applied mathematics such as the WKB method and matched asymptotics.http://www.dmtcs.org/dmtcs-ojs/index.php/dmtcs/article/view/607
spellingShingle Charles Knessl
Wojciech Szpankowski
Enumeration of Binary Trees and Universal Types
Discrete Mathematics & Theoretical Computer Science
title Enumeration of Binary Trees and Universal Types
title_full Enumeration of Binary Trees and Universal Types
title_fullStr Enumeration of Binary Trees and Universal Types
title_full_unstemmed Enumeration of Binary Trees and Universal Types
title_short Enumeration of Binary Trees and Universal Types
title_sort enumeration of binary trees and universal types
url http://www.dmtcs.org/dmtcs-ojs/index.php/dmtcs/article/view/607
work_keys_str_mv AT charlesknessl enumerationofbinarytreesanduniversaltypes
AT wojciechszpankowski enumerationofbinarytreesanduniversaltypes