Improved Bounds on the Costs of Optimal and Balanced Binary Search Trees
A binary search tree can be used to store data in a computer system for retrieval by name. Different elements in the tree may be referenced with different probabilities. If we define the cost of the tree as the average number of elements which must be examined in searching for an element, then diffe...
Main Author: | |
---|---|
Other Authors: | |
Published: |
2023
|
Online Access: | https://hdl.handle.net/1721.1/148897 |
_version_ | 1811097344450494464 |
---|---|
author | Bayer, Paul J. |
author2 | Rivest, Ronald L. |
author_facet | Rivest, Ronald L. Bayer, Paul J. |
author_sort | Bayer, Paul J. |
collection | MIT |
description | A binary search tree can be used to store data in a computer system for retrieval by name. Different elements in the tree may be referenced with different probabilities. If we define the cost of the tree as the average number of elements which must be examined in searching for an element, then different trees have different costs. We show that two particular types of trees, weight balanced trees and min-max trees, which are easily constructed from the probability distribution on the elements, are close to optimal. Specifically, we show that for any probability distribution with entropy H, H-log2H-(log2e-1)<=Copt<= Cwb ,+ H+2/Cmm,+H+2 where Copt, Cwb, and Cmm are the optimal, weigh balances, and min-max costs. We gain some added insight by deriving an expression for the expected value of the entropy of a random probability distribution. |
first_indexed | 2024-09-23T16:58:06Z |
id | mit-1721.1/148897 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T16:58:06Z |
publishDate | 2023 |
record_format | dspace |
spelling | mit-1721.1/1488972023-03-30T03:02:03Z Improved Bounds on the Costs of Optimal and Balanced Binary Search Trees Bayer, Paul J. Rivest, Ronald L. A binary search tree can be used to store data in a computer system for retrieval by name. Different elements in the tree may be referenced with different probabilities. If we define the cost of the tree as the average number of elements which must be examined in searching for an element, then different trees have different costs. We show that two particular types of trees, weight balanced trees and min-max trees, which are easily constructed from the probability distribution on the elements, are close to optimal. Specifically, we show that for any probability distribution with entropy H, H-log2H-(log2e-1)<=Copt<= Cwb ,+ H+2/Cmm,+H+2 where Copt, Cwb, and Cmm are the optimal, weigh balances, and min-max costs. We gain some added insight by deriving an expression for the expected value of the entropy of a random probability distribution. 2023-03-29T14:06:24Z 2023-03-29T14:06:24Z 1975-11 https://hdl.handle.net/1721.1/148897 02191962 MIT-LCS-TM-069 MAC-TM-69 application/pdf |
spellingShingle | Bayer, Paul J. Improved Bounds on the Costs of Optimal and Balanced Binary Search Trees |
title | Improved Bounds on the Costs of Optimal and Balanced Binary Search Trees |
title_full | Improved Bounds on the Costs of Optimal and Balanced Binary Search Trees |
title_fullStr | Improved Bounds on the Costs of Optimal and Balanced Binary Search Trees |
title_full_unstemmed | Improved Bounds on the Costs of Optimal and Balanced Binary Search Trees |
title_short | Improved Bounds on the Costs of Optimal and Balanced Binary Search Trees |
title_sort | improved bounds on the costs of optimal and balanced binary search trees |
url | https://hdl.handle.net/1721.1/148897 |
work_keys_str_mv | AT bayerpaulj improvedboundsonthecostsofoptimalandbalancedbinarysearchtrees |