Binary tree


In computer science, a binary tree is a tree data structure in which each node has at most two children, referred to as the left child and the right child. That is, it is a k-ary tree where. A recursive definition using set theory is that a binary tree is a triple, where L and R are binary trees or the empty set and S is a singleton containing the root.
From a graph theory perspective, binary trees as defined here are arborescences. A binary tree may thus be also called a bifurcating arborescence, a term which appears in some early programming books before the modern computer science terminology prevailed. It is also possible to interpret a binary tree as an undirected, rather than directed graph, in which case a binary tree is an ordered, rooted tree. Some authors use rooted binary tree instead of binary tree to emphasize the fact that the tree is rooted, but as defined above, a binary tree is always rooted.
In mathematics, what is termed binary tree can vary significantly from author to author. Some use the definition commonly used in computer science, but others define it as every non-leaf having exactly two children and don't necessarily label the children as left and right either.
In computing, binary trees can be used in two very different ways:
  • First, as a means of accessing nodes based on some value or label associated with each node. Binary trees labelled this way are used to implement binary search trees and binary heaps, and are used for efficient searching and sorting. The designation of non-root nodes as left or right child even when there is only one child present matters in some of these applications, in particular, it is significant in binary search trees. However, the arrangement of particular nodes into the tree is not part of the conceptual information. For example, in a normal binary search tree the placement of nodes depends almost entirely on the order in which they were added, and can be re-arranged without changing the meaning.
  • Second, as a representation of data with a relevant bifurcating structure. In such cases, the particular arrangement of nodes under and/or to the left or right of other nodes is part of the information. Common examples occur with Huffman coding and cladograms. The everyday division of documents into chapters, sections, paragraphs, and so on is an analogous example with n-ary rather than binary trees.

    Definitions

Recursive definition

Recursive full tree definition

A simple, informal approach to describe a binary tree might be as follows:
More formally:
  • There exists a full tree consisting of a single node;
  • If T1 and T2 are full binary trees, which do not share any node, and r is a node not belonging to T1 or T2, then the ordered triple is a full binary tree.
This definition implies two limitations: a full binary tree contains at least one node, and no node can have only one child. This is resolved with the next definition.

Recursive extended tree definition

The extended tree definition starts with the assumption the tree can be empty.
  • An empty set of nodes is an extended binary tree.
  • If T1 and T2 are extended binary trees, which do not share any node, and r is a node not belonging to any of them, then the ordered triple is an extended binary tree.
To be complete from the graph point of view, both definitions of a tree should be extended by a definition of the corresponding branches sets. Informally, the branches set can be described as a set of all ordered pairs of nodes where is a root node of any subtree appearing in the definition, and is a root node of any of its T1 and T2 subtrees.
Another way of imagining this construction is to consider instead of the empty set a different type of node—for instance square nodes if the regular ones are circles.

Using graph theory concepts

A binary tree is a rooted tree that is also an ordered tree in which every node has at most two children. A rooted tree naturally imparts a notion of levels ; thus, for every node, a notion of children may be defined as the nodes connected to it a level below. Ordering of these children makes it possible to distinguish a left child from a right child. But this still does not distinguish between a node with left but not a right child from a node with right but no left child.
The necessary distinction can be made by first partitioning the edges; i.e., defining the binary tree as triplet, where is a rooted tree and E1 ∩ E2 is empty, and also requiring that for all j ∈, every node has at most one Ej child. A more informal way of making the distinction is to say, quoting the Encyclopedia of Mathematics, that "every node has a left child, a right child, neither, or both" and to specify that these "are all different" binary trees.

Types of binary trees

Tree terminology is not well-standardized and therefore may vary among examples in the available literature.
  • A ' binary tree has a root node and every node has at most two children.
  • A ' binary tree is a tree in which every node has either 0 or 2 children. Another way of defining a full binary tree is a recursive definition. A full binary tree is either:
  • * A single vertex.
  • * A tree whose root node has two subtrees, both of which are full binary trees.
  • A ' binary tree is a binary tree in which all interior nodes have two children and all leaves have the same depth or same level. A perfect binary tree is a full binary tree.
  • A ' binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible. It can have between 1 and 2h nodes at the last level h. A perfect tree is therefore always complete but a complete tree is not always perfect. Some authors use the term complete to refer instead to a perfect binary tree as defined above, in which case they call this type of tree an almost complete binary tree or nearly complete binary tree. A complete binary tree can be efficiently represented using an array.
  • The infinite complete binary tree is a tree with levels, where for each level d the number of existing nodes at level d is equal to 2d. The cardinal number of the set of all levels is . The cardinal number of the set of all paths is uncountable, having the cardinality of the continuum.
  • A balanced binary tree is a binary tree structure in which the left and right subtrees of every node differ in height by no more than 1. One may also consider binary trees where no leaf is much farther away from the root than any other leaf.
  • A degenerate tree is where each parent node has only one associated child node. This means that the tree will behave like a linked list data structure. In this case, an advantage of using a binary tree is significantly reduced because it is essentially a linked list which time complexity is O and it has more data space than the linked list due to two pointers per node, while the complexity of for data search in a balanced binary tree is normally expected.

    Properties of binary trees

  • The number of nodes in a full binary tree is at least and at most , where is the height of the tree. A tree consisting of only a root node has a height of 0. The least number of nodes is obtained by adding only two children nodes per adding height so . The maximum number of nodes is obtained by fully filling nodes at each level, i.e., it is a perfect tree. For a perfect tree, the number of nodes is, where the last equality is from the geometric series sum.
  • The number of leaf nodes in a perfect binary tree is because and the number of leaves is so. It also means that. In terms of the tree height,.
  • For any non-empty binary tree with leaf nodes and nodes of degree 2,. The proof is the following. For a perfect binary tree, the total number of nodes is and, so. To make a full binary tree from a perfect binary tree, a pair of two sibling nodes are removed one by one. This results in "two leaf nodes removed" and "one internal node removed" and "the removed internal node becoming a leaf node", so one leaf node and one internal node is removed per removing two sibling nodes. As a result, also holds for a full binary tree. To make a binary tree with a leaf node without its sibling, a single leaf node is removed from a full binary tree, then "one leaf node removed" and "one internal nodes with two children removed" so also holds. This relation now covers all non-empty binary trees.
  • With given nodes, the minimum possible tree height is with which the tree is a balanced full tree or perfect tree. With a given height, the number of nodes can't exceed the as the number of nodes in a perfect tree. Thus.
  • A binary tree with leaves has at least the height. With a given height, the number of leaves at that height can't exceed as the number of leaves at the height in a perfect tree. Thus.
  • In a non-empty binary tree, if is the total number of nodes and is the total number of edges, then. This is obvious because each node requires one edge except for the root node.
  • The number of null links in a binary tree of nodes is.
  • The number of internal nodes in a complete binary tree of nodes is.

    Combinatorics

In combinatorics, one considers the problem of counting the number of full binary trees of a given size. Here the trees have no values attached to their nodes, and trees are distinguished only by their structure; however, the left and right child of any node are distinguished. The size of the tree is taken to be the number n of internal nodes ; the other nodes are leaf nodes and there are of them. The number of such binary trees of size n is equal to the number of ways of fully parenthesizing a string of symbols separated by n binary operators, to determine the argument subexpressions of each operator. For instance for one has to parenthesize a string like, which is possible in five ways:
The correspondence to binary trees should be obvious, and the addition of redundant parentheses is disallowed.
There is a unique binary tree of size 0, and any other binary tree is characterized by the pair of its left and right children; if these have sizes i and j respectively, the full tree has size. Therefore, the number of binary trees of size n has the following recursive description, and for any positive integer n. It follows that is the Catalan number of index n.
The above parenthesized strings should not be confused with the set of words of length 2n in the Dyck language, which consist only of parentheses in such a way that they are properly balanced. The number of such strings satisfies the same recursive description ; this number is therefore also the Catalan number. So there are also five Dyck words of length 6:
These Dyck words do not correspond to binary trees in the same way. Instead, they are related by the following recursively defined bijection: the Dyck word equal to the empty string corresponds to the binary tree of size 0 with only one leaf. Any other Dyck word can be written as, where, are themselves Dyck words and where the two written parentheses are matched. The bijection is then defined by letting the words and correspond to the binary trees that are the left and right children of the root.
A bijective correspondence can also be defined as follows: enclose the Dyck word in an extra pair of parentheses, so that the result can be interpreted as a Lisp list expression ; then the dotted-pair expression for that proper list is a fully parenthesized expression describing the corresponding binary tree.
The ability to represent binary trees as strings of symbols and parentheses implies that binary trees can represent the elements of a free magma on a singleton set.