Describing full binary trees with a context-free language

Today I describe a type of graph (full binary tree) using formal language theory, and in particular, a context-free language. The purpose of this exercise is to u

Let's recall some graph theory. A binary tree is a directed acyclic graph (DAG) where each node has no more than two children. A full (or proper) binary tree is a tree where each non-terminal node has two children. In other words, each node either has 0 (the terminals or leafs) or two children. The node which is not a child of any other node is referred to as the root. Above you can see an example of a full binary tree that was created in Matlab.
Another notion that we need that comes from graph theory (and like binary trees is used a lot in data structures in computer science) is the preorder traversal. A traversal is a particular way of visiting the nodes of a tree. And the preorder traversal is when one begins with the root of the tree and then moves downward from left-to-right. A sample implementation of this is:
preorder(node)But this can be clearly seen in the following figure.
print node.value
if node.left ≠ null then preorder(node.left)
if node.right ≠ null then preorder(node.right)
And now let's recall some formal language theory. A formal language is set of strings (finite or infinite) of symbols with precise rules (often recursive) for generating those strings. A context-free language is a particular type of formal language. Formally, a context-free language (CFG) is a 4-tuple (V, Σ, R, S) where:
- V is a finite set called the variables ;
- Σ is a finite set, V ∩ Σ = ∅, called the terminals;
- R is a finite set of productions (or rules), with each production being composed of a variable and a string of variables and terminals, i.e. α → A, where A ∈ (V∪Σ)*, where '*' is Kleene (or star) operation: for any set A, A* = {x1x2…xk| k≥0 ∧ ∀i≤k xi∈A} ;
- S ∈ V is the start variable.
GFBT={{α},{L, R}, R, α},
where the productions R are given by (where '|' represents the disjunction) and ε represents the empty string:α → LαR|LRα|LαRα|ε.
Following Chomsky and Miller ["Introduction to the formal analysis of natural languages", Handbook of Mathematical Psychology: Volume II, 1963, pp. 290, 293] we can refer to the first production as left-recursive, the second as right-recursive, and the third as self-embedding. Naturally we can think of the 'L' symbol as representing left and the 'R' symbol representing right.
At a later point I will tie this into a particular application in mathematical psychology, viz. with multinomial processing tree models.
Following Chomsky and Miller ["Introduction to the formal analysis of natural languages", Handbook of Mathematical Psychology: Volume II, 1963, pp. 290, 293] we can refer to the first production as left-recursive, the second as right-recursive, and the third as self-embedding. Naturally we can think of the 'L' symbol as representing left and the 'R' symbol representing right.
In order to see how this works, let's use this grammar's productions to generate the string the represents the above tree (the same one we used to demonstrate the preorder traversal). Note that since the language is context-free, we don't have to replace alpha with the same string each time:
And so now we can use this language to generate a string that represents any full binary tree with the preorder traversal. α → LαRα →LLRRα →LLRRLαRα→LLRRLLRRLR.
At a later point I will tie this into a particular application in mathematical psychology, viz. with multinomial processing tree models.
0 Comments:
Post a Comment
Subscribe to Post Comments [Atom]
<< Home