Representation of Program Text

Program text is accessible at the meta level in the form of parse tree. The parse tree is represented by a Ptree metaobject. It is implemented as a nested linked-list of lexical tokens --- the S expressions in the Lisp terminology. For example, this piece of code:

int a = b + c * 2;

is parsed into:

[[static] [int] [[a = [b + [c * 2]]]] ;]

Here, [] denotes a linked list. Note that operators such as = and + make sublists. The sublists and their elements (that is, lexical tokens such as a and =) are also represented by Ptree metaobjects.

Public Members

Basic Operations

To manipulate linked lists, the MOP provides many static member functions on Ptree, which are familiar to Lisp programmers:

Furthermore, the following member functions are available on Ptree metaobjects:

The parse tree is basically a long list of the lexical tokens that appear in the program although some of them are grouped into sublists. The order of the elements of that list is the same as the order in which the lexical tokens appear. But if some fields such as the type field are omitted in the program, then nil is inserted at those places. For example, if the return type of a function declaration is omitted as follows:

main(int argc, char** argv){ }

then nil list is inserted at the head of the list:

[nil nil [main ( [[[int] [argc]] , [[char] [* * argv]]] )] [{

Since the function body is also omitted, nil list is inserted between { and }.


Programmers can make Ptree metaobjects. Because the MOP provides a conservative garbage collector, they don't need to care about deallocation of the metaobjects. The next static member functions on Ptree are used to make a Ptree metaobjects.

The Ptree metaobject returned by Make() is not a real parse tree.(At least, for the time being.)It is just a unparsed chunk of characters. Although programmers can use Ptree metaobjects generated by Make() as they use other Ptree metaobjects, the structure of those metaobjects does not reflect the code they represent.

Using Make(), programmers can easily generate any piece of code to substitute for part of the original source code. For example, suppose array_name is xpos and offset is 3. The following function call:

Ptree::Make("%p[%d]", array_name, offset)

makes a Ptree metaobject that represents:


%p simply expand a given Ptree metaobject as a character string. Thus programmers may write something like:

Ptree::Make("char* GetName(){ return \"%p\"; }",

Note that a double quote " must be escaped by a backslash \ in a C++ string. \"%p\" makes a string literal. The function call above generates the code below:

char* GetName(){ return "xpos"; }

Although Make() follows the old printf() style, programmers can also use a more convenient style similar to Lisp's backquote notation. For example,

Ptree::Make("%p[%d]", array_name, offset)

The expression above can be rewritten using qMake() as follows:


Note that the ``backquoted'' C++ expressions array_name and offset are directly embedded in the C++ string. Their occurrence are replaced with the value of the expression. This replacement cannot be implemented in regular C++. It is implemented by the metaclass for Ptree.

Except the difference in the notation, qMake() is equivalent to Make(). Programmers can choose either one they prefer at any place.

Pattern Matching

The MOP provides a static member function on Ptree metaobjects for pattern matching.

For example, the function Match() is used as follows:

if(Ptree::Match(expr, "[%? + %?]", &lexpr, &rexpr))
    cout << "this is an addition.";
else if(Ptree::Match(expr, "[%? - %?]", &lexpr, &rexpr))
    cout << "this is a subtraction.";
    cout << "unknown";

The pattern [%? + %?] matches a linked list that consists of three elements if the second one is +. If an expression expr matches the pattern, lexpr gets bound to the first element of expr and rexpr gets bound to the third element.

The pattern is a null-terminated string. Since Match() does not understand the C++ grammar, lexical tokens appearing in the pattern must be separated by a white space. For example, a pattern a+b is regarded as a single token. The pattern is constructed by these rules:

  1. A word (characters terminated by a white space) is a pattern that matches a lexical token.
  2. %[, %], and %% are patterns that match [, ], and %.
  3. [] is a pattern that matches a null list (nil).
  4. [pat1 pat2 ... ] is a pattern that matches a list of pat1, pat2, ...
  5. %* is a pattern that matches any token or list.
  6. %? is a pattern that matches any token or list. The matched token or list is bound to sublist.
  7. %_ is a pattern that matches the rest of the list (the cdr part).
  8. %r is a pattern that matches the rest of the list. The matched list is bound to sublist.

Reifying Program Text

If a Ptree metaobject represents a literal such as an integer constant and a string literal, we can obtain the value denoted by the literal.

Note: the character string returned by Reify() is allocated in the heap area. However, because the MOP provides a conservative garbage collector, programmers do not need to deallocate the string by themselves.

Support Classes

The MOP provides two support classes PtreeIter and PtreeArray to help programmers to deal with Ptree objects. PtreeIter is useful to perform iteration on a list of Ptree objects. Suppose that expr is a list:

PtreeIter next(expr);
Ptree* p;
while((p = next()) != nil){
    // compute on p

Each element of expr is bound to p one at a time. The operator () on PtreeIter objects returns the next element. Programmers may call Pop() instead of the operator (). Since the two functions are equivalent, the program above can be rewritten to be:

PtreeIter next(expr);
Ptree* p;
while((p = next.Pop()) != nil){
    // compute on p

If the reader prefers the for-loop style, she may also say:

for(PtreeIter i = expr; !i.Empty(); i++){
    // compute on *i

Although this interface is slightly slower, it distinguishes the end of the list and a nil element. If expr includes nil, Pop() cannot correctly detect the end of the list.

Another support class is PtreeArray for dealing with an unbounded array of Ptree objects. It is used as follows (suppose that expr is a Ptree object):

PtreeArray a;            // allocate an array
a.Append(expr);          // append expr to the end of the array
Ptree* p = a[0];         // get the first element
Ptree* p2 = a.Ref(0);    // same as a[0]
int n = a.Number();      // get the number of elements
Ptree* lst = a.All();    // get a list of all the elements
a.Clear();               // make the array empty

[First | Prev | Next]