Subject: One more
From: Andrei Alexandrescu (andrei@cs.washington.edu)
Date: Sat Oct 06 2001 - 11:59:28 PDT
Sorry for the spam, but something occured to me just now. I guess somehow
I'm rehashing the obvious, but I'm not sure. So here goes.
It occured to me that a regular expression is a type. That is, you can
express any regular expression (and thus the associated grammar) as a type.
Consider the following parameterized types. In parenthesis I specify the
values or the types with which the types are parameterized (don't confuse
those with value constructors, they parameterize the types, not the value of
the type). They are all static subclasses of the class RegularExpression.
Literal[char] (one class for each character)
Range[char, char]
Concat[RegularExpression, RegularExpression]
Or[RegularExpression]
Not[RegularExpression]
Optional[RegularExpression]
ZeroOrMore[RegularExpression]
Optional
Now, given these basic parameterized types, any regular expression is a type
constructed with them. For example:
* Simple real numbers: "[+\-][0-9].[0-9]" ->
Concat[Concat[Concat[Or[Literal['+'], Literal['-']], Range['0', '9']], '.'],
Range['0', '9']]
The banana above is a type that contains all information about the regexp
encoded inside of it. If it's a type, it means that type-specific anaylisis
can be applied to it. In particular, building a lexer this way is different
from the classic (automata-driven) solution. What's the advantage of doing
that, I don't know. I'm sure the approach is sound, but what I don't know is
whether it's interesting.
Now context-free grammars are types as well. They can be built out of
primitive types (the non-terminals). For example, "If" is a type
parameterized with the test and two Statements (one of which can be the Nil
type). The pl0 interpreter is built this way except that If is a class
holding pointers to Condition, Statement and Statement, instead of being
parameterized with them. In other words, the pl0 compiler is "dynamically
typed" with respect to the grammar it implements.
In fact, a program is a type. Think of the lexical tree as a big compound
type built of all the types corresponding to various statements. The leaves
are simple types; all others are parameterized types.
Do I make any sense?
I found it interesting that grammars and programs can be see as types, maybe
there is something behind that. Let me know if I just made a fool of myself
by restating something that's obvious and uninteresting.
Andrei
_______________________________________________
Cecil mailing list
Cecil@cs.washington.edu
http://majordomo.cs.washington.edu/mailman/listinfo/cecil
This archive was generated by hypermail 2b25 : Sat Oct 06 2001 - 12:02:03 PDT