Subject: publishing
From: Keunwoo Lee (klee@cs.washington.edu)
Date: Mon May 28 2001 - 00:22:41 PDT
----------------------------------------------------------------------
SUMMARY
Major changes:
+ Did fixed array lowerings.
+ Began putting in memory allocation support.
+ Changed semantics (and parsing) of "new".
+ Changed DFG and cfg2dfg.cecil to remove AST dependencies in various
LValue nodes.
+ Changed codegen and the WIL stdlib so that we are closer to
compiling. (Actually, we are very close to compiling
trivial programs; there's just some makefile junk to fix.)
+ Tweaks in graph building framework.
Status: towers and tests/test-base.wil build without crashing, on o0
and o1. Output for the array test case nearly passes GNU make.
Details follow. This changelog is long not because the changes are
huge, but because some of them are subtle or potentially confusing.
If you're in a hurry, skip to the parts about "new" semantics and the
infrastructure changes, since those affect the rest of you the most.
----------------------------------------------------------------------
LOWERING PROGRESS
+ Fixed array lowering is done, tested, and working. towers and the
Java stdlib don't appear to use these, but I started with them because
I thought they'd be easier than dynamic arrays. Further sub-notes:
+ Currently, all arrays are allocated on the heap, and the rep of
an array is codegen'd as a pointer to its base. This sounds ok,
except that a reference to an array is *also* a pointer to its base.
As a result, I have inserted a bunch of hackery to make the two
semantically collapse into the same thing during the lowering/codegen
passes. Primarily, I introduced "fake dereferences" in both lvalue
and rvalue positions to make this work. Dereferencing an array
reference results in a "fake dereference", which becomes a no-op in
codegen; indexing off such a "falsely dereferenced" array has the
expected semantics of indexing the array.
Of course, if we do this fake dereferencing business, we also need
a fake "address-of" node. So, I added a fake address-of operator, and
address-of operations on arrays are lowered into fake address-of.
All this is ugly and, in the long term, probably wrong. But I
think it will get us by for now. We can revisit the issues of
1. Array and record constructors returning inline
vs. out-of-line allocations
2. Array indexing and record member dereference operations as
an lvalue/rvalue.
at some point in the future. And speaking of inline/outline
allocations...
+ The current array lowering strategy will break when I start
lowering records; among other things, inline allocations (as a member
of something else, or on the stack/global area) don't exist anymore,
post-lowering. Fortunately, records are the next thing to lower, so
I'll find some way to fix this shortly. Fixing this may enable me to
remove the semantic weirdness I describe above. On the other hand, I
may end up piling one hack on another. We'll see.
+ I have added array.wil, which tests various kinds of arrays, to
the tests/features/ directory.
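To make the fake dereference / fake address-of idea above concrete, here is a minimal sketch in Python (not Cecil). All class and function names are hypothetical stand-ins, not the actual Whirlwind IR nodes; the only thing taken from the notes above is the behavior: dereferencing or taking the address of an array lowers to a "fake" node, which is a no-op in codegen.

```python
from dataclasses import dataclass

# Hypothetical toy IR; these are NOT the real Whirlwind node classes.

@dataclass(frozen=True)
class Var:
    name: str
    is_array: bool = False  # True if this names an array or a reference to one

@dataclass(frozen=True)
class Deref:
    operand: object

@dataclass(frozen=True)
class AddrOf:
    operand: object

@dataclass(frozen=True)
class FakeDeref:
    operand: object

@dataclass(frozen=True)
class FakeAddrOf:
    operand: object

@dataclass(frozen=True)
class Index:
    base: object
    index: int

def denotes_array(node):
    """An array and a reference to it share one form: a pointer to the base."""
    if isinstance(node, Var):
        return node.is_array
    if isinstance(node, (FakeDeref, FakeAddrOf)):
        return denotes_array(node.operand)
    return False

def lower(node):
    """Rewrite deref/address-of on arrays into their fake counterparts."""
    if isinstance(node, Deref):
        inner = lower(node.operand)
        return FakeDeref(inner) if denotes_array(inner) else Deref(inner)
    if isinstance(node, AddrOf):
        inner = lower(node.operand)
        return FakeAddrOf(inner) if denotes_array(inner) else AddrOf(inner)
    if isinstance(node, Index):
        return Index(lower(node.base), node.index)
    return node

def codegen(node):
    """Emit a C-like expression string; fake nodes emit nothing of their own."""
    if isinstance(node, Var):
        return node.name
    if isinstance(node, (FakeDeref, FakeAddrOf)):
        return codegen(node.operand)
    if isinstance(node, Deref):
        return "(*" + codegen(node.operand) + ")"
    if isinstance(node, AddrOf):
        return "(&" + codegen(node.operand) + ")"
    if isinstance(node, Index):
        return codegen(node.base) + "[" + str(node.index) + "]"
    raise TypeError("unknown node: %r" % (node,))
```

In this toy model, indexing off a dereferenced array reference comes out as a plain index on the base pointer, while dereferencing an ordinary pointer still emits a real dereference.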
+ Added AllocNode hierarchy of IRNodes, to represent a raw memory
allocation. A raw memory allocation currently returns whatever
representation it is passed in its constructor; the client is
responsible for making sure this is right.
+ Codegen support for one subclass of AllocNode (MallocNode).
As of this update, we generate one of three macros:
* "MALLOC_bits(num_bits)"
* "MALLOC_bytes(num_bytes)"
* "MALLOC_words(num_words)".
The third of these may apply a system-specific multiplication
factor in the underlying malloc() call, since the word size differs
between 32-bit and 64-bit machines. I may make MALLOC_bits
obsolete in the near future, since I don't see any particular use
for it---a byte is always 8 bits, and machines don't let you alloc
less than 1 byte at a time.
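As a rough illustration of how the three macros relate, the following Python sketch computes the byte count each one might hand to malloc(). The actual macro definitions live in WIL-defs.h and are not shown in this message, so the function names, the round-up rule for bits, and the word_bytes parameter are all assumptions made for the example.

```python
BITS_PER_BYTE = 8

def malloc_bits(num_bits):
    # Assumption: round up to whole bytes, since you cannot allocate
    # less than one byte at a time.
    return (num_bits + BITS_PER_BYTE - 1) // BITS_PER_BYTE

def malloc_bytes(num_bytes):
    # Byte counts pass through unchanged.
    return num_bytes

def malloc_words(num_words, word_bytes):
    # word_bytes is the system-specific factor (assumed here to be
    # 4 on a 32-bit machine and 8 on a 64-bit machine).
    return num_words * word_bytes
```

Under these assumptions, the same MALLOC_words request doubles in size when moving from a 32-bit to a 64-bit target, while MALLOC_bits is expressible as a MALLOC_bytes call with a rounded-up argument, which fits the remark that it may become obsolete.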
+ Removed the NewNode from the IRNode hierarchy. Upon discussing the
semantics of "new" with Craig, we decided that "new" made more sense
as a property of the target constructor, rather than an independent
node. For one thing, the lowering generated for a constructor may
change depending on where it is allocated. Consequences of
eliminating new from the IR (it still exists in WIL, of course, as a
modifier to constructors):
+ "new" may no longer take an optional "var". The result of a
"new" expression is a reference that is always immutable. The
mutability of the underlying object is, of course, independent of the
mutability of the reference. If you want to have a mutable reference
to an object, you have to assign its address to a mutable reference
explicitly. This makes more sense: having constructors produce a
mutable pointer directly would be like saying that you can relocate a
constructed object by mutating its pointer. And, in fact, I cannot
find any instance of "new var" actually being used by WIL code in the
Java stdlib.
+ Parser adjusted for updated "new" semantics.
+ Node printing for constructors updated.
+ The result representation of a ConstructorNode changes depending
on whether it is new'd or non-new'd. Rep construction/checking has
been adjusted. Code to build/check pointer-ness of an allocation,
formerly done when encountering NewNode, is now a responsibility of
ConstructorNode.
+ One nice consequence is that a "new" constructor rep and its
underlying "direct" rep do not have to be checked against each other
any more. Consistency is implicitly maintained by the fact that it's
all part of the same rep.
+ Silly statements like "decl x:(*int4) = new 3" are currently not
working properly.
+ Big TODO item: Rep checking/computation will have to be
revisited when we implement constructors whose semantics as a "new"
target are currently fuzzy. E.g.: generic function nodes and class
nodes.
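The point about rep consistency can be sketched as follows, in Python rather than Cecil. The names Rep, PtrRep, is_new, and result_rep are hypothetical and only model the idea; they are not the actual Whirlwind declarations. Because the pointer rep is derived from the direct rep on the same node, the two cannot drift apart, so no separate cross-check is needed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rep:
    name: str

@dataclass(frozen=True)
class PtrRep:
    target: Rep  # the rep of the pointed-to object

@dataclass
class ConstructorNode:
    direct_rep: Rep       # rep of the constructed object itself
    is_new: bool = False  # models "new" as a property of the constructor

    @property
    def result_rep(self):
        # A new'd constructor yields a pointer to the object;
        # a non-new'd constructor yields the object's own rep.
        if self.is_new:
            return PtrRep(self.direct_rep)
        return self.direct_rep
```

With a separate NewNode, the pointer rep and the direct rep lived on different nodes and had to be checked against each other; here the first is computed from the second.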
+ Added alloc_space(@:ConstructorNode), and an AllocSpace hierarchy in
reps-helpers.cecil, to describe the different allocation styles
implied by "new"/non-"new". For further explanation of AllocSpaces,
see the comments, which I choose not to duplicate here. BTW, I have
moved the PtrSpace declarations to reps-helpers, which is probably
where they belonged anyway.
+ Added build_single_node_repl(...) to lower-helpers.cecil; this
method builds a single node replacement for another node, with all
incoming and outgoing edges connected exactly analogously.
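For readers unfamiliar with this kind of helper, here is a minimal Python sketch of single-node replacement with edges rewired analogously. The Graph class and the function name replace_single_node are hypothetical; the real build_single_node_repl signature is elided above and is not reproduced here.

```python
class Graph:
    """Toy directed graph: a node set plus an ordered list of (src, dst) edges."""

    def __init__(self):
        self.nodes = set()
        self.edges = []

    def add_node(self, node):
        self.nodes.add(node)
        return node

    def add_edge(self, src, dst):
        self.edges.append((src, dst))

def replace_single_node(graph, old, new):
    """Swap `old` for `new`, keeping every incoming and outgoing edge."""
    if old not in graph.nodes:
        raise ValueError("node to replace is not in the graph")
    graph.nodes.discard(old)
    graph.nodes.add(new)
    # Rewire each endpoint that touched the old node.
    graph.edges = [
        (new if src == old else src, new if dst == old else dst)
        for src, dst in graph.edges
    ]
    return new
```

Edge order and multiplicity are preserved, so a consumer that cares about operand position sees the replacement in exactly the slot the old node occupied.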
----------------------------------------------------------------------
MISC. INFRASTRUCTURE CHANGES
+ Defined more macros in WIL-defs.h (${VORTEX_HOME}/runtime/wil). We
crawl ever closer towards a working compiler...
+ Moved constructors for literal nodes from ast-helpers.cecil to
ir-node.cecil, since they do not refer to the AST.
+ Removed some bogus AST dependencies in dfg.cecil. These were bogus
because a graph replacement on certain nodes would not get recognized:
we were still relying on the AST links as late as codegen! I have changed
ir-node.cecil, dfg.cecil, cfg2dfg.cecil, and rep-compute/rep-check to
remove these AST dependencies.
+ IMPORTANT NOTE: as a consequence, if you write an analysis that
does replacements or propagates interesting information across LValue
nodes in the DFG, you have to be aware from now on that certain edges
do not really connote just "data transfer". For example, on an lvalue
array indexing operation, there is now a data flow edge from the array
to the indexing op. This does not really mean that the array is
evaluated and its value passed to the indexing op (which is the
standard meaning of data flow edges). It just means that the array is
the base of the lvalue indexing op. See the difference between the
old version of dfg.cecil and the updated version to get the idea. The
most
+ Some tweaks to the graph builders framework, to make my life easier:
+ close_graph_with_last_added to encapsulate a common pattern:
Finishing a graph by attaching all outgoing edges to the most
recently added node. I think this will be principally useful with
the CFG; I noticed it in both unary op lowering and array
construction.
+ close_graph_with, which is like close_graph_with_last_added except
you can choose any node (it's your responsibility to choose one
that you've actually added!).
+ Parameterized the type of WindIRBuilder by the type of the WindIR
it's going to build. This should have been my design from the beginning,
but now, of course, I understand much more about Cecil type idioms
than I did last fall. By doing this I was able to eliminate some other
cruft that had accumulated, or at least rewrite the cruft more nicely.
Now I have to go back and parameterize SliceBuilder with the type of
the slice being built---but, one step at a time.
+ Changed return types of {add*,append}(@SliceBuilder) methods to
something more useful---now the returned node type is
parameterized by the type of the node passed in. Useful because
generally, the added nodes will be freshly constructed, and we can
return exact type information.
+ Tweaked some comments.
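The two close_graph helpers described above might look roughly like this in a toy builder. This is a Python sketch under my own assumptions: the class name, the append method's chaining behavior, and the exit_node field are hypothetical and do not mirror the Cecil builder's actual interface. Unlike the real close_graph_with, the sketch checks that the chosen node was actually added.

```python
class BuilderSketch:
    """Toy graph builder that chains nodes in the order they are appended."""

    def __init__(self):
        self.nodes = []
        self.edges = []
        self.exit_node = None  # node that outgoing edges will leave from

    def append(self, node):
        # Chain the new node after the previously added one and return it.
        if self.nodes:
            self.edges.append((self.nodes[-1], node))
        self.nodes.append(node)
        return node

    def close_graph_with(self, node):
        # Finish the graph: all outgoing edges attach to `node`.
        if node not in self.nodes:
            raise ValueError("node was never added to this builder")
        self.exit_node = node
        return self

    def close_graph_with_last_added(self):
        # The common pattern: exit from the most recently added node.
        if not self.nodes:
            raise ValueError("no nodes have been added")
        return self.close_graph_with(self.nodes[-1])
```

A unary op lowering in this model would append the operand load, append the operator node, and then call close_graph_with_last_added().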
+ Added dfg-helpers.cecil, which, by analogy to ast-helpers, will hold
convenience functions for modifying or accessing DFG-specific aspects
of IRNodes.
+ Changed underlying_rep(r@:WrapperRep) so that it now recursively
computes the underlying rep of a WrapperRep whose underlying rep is
another WrapperRep.
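The recursive behavior can be sketched in Python as below. BaseRep and the field names are hypothetical; only the recursion through nested wrappers is taken from the description above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BaseRep:
    name: str

@dataclass(frozen=True)
class WrapperRep:
    name: str
    underlying: object  # either a BaseRep or another WrapperRep

def underlying_rep(rep):
    """Return the innermost non-wrapper rep beneath a WrapperRep."""
    inner = rep.underlying
    if isinstance(inner, WrapperRep):
        return underlying_rep(inner)  # keep peeling nested wrappers
    return inner
```

Before the change, the analogous call on a wrapper of a wrapper would have stopped at the inner WrapperRep.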
+ In whirlwind-file-info and wil-codegen, fixed a minor bug regarding
the naming of output files. My last update stripped the ".wil"
extension from codegen'd filename prefixes; now ".wil" is back.
+ Anatomy of a subtle bug + fix:
During the global build phase, whirlwind-make calls check_changes on
the outermost scope. This phase does a comparison of each top-level
decl with the old decl from that same source code location; comparison
is done using the =_properties(@DeclNode, @DeclNode) binary method.
However, these properties may refer to uninitialized data. When we
try to compare two decls for equality and one of them has a
declared_rep that is an IDRep, the underlying rep may not be
initialized (ref(@:IDRep) is a maybe[decl_group[RepDeclNode]]). In
this case, the compiler raises an internal error and dies.
I encountered this bug when modifying some WIL test files. Fixing an
undeclared rep error by declaring the rep caused an internal error on
the next compilation. I fixed this bug by overriding type tests for
IDReps in reps.cecil, but the underlying problem---IDReps may not be
initialized---may bite us again.
BTW this bug predates my last dependency-related changes, but it was
never encountered because our previous test cases had a more or less
static set of declared reps.
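The failure mode can be modeled in a few lines of Python. This is only an illustration of the hazard: the IDRep class, the resolve helper, and the guarded comparison are hypothetical, and the guard shown is one possible defensive approach, not the fix that went into reps.cecil (which overrides type tests for IDReps).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IDRep:
    name: str
    ref: Optional[str] = None  # None models a not-yet-initialized reference

def resolve(rep):
    # Models the internal error raised when the reference is missing.
    if rep.ref is None:
        raise RuntimeError("internal error: IDRep '%s' not initialized" % rep.name)
    return rep.ref

def naive_same_rep(a, b):
    # Dies if either side has not been resolved yet.
    return resolve(a) == resolve(b)

def guarded_same_rep(a, b):
    # Assumption for this sketch: fall back to comparing names when
    # either reference is still unresolved.
    if a.ref is None or b.ref is None:
        return a.name == b.name
    return a.ref == b.ref
```

The naive comparison only works when every declared rep has already been resolved, which matches the observation that a static set of declared reps hid the bug.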
----------------------------------------------------------------------
TRIVIA
+ Removed some breakpoints, commented others. Now all legitimate
breakpoints are annotated, so that
grep 'breakpoint()' *.cecil
gives you some information about the different ways that you
can breakpoint. Handy for me, since I can quickly see what's a legit
breakpoint versus one that I inserted temporarily, and remove the
latter easily when it's time to commit something.
+ Added wil-mode.el to Emacs directory. Currently it's a very minor
diff from cecil-mode.el, but it's better than nothing. I don't have
time or will to do much more with wil-mode right now.
_______________________________________________
Cecil mailing list
Cecil@cs.washington.edu
http://majordomo.cs.washington.edu/mailman/listinfo/cecil