Annotations for Dynamic Compilation

Official UW-DCG Design Document
Written by Markus Mock (mock@cs.washington.edu)
Date written: 7/30/96
Last modified: 8/7/96


Overview

There are two annotations to specify runtime constants: isConst() and keepConst(). The basic difference between the two is that for keepConst the dynamic compiler will keep the variable constant by specialization at those points where it would otherwise lose its constancy (if possible). To turn keepConst off, the annotation unkeepConst() is used. For convenience there is a variation of isConst / keepConst that explicitly marks the beginning and end of a dynamic region which is otherwise implicitly started by keepConst / isConst and ended by unkeepConst. [A region is actually only started iff isConst / keepConst is used on a variable that was previously not in the set of runtime constants, otherwise isConst will have no effect and keepConst will only change the specialization behavior] keepConst / isConst can be nested allowing nested dynamic regions. This may be useful if inside a dynamic region a new dynamic constant becomes known.
Both isConst and keepConst allow the specification of a caching policy. By this mechanism the programmer can control how many versions of a dynamic region will be cached. Stitching can be done eagerly at all points except when isConst / keepConst is used on a variable that is not known to be rtc at this point (=start of a dynamic region). At these points the setup / stitching will have to wait until the rtc value becomes known.

Additionally a laziness specification allows control whether dynamic regions should be stitched lazily or eagerly. The computation of laziness annotations at the IR level based on programmer annotations will be shown. Laziness annotations are orthogonal to the other specifications. In a first implementation we can simply ignore them and be as eager as possible. Likewise, the nesting of dynamic regions can be disallowed avoiding the implementation complexity and possible runtime costs. For generality and homogeneity of design they are, however, important parts of the design.

Syntax

   keepConst(<var-list> [<cache-spec>])  [<stitch-spec>]

where
<var-list> ::= list of variables (possibly empty)
<cache-spec> ::= :cache | :cache1 | :unchecked | :replicate
<stitch-spec> ::= eager | lazy | looplazy

Defaults: keepConst(<var-list> :cache) looplazy

keepConst annotations are only allowed inside of dynamic regions. The following annotation can be used to "turn keepConst off":

   unkeepConst(<var-list>)

which is also only allowed inside of a dynamic region.

   dynamicRegion [isConst (<var-list> [:cache | :cache1 | :unchecked])]
	         [keepConst (<var-list> [:cache | :cache1 |
                    :unchecked| :replicate])] [eager | lazy | looplazy]
{
	/* the dynamic region */
}

Default: unchecked for isConst
         cache for keepConst and 
         looplazy for the laziness

dynamicRegion isConst() keepConst() {} is really only a nice syntactic variation of the use of keepConst /unkeepConst with additional isConst annotations. The "}" that marks the end of the region is an implicit unkeepConst for all the variables that were listed in the keepConst declaration at the start and also for all keepConst / isConst annotations made inside that region.

   isConst(<var-list>  [:cache | :cache1 | :unchecked]) [eager | lazy]

Default: unchecked

isConst can be used anywhere inside the dynamic region. It adds the variables in the list to the set of rtc's at that point. No effort is made to keep them constant, though.

Semantics

keepConst

keepConst is used to tell the dynamic compilation system that the variables specified in the list of variables are to be kept constant from that point on downstream to the end of the dynamic region or until killed by "unkeepConst". Depending on how much effort we want to put into the attempt to keep a variable constant, different alternatives for the semantics of keepConst are possible, we'll list them in order of increasing effort (including isConst as a starting point): We'll use the following example to illustrate the different levels of effort:

A:	
	if (d1)
	   x = c1;
	   y = d2;
A1:
	   z = d2;
	else
	   x = c2;
	   y = c2;
	   z = c2;
M: 
	/* dynamic merge */
	/* c1 c2 are runtime constants *, d1,d2  dynamic values */

Example 2:

A:
	x = d1;
P:

isConst
A: isConst(x);

At M and P x will lose its constancy. No caching, no code replication.

keepConst (level 1)
A: keepConst(x);

At M x will remain constant by code replication. A cache lookup at M will be done (unless unchecked or replicate is specified as cache policy). In example 2, x would lose its constancy.

keepConst (level 2)
Like 1 but in example 2 x will remain constant by doing a cache check at P: and specializing code downstream of P. This will also keep y constant at M in the first example if keepConst(y) is used at point A.

keepConst (level 3)
A:	keepConst(x,y)
	if (d1)
	   x = c1;
	   y = d2;
A1:	   keepConst(z);
	   z = d2;
	else
	   x = c2;
	   y = c2;
	   z = c2;
M: 
	/* dynamic merge */
	/* c1 c2 are runtime constants *, d1,d2  dynamic values */

Using keepConst on z only on one branch of the IF, we keep it constant on one branch (specializing and caching at M). On the other branch we use an unspecialized version (w.r.t. z) of code. This requires duplicating the CFG during BTA as two versions of code will be produced at compile time: one with specialization for z, one without (after M).

Using keepConst in the interpretation of level (iii) gives the highest expressiveness. If duplication of the CFG during BTA shall not be done, the keepConst for z can be put before the IF. If after assignments of dynamic values a variable shall no longer be considered constant, it can be turned off explicitly before the assignment by using unkeepConst. To keep a variable constant three situations can arise:

For caching the following alternatives policies are provided:

cache
An infinite number of versions will be cached. The cache will be checked for the needed version. If present, it will be used, otherwise the new version will be produced and cached along with versions already in the cache, growing the cache by one.

cache1
Exactly one version will be cached. Like a) except that a new version replaces the old version in the cache maintaining the cache size at 1.

unchecked
Exactly one version will be cached, but no check will be done to ascertain that the cached version is the correct one. The programmer asserts that the value of the variable will never change, if this is not actually the case, the results will likely be incorrect (the cached version will be used w/o a check and will likely have an incorrect value). NB Note that "exactly one" refers to the current context, i.e. for each context one version will be cached.

replicate
No cache probing will take place. Each time a new version will be produced, even for values encountered before. This obviates the possibly expensive cache check and will be useful for loop unrolling where each iteration value will be produced only once (e.g. in simple counting loops).

For one particular variable only one caching policy is allowed. How different cache policies for different variables can be mixed and matched is explained in section 2.1. keepConst, however, can be used on a variable that was previously listed in a isConst list. This will turn specialization on for that variable.

What will be used as cache keys and where cache checks will be done is explained in section 3.

The laziness annotation is used to control if new versions of code are produced in an eager or lazy way: eager will be as eager as possible in the stitching of generated code possibly leading to big code blowup. Looplazy is less eager and will not try to completely unroll loops eagerly. Lazy is even less eager and will only stitch code as needed. The exact semantics are specified on the IR level in section 2.

unkeepConst

unkeepConst(x) is used to tell the compiler to not keep variable x constant anymore from the point of the unkeepConst annotation. Specialization for variable x is turned off then.

isConst

isConst inside a dynamic region is used add variables to the set of current runtime constants. The variable will keep its property as runtime constant until it possibly loses that property at some dynamic merge. No attempt is made to keep the variable constant, i.e. no specialization is done. Essentially the BTA treats such a variable like a derived rtc just that the "derivation" is by means of the programmer annotation isConst. The difference compared to keepConst is the reduced effort (namely no specialization) to preserve the constancy property.

The caching policies "cache" and "cache1" can be used to insert a cache check and cache multiple versions. In the default case "unchecked" it is the programmer's responsibility to make sure that the variable always has the same value at that point (in the same context).

Note:

P: isConst(x:C) 
will have no effect if x \in RTC at point P, i.e. if x is known to be a runtime constant at that point (for instance if it is a compile time constant). This is also the reason why a laziness annotation is unnecessary on isConst: either the variable is already know to be a runtime constant, then isConst remains without effect. If it is not a runtime constant at that point, eager stitching would be impossible at any rate. The cache policy that is specified for isConst only applies to the caching of the new region that is started by it (the case where it does not start a region can be ignored as in this case the variable was already known to be a rtc). It will not apply to any subsequent discordant merges as the constancy property will be lost there. Therefore, the following annotation is fine:
x = read();
isConst(x:cache1);
.. uses of x 
x = x +2 /* x is still rtc */

keepConst(x:cache); /* x is rtc at that point so keepConst does not
		       start a new region, cache policy for
		       specializations: cache */

if (d) 
   x = x + 2;
else
   x = x + 3;

M: /* for the specialization at this point the cache policy is: cache
	
    if no keepConst were used at this point x would no longer be rtc
	*/



	

Another Example:

	dynamicRegion isConst(x) {
	   y = d;	/* d some dynamic variable */
I:	   isConst(y)
           z = y + x + 3;
	   .. some uses of x, y, z ..
P:	   y = y + d 	/* d some dynamic variable */
		
M:	   x = x + d

	}

y is treated as a constant from point I to point P where it loses its constancy in the BTA. So z is also a (derived) constant as both y, x and 3 are rtc. At M x retains its constancy by specialization,

The programmer must assure (as y is unchecked) y'v value is the same at point I for the same context. Here the context is the value of x, i.e. y = f(x) for some function f.

If isConst and keepConst are used on the same variable, keepConst takes precedence. Using keepConst twice on the same variable (before unkeepConst) is allowed provided that the caching / laziness annotations are the same.

dynamicRegion

The dynamicRegion annotation is used to explicitly mark the beginning and extent of a dynamic region. It must be used for the outermost dynamic region to mark which portion of the code is to be compiled dynamically. Inside of it isConst and keepConst annotations can be used as well as more nested dynamicRegion annotations. Its other purpose is to specify the set of variables that are assumed to be constant throughout the region, given in the variable list isConst(). Those variables that are runtime constanst at the beginning of the region and whose constancy is to be preserved (if possible) by specialization are listed in keepConst.

The caching specification is used to specify how to deal with multiple versions (of the dynamic region), analogous to the keys in our previous dynamic compilation design. Using "cache" specifies an infinite size cache of versions, "cache1" a cache of size 1 (i.e. a new version will replace an existing old one). At the start of the region a check of the cache needs to be made when using cache or cache1. Cache1 will be useful if the incoming values exhibit temporal locality, i.e. the same set of values enters the region for some time then changes and then stays the same again for some time. If that is not the case, "cache" should be used, which is also the default. "unchecked" is like cache1 except that no cache check is done. The programmer asserts that the variable will only have one value and won't change.

Nested Dynamic Regions

dynamicRegion annotations can be nested.

For example:

dynamicRegion keepConst(x) {

   ... /* R1 region */
   dynamicRegion keepConst(y) {

      ... /* R2 nested region */
   }
   ... /* R1 */
}

In region R1 only x will be kept constant (assuming no other keepConst annotations within R1), in R2 all variables of surrounding regions (here R1) will be kept constant and the variables declared for region R2. In this case, both x and y are kept constant inside R2.

In the case where y is invariant with respect to x at the start of region R2 (i.e. y = f(x), f a function) caching of y would be superfluous. In this case, however, using the isConst annotation will achieve the same effect inside R2 while obviating the need for caching on y. A better way to write the annotations then is:

dynamicRegion keepConst(x) {

   ... /* R1 region */
   dynamicRegion isConst(y:unchecked) {
      ... /* R2 nested region */
   }
   ... /* R1 */
}

Annotations for function specialization

There are two annotations to support specialization and runtime constants across function calls.

Syntax

Global Annotation
specialize [constant] f() on ();

is a global annotation. is a subset of the formals list in and lists the arguments on which specialization should be done. "constant" is used to assert that the function is both side-effect free and idempotent i.e. that calling the function once has the same effect as calling it n-times with the same arguments. This information can be used profitably by the dynamic compilation system if all arguments are run-time constants. Then the function call can be replaced by a constant which is computed at setup time by calling the function once.

Call Site Annotation
a = specialize [constant] f() on () is a call site annotation. is a subset of and lists the arguments on which f is to be specialized. "constant" is used to mark f as side-effect free and idempotent at this particular call site (the arguments f is called with at this site could exhibit a special pattern making f constant which when called with general arguments might not be).
If both the global and the call-site annotation are used on the same function, the programmer has to assure that they agree: i) the key list in the call site annotation must be a super set of the key list of the global annotation ii) a function labeled constant in the global annotation will have to be global at each call site

Implementation Aspects

If one or more arguments of a specialized function are known to be runtime constants at the call site, specialization can be optimized exploiting this property: Consider a = f(x1, .. xn) specialized on (x1, .. xm), m<=n

A. {x1, .. xn} \ intersect RTC = {}
(no argument is a runtime constant)
Caller:

A call to a specialized version of f passing all arguments is
generated. 

Callee:

A specialized version of f is compiled annotating all arguments that
are part of the specialization key (x1, .. xm) as runtime constant.
Versions for each key-tuple value will be created and cached as
needed. A limit on the number of versions created could be imposed
calling the general version of f in the default case where no new
specialized version can be made available.



B. {x1, .. xn} \ intersect RTC = {x1, .. xk} k < n
(at least one argument is a runtime constant)

Caller:
Runtime constant arguments do not need to be passed.
A call to a specialized version f{x1,..xk}(xk+1,..,xn) is generated
passing only the non-constant arguments.

Callee:
f is compiled annotating all arguments that are used for
specialization as rtc, The runtime constant arguments which are not
passed 

C,  {x1, .. xn} \ intersect RTC = {x1, .. xn}
(all arguments are runtime constants)

Is like B except when f is additionally labeled as "constant". Then f
can safely be called only once and the result can be reused
thereafter.

Caller: 
The function call is done once at setup, the function call in the code
is replaced by a hole for the runtime constant that is computed during
setup.

Callee:
The original version of f can be used to compute the rtc value during
setup. No specialized version is necessary.

Separate Compilation

If separate compilation of f and callers of f is to be supported, the optimizations that can be done are limited. Using the global annotation for f, f can still be specialized by annotating the arguments on which to specialize as runtime constants in the function body. These annotations must agree with call site annotations then as all call sites will share the one specialized version of f. All arguments to f will need to be passed always, even if they are runtime constants at the call site. If all arguments are runtime constants at the call site and the function is annotated as "constant", the call-once optimization can still be done.

Modification to the BTA

One new flow function:
Cflow [[x = f(x1, .. xn)]] cs = cs U {x} U Kconst(p) 
		if f is annotated as constant and {x1,..,xn} \subset RTC



Last updated August 7, 1996.
Markus Mock (mock@cs.washington.edu)