In this meeting we decided what to tackle in the first cut of
RTCG under Multiflow. We went over a list of features that we might
want and chose a subset.
We will do the following:
- We will stick to the basic structure that Marc Friedman developed
for RTCG under gcc:
- We will use the same basic notion of dynamic constants and dynamic
regions that Marc used.
- We will depend on the programmer to provide pragmas that identify
dynamic regions and constants.
- Our offline compilation will generate setup code, templates, and
stitcher directives. Templates will be sequences of code with holes
where dynamic constants can fit in. Stitcher directives will tell the
stitcher how precisely to convert the templates into executable code,
once the values of run-time constants are known.
- As with gcc, we will "trick" the Multiflow compiler into
producing templates as if it were producing executable code. This
lets us avoid modifying the middle of the Multiflow compiler as much as
possible. At an early stage of the compilation we will modify the
program's IL so that run-time constants are represented as Multiflow
LTCONST's. These should pass through most optimizations untouched. We
will also put optimization barriers around and inside dynamic
regions to ensure that divisions between static and dynamic code (and
between sub-templates) survive through optimization.
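The relationship between setup code, templates, and stitcher directives can be illustrated with a toy model. This is only a sketch under stated assumptions: strings stand in for instruction words, and the names Hole, stitch, setup, TEMPLATE, and DIRECTIVES are hypothetical and do not come from the gcc or Multiflow work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hole:
    """Hypothetical stitcher directive: patch one template word with a constant."""
    word_index: int  # which word of the template contains the hole
    const_name: str  # which run-time constant fills it


def stitch(template, directives, constants):
    """Copy the template and fill each hole from the run-time constant table."""
    code = list(template)  # the offline-generated template itself is never modified
    for d in directives:
        code[d.word_index] = code[d.word_index].format(constants[d.const_name])
    return code


# Toy "machine code": "{}" marks a hole left by the offline compiler.
TEMPLATE = [
    "load r1, [x]",
    "mul r1, r1, #{}",
    "add r1, r1, #{}",
    "store [y], r1",
]
DIRECTIVES = [Hole(1, "scale"), Hole(2, "offset")]


def setup():
    """Stand-in for setup code: computes the run-time constants once."""
    return {"scale": 3, "offset": 7}


stitched = stitch(TEMPLATE, DIRECTIVES, setup())
```

In this model the offline compiler would produce TEMPLATE and DIRECTIVES, while setup and stitch run once the values of the run-time constants are known.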
We will guarantee correctness for programs no matter what values
run-time constants finally have. This differs from the gcc version,
which could only handle constants that fit into an Alpha immediate
field.
We will use sub-templates to handle situations where whether code
is executed (or the version of code that is executed) depends on
run-time constants. Sub-templates are templates for part of a dynamic
region. The stitcher will combine sub-templates based on stitcher
directives and values of run-time constants. If we can do them
efficiently enough, we may use sub-templates for all cases in which
the size of the code depends on run-time constants, including
inserting run-time constants into an instruction (which may take 1
instruction if the constant fits in an immediate field, or many if it
must be built in a register).
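As an illustration of why code size can depend on a constant's value, the sketch below picks between a short and a long sub-template for adding a run-time constant. The 8-bit unsigned immediate width, the instruction mnemonics (loadhi, orlo), the scratch register rtmp, and the restriction to 32-bit non-negative values are all assumptions made for the example, not properties of the target machine.

```python
IMM_BITS = 8  # assumed width of an unsigned immediate field (illustrative only)


def fits_immediate(value):
    """True if the value can sit directly in the assumed immediate field."""
    return 0 <= value < (1 << IMM_BITS)


def emit_add_constant(dst, src, value):
    """Return the instruction sequence for dst = src + value."""
    if not 0 <= value < (1 << 32):
        raise ValueError("this sketch only handles 32-bit non-negative constants")
    if fits_immediate(value):
        # Short sub-template: the constant lives in the instruction itself.
        return [f"add {dst}, {src}, #{value}"]
    # Long sub-template: build the constant in a scratch register,
    # 16 bits at a time, using two hypothetical instructions.
    high = (value >> 16) & 0xFFFF
    low = value & 0xFFFF
    return [
        f"loadhi rtmp, #{high}",
        f"orlo rtmp, rtmp, #{low}",
        f"add {dst}, {src}, rtmp",
    ]


short_form = emit_add_constant("r1", "r2", 5)
long_form = emit_add_constant("r1", "r2", 70000)
```

Either sequence computes the same result, so the stitched code stays correct whatever value the constant takes; only its length changes.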
We will do proper full loop unrolling, including loops whose
bounds are determined by a run-time constant (such as a string or
linked-list). The offline compiler will generate a sub-template
corresponding to a single iteration of the loop body and perhaps also
sub-templates corresponding to multiple (unrolled) iterations. These
will be combined by the stitcher. We will be careful never to introduce
infinite loops where they did not exist before. Induction variables
will be treated as run-time constants as well. A consequence of this
is that the number of run-time constants is now determined at run
time, not offline.
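A minimal sketch of stitch-time unrolling over a run-time constant linked list follows. The Node class, the BODY sub-template, and the termination guard (cycle detection plus an iteration cap) are assumptions for illustration; the notes do not say how non-termination will be avoided, and a real stitcher would presumably fall back to non-unrolled code rather than raise an error.

```python
class Node:
    """Toy linked-list node; the list is treated as a run-time constant."""

    def __init__(self, weight, next=None):
        self.weight = weight
        self.next = next


# Sub-template for one iteration of the loop body. The hole takes the
# value that is constant for that particular iteration.
BODY = "add acc, acc, #{}"


def unroll(head, max_iterations=1000):
    """Emit one copy of BODY per list node, refusing to unroll forever."""
    code = []
    seen = set()
    node = head  # the induction variable, itself a run-time constant
    while node is not None:
        if id(node) in seen or len(code) >= max_iterations:
            # Assumed safeguard: give up instead of stitching without end.
            raise ValueError("cannot fully unroll: no termination within bound")
        seen.add(id(node))
        code.append(BODY.format(node.weight))
        node = node.next
    return code


unrolled = unroll(Node(2, Node(5, Node(9))))
```

The number of holes filled equals the length of the list, which is only known when the stitcher runs. This is the sense in which the number of run-time constants is determined at run time.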
We will also handle "if" and "switch" statements using
sub-templates. In general, sub-templates may be nested inside other
sub-templates, for example when there are nested loops, nested "if"s,
or "if"s inside loops.
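Nested sub-templates can be pictured as a tree that the stitcher walks recursively, keeping only the branches selected by the run-time constants. The tuple encoding of directives below ("code", "seq", "if", "switch") and the function name stitch_tree are hypothetical, chosen only to keep the sketch short.

```python
def stitch_tree(directive, consts):
    """Flatten a tree of sub-templates into straight-line code."""
    kind = directive[0]
    if kind == "code":
        # Leaf: a sub-template with no further choices inside it.
        return [directive[1]]
    if kind == "seq":
        out = []
        for child in directive[1]:
            out.extend(stitch_tree(child, consts))
        return out
    if kind == "if":
        _, name, then_part, else_part = directive
        # Only the taken branch is stitched; the other is dropped.
        chosen = then_part if consts[name] else else_part
        return stitch_tree(chosen, consts)
    if kind == "switch":
        _, name, cases, default = directive
        return stitch_tree(cases.get(consts[name], default), consts)
    raise ValueError(f"unknown directive kind: {kind!r}")


REGION = ("seq", [
    ("code", "load r1, [x]"),
    ("if", "debug", ("code", "call trace"), ("seq", [])),
    ("switch", "mode", {
        0: ("code", "add r1, r1, r1"),
        1: ("if", "signed",
            ("code", "sra r1, r1, #1"),
            ("code", "srl r1, r1, #1")),
    }, ("code", "mov r1, #0")),
    ("code", "store [y], r1"),
])

result = stitch_tree(REGION, {"debug": False, "mode": 1, "signed": True})
```

Because untaken branches never reach the output, this selection acts as a limited form of dead-code elimination for "if" and "switch" statements.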
We won't include the following:
- We won't guarantee correctness for programs that change the value
of run-time constants, nor will we support rarely changing
constants.
- We won't concentrate much effort on highly optimized code for
constants. For example, we won't do such tricks as using an
integer-to-float convert with an immediate operand to generate integral
floating-point constants, or building up constants faster with tricky
bit operations.
- We won't do memory disambiguation and register allocation of
run-time constant pointers.
- We won't do even simple versions of standard optimizations:
scheduling, dead assignment elimination, copy propagation. We will be
doing some dead-code elimination with sub-templates for if's and
switches. In general, we hope that the offline compiler will be able
to do a reasonable job without knowing the precise values of dynamic
constants. For loops, we hope that offline-compiled, unrolled
sub-templates will be well enough scheduled and register allocated.