Regression-Test Script

This page (and the script!) is nearly finished, but still under construction. Direct questions to Brian (grant@cs).

Our regression-test script is rt. It is in ~grant/project/validate under AFS. To use the script, you need to specify the applications, which compiler you want to use, the directory containing dyn.h and lib*dyn.a, and whether you want to test the applications for correctness or to time them.

Using rt

The applications are listed on the command line. Currently, only dispatcher and sparse have been tested with rt. Soon, robot, matrix, stack, and sort will be set up for use with rt. Pov, xlisp, dinero, and other applications will be added to the suite as we get them to work.

The other settings can be specified by environment variables, by arguments to rt, or by a combination of the two. DYNRTOPT specifies whether to test for correctness or to time. DYNCOMPILER specifies which version of the compiler to use. DYNLIBDIR specifies the directory containing dyn.h and the DC libraries; the default is the standard directory, /afs/cs/project/dyncomp/lib. rt also knows the default value of DYNINCLUDE, and you will probably never need to change it. This is summarized in the table below.

Variable	rt Option	Description and Possible Values
DYNRTOPT	-TEST/-TIME	Whether to test or time (TEST or TIME)
DYNCOMPILER	-C	Compiler to use (osf, dc, dcm, oracle)
DYNLIBDIR	-L	Full path of directory containing dyn.h, libdyn.a, and libtimedyn.a
DYNINCLUDE	-I	Full path of Multiflow include directory

If your application succeeds, rt will respond with "Application FOO passed". If your application fails the test or crashes for some reason, you'll get a message like "Application FOO failed (log in FOO/log...)". You can then check the log file for the source of the problem.

Here is an example execution:
rt -Cdcm -TIME dispatcher sparse

Setting up your app to run under rt

rt compiles and invokes the apps through a standard makefile format. An example is in ~grant/project/validate/sparse/makefile, and looking at it is the easiest way to figure out what needs to be done.

The rt script sets five variables for the makefile: DYNCOMPILER (if not already set), DYNCOMPFILE, DYNCFLAGS, DYNLIBS, and DYNRTOPT (if not already set). The

The run rule should compile and run the code (the script just cds to the directory and does "make run"). Also, the makefile needs to exit with a nonzero exit code if the output doesn't match the canonical output in TEST mode.

The run-oracle rule.

Look at spDriver.c in the same directory to get an idea of how to conditionally compile output statements.

TEST mode will cause the apps to be compiled with the specified compiler, run, and compared against the apps' canonical output. TIME mode will cause the apps to be compiled with the specified compiler and run, saving the times in output files in each directory. Currently, setup time is printed automatically in programs compiled with dc. The compiler should not cause any output to be generated! Variables for timing setup code need to be created, like what has been done for timing the stitcher.

Specifying TEST compiles the files with -DDYN_TESTONLY and TIME compiles the files with -DDYN_TIMEONLY.

What to put in #ifdef DYN_TESTONLY regions:

All printfs other than those generating timing information

Print out enough information that correct behavior can be determined by looking at the output

Everything you print should be the same for every execution given the same input (don't print out your process id!)
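The guidelines above can be sketched in C as follows. This is only an illustration and is not taken from spDriver.c: the helper names format_result and report_result and the "y[i] = ..." output format are hypothetical. The point is that all checking output sits inside an #ifdef DYN_TESTONLY region and uses a fixed format, so the text is identical on every run for the same input.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from spDriver.c): formats a result vector into
   buf, one value per line, in a fixed format so the text is the same on
   every run for the same input. Returns the number of characters written,
   or -1 if buf is too small. */
int format_result(char *buf, size_t size, const double *v, int n)
{
    size_t used = 0;
    int i;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    for (i = 0; i < n; i++) {
        int w = snprintf(buf + used, size - used, "y[%d] = %.6e\n", i, v[i]);
        if (w < 0 || (size_t)w >= size - used)
            return -1;              /* output would not fit */
        used += (size_t)w;
    }
    return (int)used;
}

/* Hypothetical reporting hook, called once the computation is done.
   The output statements exist only when compiled with -DDYN_TESTONLY. */
void report_result(const double *v, int n)
{
#ifdef DYN_TESTONLY
    char buf[4096];
    if (format_result(buf, sizeof buf, v, n) >= 0)
        fputs(buf, stdout);
#else
    (void)v;                        /* nothing is printed outside TEST mode */
    (void)n;
#endif
}
```

Nothing in the output depends on the process id, the date, or addresses, so it can be compared directly against a canonical output file.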

What to put in #ifdef DYN_TIMEONLY regions:

Time whole program in seconds and clocks

Time setup code for each dyn region (s, clks) [or total setup time]

Time stitcher for each dyn region (s, clks) [or total stitch time]

Average execution time for each dyn region (s, clks) [or all regions]

Number of times each dyn region is executed
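A minimal sketch of the bookkeeping this list implies is shown here. It is an assumption-laden illustration, not the DC library's timing support: it uses the standard C clock() function, whose ticks are processor-time units rather than machine cycles, and the region_timer type and region_* function names are hypothetical.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical bookkeeping for one timed region (names are illustrative). */
typedef struct {
    clock_t total_ticks;   /* processor time accumulated over all executions */
    long    executions;    /* number of times the region has run */
    clock_t started;       /* clock() value at the latest region_begin() */
} region_timer;

/* Call immediately before the region executes. */
void region_begin(region_timer *t)
{
    t->started = clock();
}

/* Call immediately after the region executes. */
void region_end(region_timer *t)
{
    t->total_ticks += clock() - t->started;
    t->executions++;
}

/* Total time spent in the region, in seconds. */
double region_total_seconds(const region_timer *t)
{
    return (double)t->total_ticks / CLOCKS_PER_SEC;
}

/* Average time per execution, in seconds (0 if the region never ran). */
double region_average_seconds(const region_timer *t)
{
    if (t->executions == 0)
        return 0.0;
    return region_total_seconds(t) / (double)t->executions;
}

#ifdef DYN_TIMEONLY
/* Timing output is generated only when compiled with -DDYN_TIMEONLY. */
void region_report(const char *name, const region_timer *t)
{
    printf("%s: %ld executions, total = %e s, %e ticks, average = %e s\n",
           name, t->executions, region_total_seconds(t),
           (double)t->total_ticks, region_average_seconds(t));
}
#endif
```

Wrapping each execution of a region in region_begin and region_end gives the execution count, the total, and the average in one place; only the report is conditional on DYN_TIMEONLY.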

What to put in #ifndef UNDYN regions:

Declarations and uses of dyn_stitch_*_time and similar variables for setup timing once they are available (dyn_setup_*_time)

dyn_setup_wall_time and dyn_setup_cycle_time are now set up like dyn_stitch*. Use dyn_now_wall() and dyn_now_cycle() to collect your own times. Both return doubles. See sparse for an example. Note that you will only get setup times when running code compiled with dc, and only get stitcher times when running code compiled with dcm.

For example, here is what sparse currently produces for

DC:
time for multiply loop (100000 iterations) = 9.394881e-01 s, 2.106253e+08 cycles
setup time = 9.759665e-04 s, 2.204460e+05 cycles
stitch time = 0.000000e+00 s, 0.000000e+00 cycles
execution time = 9.570560e-01 s, 2.117659e+08 cycles

DCM:
time for multiply loop (100000 iterations) = 3.679310e-01 s, 8.262723e+07 cycles
setup time = 0.000000e+00 s, 0.000000e+00 cycles
stitch time = 9.760857e-04 s, 1.589530e+05 cycles
execution time = 4.079469e-01 s, 8.392783e+07 cycles

ndiff, a program that compares floating-point values while allowing for small differences, is installed in ~grant/project/validate/bin for anyone who needs it (which is just me, I think).

Usage: ndiff file1 file2 %tolerance

1e-4 is a reasonable value for %tolerance, I think. That is what I use for testing the output from sparse.
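To show what a tolerance of 1e-4 means in practice, here is a small sketch of a tolerant floating-point comparison. It is not ndiff's source: treating the tolerance as a relative (not absolute) bound is an assumption, and the function names nearly_equal and count_mismatches are hypothetical.

```c
/* Absolute value without needing the math library. */
static double abs_d(double x)
{
    return x < 0.0 ? -x : x;
}

/* Returns 1 if a and b differ by at most tol times the larger magnitude,
   0 otherwise. Assumption: the tolerance is relative. Two zeros compare
   equal. */
int nearly_equal(double a, double b, double tol)
{
    double ma = abs_d(a);
    double mb = abs_d(b);
    double scale = ma > mb ? ma : mb;

    return abs_d(a - b) <= tol * scale;
}

/* Counts the positions at which two arrays of n values differ by more
   than the tolerance. A result of 0 means the outputs match. */
int count_mismatches(const double *a, const double *b, int n, double tol)
{
    int i;
    int bad = 0;

    for (i = 0; i < n; i++) {
        if (!nearly_equal(a[i], b[i], tol))
            bad++;
    }
    return bad;
}
```

Under this reading, 9.394881e-01 and 9.394882e-01 match at a tolerance of 1e-4, while 1.0 and 1.001 do not.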

Last updated May 28, 1996. Brian Grant (grant@cs.washington.edu)