Subject: Vortex download note (and some comments)
From: Andy Koppe (andy@dcs.ed.ac.uk)
Date: Tue Feb 22 2000 - 04:29:38 PST
Hello!
I downloaded the Linux/x86 versions of the Vortex compiler and the Cecil
frontend. I had to apply several of the fixes mentioned in notes/README to
make Vortex run on the Redhat 5.1 setup (with kernel 2.0.36, glibc 2.0 and
egcs 2.90) we have here in the CS department. I also had to change the
ArchConfig.default file to use g++ instead of gcc as the linker. And I had to
comment out the following lines of Cecil/src/stdlib/file.cecil:
-- These should be in <errno.h>, why aren't they?
prim c_++ {
    extern int sys_nerr;
    extern char* sys_errlist[];
};
This gave an error because the char* sys_errlist[] declaration clashed with a
const char* declaration in errno.h. I still get warnings about consts being
passed as non-const arguments, but at least it works now.
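(For reference: if the Cecil glue can call C functions directly, strerror()
would give the same message text without redeclaring glibc symbols. A minimal
C++ sketch of what it returns, just to illustrate the idea:
#include <cstring>
#include <cerrno>
#include <cstdio>

int main() {
    errno = ENOENT;                             // simulate a failed call
    std::printf("%s\n", std::strerror(errno));  // "No such file or directory"
    return 0;
}
)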
I also downloaded the Java frontend but I didn't explore it further because
the usage instructions sounded a little bit too scary :)
I had first heard about Cecil when looking for information about multimethods.
Reading the language specification, I was really amazed by the simplicity,
elegance and expressiveness of Cecil's concepts compared to languages like C++
or Eiffel. I had also tried Dylan but I was put off by its rather awkward
syntax.
One of the questions I wondered about was the price of Cecil's expressiveness.
My first test was of course "Hello world!". Well, a 2MB executable for that
(with noevalstdlib, no interrupt_checking, no debug_support and shared libs)
is something one has to get used to ...
Then I wrote the following micro-benchmark to check the level of optimization
Vortex can achieve:
abstract object complex;
field r(@:complex):single_float;
field i(@:complex):single_float;
method *(a@:complex,b@:complex):complex {
    object isa complex { r:=a.r*b.r-a.i*b.i,
                         i:=a.r*b.i+a.i*b.r }
}
let var c:complex := object isa complex {r:=1.0,i:=0.0};
print_line(time({10000000.do(&(i:int){c:=c*c})}));
I compiled this with noevalstdlib, no interrupt_checking, no debug_support,
without shared libs, and with full optimization. The benchmark took 55 seconds
on a Pentium II-400.
Then I wrote the following equivalent C++ code:
#include <iostream>
#include <time.h>

struct Complex {
    float r;
    float i;
    Complex(float real, float imag) : r(real), i(imag) {}
};

Complex operator*(Complex a, Complex b) {
    return Complex(a.r*b.r - a.i*b.i, a.r*b.i + a.i*b.r);
}

int main() {
    Complex c(1, 0);
    clock_t start = clock();
    for (int i = 0; i < 10000000; i++) c = c*c;
    std::cout << double(clock() - start)/CLOCKS_PER_SEC << std::endl;
    return 0;
}
This took only 0.75s. That means Cecil is about 70 times slower on this example. Of
course the Cecil variant is much more flexible and extensible than the C++
variant, but Stroustrup's law of language design states: "You don't pay for what
you don't use". Here the violation of the law is obviously caused by the fact
that the complex objects in Cecil are allocated on the garbage-collected heap
whereas the Complex objects in C++ are allocated on the stack.
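To illustrate the effect (this is just a rough sketch, not what Vortex
actually generates): rewriting the C++ loop to box every intermediate result
on the heap should recover a good part of the gap:
#include <iostream>
#include <time.h>

struct Complex {
    float r;
    float i;
    Complex(float real, float imag) : r(real), i(imag) {}
};

int main() {
    Complex* c = new Complex(1, 0);
    clock_t start = clock();
    for (int i = 0; i < 10000000; i++) {
        // box the result of every multiplication, as a GC'd runtime would
        Complex* t = new Complex(c->r*c->r - c->i*c->i,
                                 c->r*c->i + c->i*c->r);
        delete c;  // crude stand-in for garbage collection
        c = t;
    }
    std::cout << double(clock() - start)/CLOCKS_PER_SEC << std::endl;
    delete c;
    return 0;
}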
I have read your technical paper about the Vortex compiler and was amazed
again by the amount of work done towards fulfilling Stroustrup's law. Is the
work on unboxing of objects mentioned in the last chapter still going on?
I have thought about this problem a little bit. In principle, immutable
objects like complex numbers or 3D vectors could be allocated on the stack and
passed by value instead of by reference. But there is one big problem: object
identity.
For example:
var a:complex := object isa complex { r:=1; i:=0 }
var b:=a
var c:complex := object isa complex { r:=1; i:=0 }
print_line(a==b) -->true
print_line(a==c) -->false
a, b and c can't be allocated unboxed on the stack, because object identity
can't be determined from the fields alone: a and c each contain 1 as the real
part and 0 as the imaginary part, yet they are not identical.
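In C++ terms this is the difference between comparing pointers and comparing
fields; a loose analogue, with raw pointers standing in for Cecil references:
#include <iostream>

struct Complex {
    float r;
    float i;
    Complex(float real, float imag) : r(real), i(imag) {}
};

// structural equality: compare the fields, not the addresses
bool equal(const Complex& x, const Complex& y) {
    return x.r == y.r && x.i == y.i;
}

int main() {
    Complex* a = new Complex(1, 0);
    Complex* b = a;                      // b refers to the same object
    Complex* c = new Complex(1, 0);      // same fields, fresh box
    std::cout << (a == b) << '\n';       // 1: same address, identical
    std::cout << (a == c) << '\n';       // 0: different addresses
    std::cout << equal(*a, *c) << '\n';  // 1: structurally equal
    delete a;
    delete c;
    return 0;
}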
But in fact this concept of identity does not apply here. Complex numbers are
fully identified by their real and imaginary part, there simply aren't
different complex numbers with a real part of 1 and an imaginary part of 0.
The same applies to many other classes like strings, matrices, or objects
which model some classification, e.g. sex or color.
It would be very hard, for instance, to explain the following results to
someone trying to learn Cecil who hasn't heard about heap allocation before:
"foo" == "foo" -->true
"foo" == "fo"||"o" -->false
"fo"||"o" == "fo"||"o" -->false
"fo"||"o" = "fo"||"o" -->true
So what about getting rid of the address-based identity concept altogether?
That would also be a bad idea since there obviously are many cases where it is
very useful to have a notion of identity independent of an object's
properties, e.g. for persons, cars, GUI windows, ...
I think these different kinds of classes could be modelled by having an
identity-less base class (i.e. any) and a class with predefined identity
derived from that (called identifiable?). Classes which model problem-domain
entities with a notion of identity could then be derived from identifiable and
classes which don't could be derived from any.
This distinction would make both the object model clearer and the compiler's
life easier.
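A rough C++ sketch of the proposal (the names any and identifiable are just
the suggestions above, not existing Cecil machinery):
#include <iostream>

struct any {};   // identity-less base: no predefined notion of identity

struct identifiable : any {
    // predefined, address-based identity
    bool operator==(const identifiable& other) const {
        return this == &other;
    }
};

// a value-like class: equality is structural, no identity needed
struct complex_num : any {
    float r, i;
    complex_num(float real, float imag) : r(real), i(imag) {}
    bool operator==(const complex_num& o) const {
        return r == o.r && i == o.i;
    }
};

// an entity-like class: inherits address-based identity
struct window : identifiable {
    int width, height;
};

int main() {
    complex_num a(1, 0), b(1, 0);
    window w1, w2;
    std::cout << (a == b) << '\n';    // 1: equal by value, freely unboxable
    std::cout << (w1 == w2) << '\n';  // 0: distinct identities
    return 0;
}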
I guess someone else has thought about this before, but I would be glad if you
could comment on it.
Regards,
Andy Koppe
p.s.: I tried to subscribe to the Cecil mailing list but I got the
following error message:
Returned mail: /cse/mailing-lists/cecil-interest-request: line 5:
"|/local/admin/tools/majordomo/wrapper majordomo -l cecil-interest"...
UID 10279 is an unknown user: cannot mail to programs
--
Andy Koppe
3rd year Computer Science, University of Edinburgh
andy@dcs.ed.ac.uk