Re: Typing collect


Subject: Re: Typing collect
From: Craig Chambers (chambers@cs.washington.edu)
Date: Tue Jan 11 2000 - 08:53:00 PST


On Mon, 10 Jan 2000 14:05:53 PST, Jonathan Aldrich wrote:
> When hacking up some Cecil code the other day, I decided it would be nice
> to have the library function collect, from Smalltalk. What collect does
> is it takes a collection of elements and a closure, applied the closure to
> each element in turn, and returns a collection of the results. The result
> collection should be the same kind of collection as the source
> collection.

This function is roughly the same as the standard map function from ML
and other functional languages. Except that map always takes and
returns a list, while collect takes most any kind of collection and
returns a collection of the same "kind," for some subtle definition of
"kind."

Cecil has similar functions, except that they fix the target
collection kind: new_[i|m]_[vector|vstring]_init_from. I've long
wanted to design some sort of abstract factory system for Cecil so
that we can have the class of the receiver determine the class of the
result, more like Smalltalk. And call the resulting function map.
And add map_keys (aka map_indices) and map_associations to support
other kinds of arguments to closures. But there are type system
issues, as Jonathan points out, and there's some amount of design and
implementation work to set up the abstract factory system. But it
would be very useful.

>
> This proved to be an interesting typing problem. I don't think it can be
> done elegantly in the current Cecil system. Here is what I imagine the
> well-typed code might look like:
>
> ---------------------------------
>
> method collect(c@:`T[`S] <= Collection[S], cl@:&(S):`R):T[`R]
> where exists type T[R]
> {
> let result := new T[R]();
> c.do(&(e:S) {
> result.add(cl.eval(e));
> });
> result;
> }
>
> ---------------------------------------

One issue is that you can't say `T[`S], where you're parameterizing
over a parameterized type. You can say `T (where T will be bound to a
*ground* or instantiated type), or you can say F[`S], where F is a
named parameterized type (and S will be bound to the appropriate
*ground* type instantiating F). But not `T[...]. To do this would
require a fancier kinded type system (kinds are types of types; in our
type system they'd define what patterns of type parameters a
parameterized type variable would take), and there are lots of
decidability questions about type checking and local type inference in
such type systems. (There's a paper in POPL'00 on singleton kinds,
which are related to kinded type systems used in modular typechecking
of ML-like languages. And which might be very relevant to a
parameterized module extension of Dubious.)

>
> There's some new syntax here. First of all, since I wanted to only define
> collect once (not once for each collection class), I need to parameterize
> over both the type of collection and the type of the elements: I don't
> think you can easily do this in Cecil, especially the part about
> returning the same type of collection but with a different element
> type. The way to simulate this in current Cecil is to write
> O(n) different collect functions.

There's a difference between writing O(n) different signatures, each
giving more refined result types from more refined argument types, and
writing O(n) different method implementations. You may be able to
write only one implementation, but still have to write a bunch of
signatures (at least if you want to know more precise information
about result types for certain argument types). (The current
implementation-side type checker will have problems with a bunch of
signatures for the same method implementation, but that's a weakness
of our current implementation & particulars of the connection between
implementations and signatures, not an inherent problem.)

>
> Second, I needed to express the fact that the method only applies if the
> collection type can hold the result type. One could do this with a
> Collectible[T] F-bounded type, but this is inelegant because it may
> require a declaration for each collection type.

This is one reason that we don't have a general map in Cecil today:
you can't take a string (a collection of characters) and map each
character to an integer, and then still return a string (since you
have to return a collection of integers, and strings only hold
characters). So mapping a closure over a string should return e.g. a
vector. So this map function is an example of an irregular signature,
since it doesn't have a uniform pattern where the type of the result
is the same as (or a different instantiation of) the type of the
receiver. (Arithmetic operations on numeric types, e.g. *, have a
similar irregular pattern. You might think * has signature T*T->T,
for any T <= num, but if you have small_int vs. big_int types, then
small_int*small_int->(small_int|big_int). Really, small_int & big_int
are different implementations of int, and int should be considered a
leaf type as far as computing the most specific instantiation of the
T*T->T signature.)

>
> Third, I needed a better creation mechanism, parameterized by both the
> type of collection and the type of parameter. This may be the only place
> where C++ is more powerful than Cecil<grin>. Again this requires
> O(n) new-function declarations in Cecil.

Right, you can't say "new T[R]()", or any thing like "new T" where T
isn't a statically known class. (You'd really say "object isa T[R]"
in Cecil, but I assume that's a typo.) Today in Cecil you can say
"c.copy_empty" instead of the above, for most kinds of collections c.
In a module-based version of Dubious with first-class objects as
types, it may be possible to write and type-check this kind of thing.

My collection species (aka abstract factory) plan would support other
kinds of ways of creating a collection from an existing one. Indexed
collections would support more kinds of creation operations than
tables, which support more than generic collections.

>
> Anyone have comments, or know of type systems that can do this? Vass, is
> this an example the type systems you've been playing around with can
> handle?

I don't think it's in Vass's current plans to study this. There are
some related issues in the future Dubious efforts.

>
> Jonathan :-)
>

Thanks a lot for posting this challenge problem, and analyzing the
issues so well. This is exactly the kind of example that helps us
identify important goals for our type system work.

-- Craig



This archive was generated by hypermail 2b25 : Tue Oct 03 2000 - 15:21:19 PDT