In collection.diesel:
Collections are groups of elements. We first introduce abstract interfaces, describing and refining various kinds of collections and their operations. We then shift to discussing concrete implementations which can be instantiated and manipulated at runtime.
There are three major families of collections: unordered collections (like bags and sets), ordered collections (e.g., lists), and tables (or keyed collections). Indexed collections like arrays, vectors, and strings are both ordered collections and keyed tables, where the keys are integer indices.
Abstract collection classes typically come in three versions:
read-only, immutable, and mutable, with the latter two indicated by
i_
and m_
prefixes. Moreover, the read-only and immutable
versions come in ``generic'' and ``exactly'' subversions. For example:
i_vector
) adds the immutability
specification; while the immutable variety might not add any new
operations compared to the read-only interface, it does guarantee
that the collection does not change after it is created. Another
subclass (e.g., m_vector
) adds the mutator operations. (An
immutable collection may contain mutable objects which may be
side-effected while in the collection, but the collection itself
cannot be changed.) Each of these versions inherits from the
corresponding version(s) of its superclass(es), e.g. i_vector[T]
inherits from i_indexed[T]
while m_vector[T]
inherits from
m_indexed[T]
. (Some abstractions omit one or both of the
immutable or mutable subclasses.)
Separating the read-only from the read-write interfaces (e.g.,
vector vs. m_vector
) allows the read-only versions to support
richer subtyping relationships. For example, a vector[T] object
is a read-only vector of objects, all of which are subtypes of T.
Such a collection can be safely viewed as a vector[S] for any S
that is a supertype of T, since such a collection is also a
read-only vector of objects, all of which are subtypes of S. For
example, vector[int] is a subtype of vector[num]. We say that
the vector type is covariant in its parameter type, and this
covariance is indicated by the vector[`T] isa ..., vector[`S >= T]
declaration.
Whether or not a class can safely be declared to be covariant (or contravariant) in a type parameter is determined by how functions that take an argument of that class take other arguments that also mention the type parameter. Mentions can be ``positive'' or ``negative'', where the result type is positive, an argument type is negative, and polarity reverses inside an argument of a closure type. Covariant type parameters can only appear in positive positions (like results and arguments of argument closures), while contravariant type parameters can only appear in negative positions (like arguments). Type parameters that appear in both positive and negative positions across all the functions taking values of collections of that type parameter can be neither covariant nor contravariant. (The truth is somewhat more subtle than this, but it's true to a first approximation.)
Read-only collections like vector[T] only support functions all of
whose occurrences of T appear in positive positions, and so their
type paraters are covariant. This includes most (but not all)
reader operations, like fetch and do. Mutable collections
support functions whose occurrences occur in both positive
(e.g. fetch) and negative (e.g. store) positions, so their type
parameters are not covariant. For example, m_vector[int]
is
unrelated to m_vector[num]
. Indeed, m_vector[int]
cannot have
floats stored in it (unlike m_vector[num]
), and so m_vector[int]
is not a subtype of m_vector[num]
. Also, m_vector[num]
can
contain things other than ints and reveal this through do,
pick_any
, etc. (unlike m_vector[int]
), so the reverse subtyping
relation doesn't hold, either.
A few reader operations have type parameters in negative positions,
such as includes and =. Consequently (to a first
approximation), such operations cannot be supported by the generic
read-only collections like vector[T] with covariant type
parameters. To allow them, the _exactly
versions of read-only
collections are introduced. The type parameter of an _exactly
collection is not covariant, and so it can support arbitrary
operations. E.g. vector_exactly[T]
is a subtype of vector[T]
(and therefore also vector[S] for any S that's a supertype of
T) but not vector_exactly[S]
for any S other than T.
(There's no need for a m_vector_exactly[T]
type, since
m_vector[T]
already is not covariant.)
To support the most reuse, then, the read-only non-exactly interface
should be used as a type declaration whenever possible. Only if
mutation, immutability, or certain comparison operations are
required should the mutable, immutable, or _exactly
subtype
interface be used. Observe that the immutable interface must be
distinguished from the read-only interface. Indeed, a mutable
collection subtypes the read-only interface, but not the immutable
interface (which would violate the behavioral guarantee of
immutability).
There are three common varieties of mutation: replacement (removing
one element but inserting another in its place), insertion
(increasing the collection's size), and deletion (making the
collection smaller). The latter two
capabilities are captured by the abstract extensible_collection[T]
and removable_collection[T]
classes, which may be inherited by
mutable collections.