Perl6::Bible::A12(3) User Contributed Perl Documentation Perl6::Bible::A12(3)
NAME
Apocalypse_12 - Objects
AUTHOR
Larry Wall <larry@wall.org>
VERSION
Maintainer: Larry Wall <larry@wall.org>
Date: 13 Apr 2004
Last Modified: 22 Nov 2005
Number: 12
Version: 7
The official unofficial slogan of Perl 6 is "Second System Syndrome
Done Right!". After you read this Apocalypse you will at least be
certain that we got the "Second System" part down pat. But we've also
put in a little bit of work on the "Done Right" part, which we hope
you'll recognize. The management of complexity is complex, but only if
you think about it. The goal of Perl 6 is to discourage you from
thinking about it unnecessarily.
Speaking of thinking unnecessarily, please don't think that everything
we write here is absolutely true. We expect some things to change as
people point out various difficulties. That's the way all the other
Apocalypses have worked, so why should this one be different?
When I say "we", I don't just mean "me". I mean everyone who has
participated in the design, including the Perl 6 cabal, er, design
team, the readers (and writers) of the perl6-language mailing list, and
all the participants who wrote or commented on the original RFCs. For
this Apocalypse we've directly considered the following RFCs:
RFC  PSA  Title
===  ===  =====
032  abb  A method of allowing foreign objects in perl
067  abb  Deep Copying, aka, cloning around.
092  abb  Extensible Meta-Object Protocol
095  acc  Object Classes
101  bcc  Apache-like Event and Dispatch Handlers
126  aaa  Ensuring Perl's object-oriented future
137  bdd  Overview: Perl OO should I<not> be fundamentally changed.
147  rr   Split Scalars and Objects/References into Two Types
152  bdd  Replace invocant in @_ with self() builtin
163  bdd  Objects: Autoaccessors for object data structures
171  rr   my Dog $spot should call a constructor implicitly
174  bdd  Improved parsing and flexibility of indirect object syntax
187  abb  Objects : Mandatory and enhanced second argument to C<bless>
188  acc  Objects : Private keys and methods
189  abb  Objects : Hierarchical calls to initializers and destructors
190  acc  Objects : NEXT pseudoclass for method redispatch
193  acc  Objects : Core support for method delegation
223  bdd  Objects: C<use invocant> pragma
224  bdd  Objects : Rationalizing C<ref>, C<attribute::reftype>, and
          C<builtin:blessed>
244  cdr  Method calls should not suffer from the action on a distance
254  abb  Class Collections: Provide the ability to overload classes
256  abb  Objects : Native support for multimethods
265  abc  Interface polymorphism considered lovely
277  bbb  Method calls SHOULD suffer from ambiguity by default
307  rr   PRAYER - what gets said when you C<bless> something
335  acc  Class Methods Introspection: what methods does this object
          support?
336  bbb  use strict 'objects': a new pragma for using Java-like
          objects in Perl
These RFCs contain many interesting ideas, and many more "cries for
help". Usually in these Apocalypses, I discuss the design with respect
to each of the RFCs. However, in this case I won't, because most of
these RFCs fail in exactly the same way--they assume the Perl 6 object
model to be a set of extensions to the Perl 5 object model. But as it
turns out, that would have been a great way to end up with Second
System Syndrome Done Wrong. Perl 5's OO system is a great workbench,
but it has some issues that have to be dealt with systematically rather
than piecemeal.
Some of the Problems with Perl 5 OO
A little too orthogonal
It has often been claimed that Perl 5 OO was "bolted on", but that's
inaccurate. It was "bolted through", at right angles to all other
reference types, such that any reference could be blessed into being an
object. That's way cool, but it's often a little too cool.
Not quite orthogonal enough
It's too hard to treat built-in types as objects when you want to.
Perl 5's "tie" interface helps, but is suboptimal in several ways, not
the least of which is that it only works on variables, not values.
Forced non-encapsulation
Because of the ability to turn (almost) anything into an object, a
derived class had to be aware of the internal data type of its base
class. Even after convention settled on hashes as the appropriate
default data structure, one had to be careful not to stomp on the
attributes of one's base class.
A little too minimal
Some people will be surprised to hear it, but Perl is a minimalist
language at heart. It's just minimalistic about weird things compared
to your average language. Just as the binding of parameters to @_ was
a minimalistic approach, so too the entire Perl 5 object system was an
attempt to see how far you could drive a few features. But many of the
following difficulties stem from that.
Too much keyword reuse
In Perl 5, a class is just a package, a method is just a subroutine,
and an object is just a blessed referent. That's all well and good,
and it is still fundamentally true in Perl 6. However, Perl 5 made the
mistake of reusing the same keywords to express similar ideas. That's
not how natural languages work--we often use different words to express
similar ideas, the better to make subtle distinctions.
Too difficult to capture metadata
Because Perl 5 reused keywords and treated parameter binding as
something you do via a list assignment at run-time, it was next to
impossible for the compiler to tell which subroutines were methods and
which ones were really just subroutines. Because hashes are mutable,
it was difficult to tell at compile time what the attribute names were
going to be.
Inside-out interfaces
The Perl 5 solution to the previous problem was to declare more things
at compile time. Unfortunately, since the main way to do things at
compile time was to invoke "use", all the compile-time interfaces were
shoehorned into "use"'s syntax, which, powerful though it may be, is
often completely inside-out from a reasonable interface. For instance,
overloading is done by passing a list of pairs to "use", when it would
be much more natural to simply declare appropriate methods with
appropriate names and traits. The "base" and "fields" pragmas are also
kludges.
Not enough convention
Because of the flexibility of the Perl 5 approach, there was never any
"obvious" way to do it. So best practices had to be developed by each
group, and of course everyone came up with a slightly different
solution. Now, we're not going to be like some folks and confuse
"obvious" with "the only way to do it". This is still Perl, after all,
and the flexibility will still be there if you need it. But by
convention, there needs to be a standard look to objects and classes so
that they can interoperate. There's more than one way to do it, but
one of those is the standard way.
Wrong conventions
The use of arrow where most of the rest of the world uses dot was
confusing.
Everything possible, but difficult
The upshot of the previous problems was that, while Perl 5 made it easy
to use objects and classes, it was difficult to try to define classes
or derive from them.
Perl 5 Non-Problems
While there are plenty of problems with Perl 5's OO system, there are
some things it did right.
Generating classes by running code at compile time
One of the big advances in Perl 5 was that a program could be in charge
of its own compilation via "use" statements and "BEGIN" blocks. A Perl
program isn't a passive thing that a compiler has its way with, willy
nilly. It's an active thing that negotiates with the compiler for a
set of semantics. In Perl 6 we're not shying away from that, but
taking it further, and at the same time hiding it in a more declarative
style. So you need to be aware that, although many of the things we'll
be talking about here look like declarations, they trigger Perl code
that runs during compilation. Of such methods are metaclasses made.
(While these methods are often triggered by grammar rule reductions,
remember from Apocalypse 5 that all these grammar rules are also
running under the user's control. You can tweak the language without
the crude ax of source filtering.)
There are many roads to polymorphism
In looking for an "obvious" way to conventionalize Perl's object
system, we shouldn't overlook the fact that there's more than one
obvious way, and different approaches work better in different
circumstances. Inheritance is one way (and typically the most
overused), but we also need good support for composition, delegation,
and parametric types. Cutting across those techniques are issues of
interface, implementation, and mixtures of interface and
implementation. There are multiple strategies for ambiguity resolution
as well, and no single strategy is always right. (Unless the boss says
so.)
People using a class shouldn't have to think hard
In making it easier to define and derive classes, we must be careful
not to make it harder to use classes.
Trust in Convention, but Keep Your Powder Dry
So to summarize this summary, what we're proposing to develop is a set
of conventions for how object orientation ought to work in Perl 6--by
default. But there should also be enough hooks to customize things to
your heart's content, hopefully without undue impact on the
sensibilities of others.
And in particular, there's enough flexibility in the new approach that,
if you want to, you can still program in a way much like the old Perl 5
approach. There's still a "bless" method, and you can still pretend
that an object is a hash--though it isn't anymore.
However, as with all the rest of the design of Perl 6, the overriding
concern has been that the language scale well. That means Perl has to
scale down as well as up. Perl has to work well both as a first
language and as a last language. We believe our design fulfills this
goal--though, of course, only time will tell.
One other note: if you haven't read the previous Apocalypses and
Exegeses, a lot of this is going to be complete gobbledygook to you.
(Of course, even if you have read them, this might still be
gobbledygook. You take your chances in life...)
An Easy Example
Before we start talking about all the hard things that should be
possible, let's look at an example of some of the easy things that
should be easy. Suppose we define a Point object that (for some
strange reason) allows you to adjust the y-axis but not the x-axis.
class Point {
    has $.x;
    has $.y is rw;
    method clear () { $.x = 0; $.y = 0; }
}
my $point = Point.new(x => 2, y => 3);
$a = $point.y; # okay
$point.y = 42; # okay
$b = $point.x; # okay
$point.x = -1; # illegal, default is read-only
$point.clear; # reset to 0,0
If you compare that to how it would have to be written in Perl 5,
you'll note a number of differences:
· It uses the keywords "class" and "method" rather than "package" and
"sub".
· The attributes are named in explicit declarations rather than
implicit hash keys.
· It is impossible to confuse the attribute variables with ordinary
variables because of the extra dot (which also associates the
attributes visually with method calls).
· Perhaps most importantly, we did not have to commit to using a hash
(or any other external data structure) for the object's values.
· We didn't have to write a constructor.
· The implicit constructor automatically knows how to map named
arguments to the attribute names.
· We didn't have to write the accessor methods.
· The accessors are by default read-only outside the class, and you
can't get at the attributes from outside the class without an
accessor. (Inside the class you can use the attributes directly.)
[Update: the "." form of the attribute syntax is now construed to
always be a virtual dispatch to the accessor, so you can use that
syntax in any of the child classes too. To refer to the private
storage you now use "!" in place of the ".".]
· The invocant of the "clear" method is implicit.
· And perhaps most obviously, Perl 6 uses "." instead of "->" to
dereference an object.
Now suppose we want to derive from Point, and add a z-axis. That's
just
class Point3d is Point {
    has $:z = 123;
    method clear () { $:z = 0; next; }
}
my $point3d = Point3d.new(x => 2, y => 3, z => 4);
$c = $point3d.z; # illegal, $:z is invisible
The implicit constructor automatically sorts out the named arguments to
the correct initializers for you. If you omit the z value, it will
default to 123. And the new "clear" method calls the old clear method
merely by invoking "next", without the dodgy "super" semantics that
break down under MI. We also declared the $:z attribute to be
completely private by using a colon instead of a dot. No accessor for
it is visible outside the class. (And yes, OO purists, our other
attributes should probably have been private in the first
place...that's why we're making it just as easy to write a private
attribute as a public one.)
[Update: it's now easier to write a private attribute than a public
one. We've changed the ":" twigil to "!", and made it optional, so you
can just declare it as "has $z", or as "has $!z" if you wish to
emphasize the privateness of it.]
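As a rough sketch of how the two classes above might look once these updates are applied, here is a version written in Raku, the language Perl 6 later became. Several spellings are assumptions that do not appear in the text above: the "BUILD" submethod is there because Raku's default constructor does not initialize private attributes, "callsame" stands in for the bare "next", and the "z-value" method is a hypothetical helper that only exists to make the private value observable.

```raku
class Point {
    has $.x;                # accessor is read-only by default
    has $.y is rw;          # accessor may be assigned through
    method clear () { $!x = 0; $!y = 0; }
}

class Point3d is Point {
    has $!z;                                   # private: no accessor is generated
    submethod BUILD (:$!z = 123) { }           # bind the named argument ourselves
    method clear () { $!z = 0; callsame; }     # then run Point's clear
    method z-value () { $!z }                  # hypothetical helper for inspection
}

my $point3d = Point3d.new(x => 2, y => 3, z => 4);
$point3d.y = 42;                               # okay, y is rw

# Assigning through the read-only accessor throws at run time.
my $assigned = (try { $point3d.x = -1; True }) // False;

my $z-before = $point3d.z-value;
$point3d.clear;
say $assigned, ' ', $z-before, ' ', $point3d.x, ' ', $point3d.y, ' ', $point3d.z-value;
```

Whether "z" should be settable from the constructor at all is a design choice; dropping the "BUILD" submethod would leave it purely internal.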
If any of that makes your head spin, I'm sure the following will clear
it right up. :-)
Classes
A class is what provides a name and a place for the abstract behavior
of a set of objects said to belong to the class.
As in Perl 5, a class is still "just a funny package", structurally
speaking. Syntactically, however, a class is now distinct from a
package or a module. And the body of a class definition now runs in
the context of a metaclass, which is just a way of saying that it has a
metaclass instance as its (undeclared) invocant. (An "invocant" is
what we call the object or class on behalf of which a method is being
called.) Hence class definitions, though apparently declarative, are
also executing code to build the class definition, and the various
declarations within the class are also running bits of code. By
convention classes will use a standard metaclass, but that's just
convention. (A very strong convention, we hope.)
The primary role of a class is to manage instances, that is, objects.
So a class must worry about object creation and destruction, and
everything that happens in between. Classes have a secondary role as
units of software reuse, in that they can be inherited from or
delegated to. However, because this is a secondary role, and because
of weaknesses in models of inheritance, composition, and delegation,
Perl 6 will split out the notion of software reuse into a separate
class-like entity called a "role". Roles are an abstraction mechanism
for use by classes that don't care about the secondary aspects of
software reuse, or that (looking at it the other way) care so much
about it that they want to encapsulate any decisions about
implementation, composition, delegation, and maybe even inheritance.
Sounds fancy, but just think of them as includes of partial classes,
with some safety checks. Roles don't manage objects. They manage
interfaces and other abstract behavior (like default implementations),
and they help classes manage objects. As such, a role may only be
composed into a class or into another role, never inherited from or
delegated to. That's what classes are for.
[Update: in reality things are a little looser than that. If you use a
role as if it were a class, it autoinstantiates an anonymous class for
you that just does that role. If you use a class as if it were a role,
it takes a role-like snapshot of the current state of the class and
freezes it into an anonymous role.]
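The following is a minimal sketch, in present-day Raku, of a role being composed into a class and of a role being used as if it were a class. The names "Pet", "Dog" and "describe" are made up for illustration, and the behavior shown is that of the Rakudo implementation rather than anything this document guarantees.

```raku
role Pet {
    has $.name = 'Rex';                               # roles may carry attributes
    method describe () { $.name ~ ' is a pet' }       # and default implementations
}

class Dog does Pet { }      # composition: Pet's members become part of Dog

my $dog = Dog.new(name => 'Fido');
say $dog.describe;
say $dog ~~ Pet;            # a Dog does the Pet role

# Using the role as if it were a class: an anonymous class that
# does Pet is generated behind the scenes and instantiated.
my $generic = Pet.new(name => 'Spot');
say $generic.describe;
```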
Classes are arranged in an inheritance hierarchy by their "isa"
relationships. Perl 6 supports multiple inheritance, but makes it easy
to program in a single-inheritance style, insofar as roles make it easy
to mix in (or delegate, or parameterize) private implementation details
that don't belong in the public inheritance tree.
In those cases where MI is used, there can be ambiguities in the
pecking order of classes in different branches. Perl 6 will have a
canonical way to disambiguate these, but by design the dispatch policy
is separable from inheritance, so that you can change the rules for a
given set of classes. (Certainly the rules can change when we call
into another language's class hierarchy, for instance.)
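To make the "pecking order" concrete, here is a small sketch in present-day Raku. It assumes the default method resolution order of the Rakudo implementation (a C3 linearization), which postdates this document; the class and method names are illustrative only.

```raku
class Animal            { method speak () { 'generic noise' } }
class Mammal is Animal  { method speak () { 'mammal noise' } }
class Pet    is Animal  { method speak () { 'pet noise' } }
class Dog is Mammal is Pet { }

# The linearized order in which classes are searched for a method.
my @order = Dog.^mro.map(*.^name);
say @order;

# Both parents define speak; the one earlier in the order wins.
say Dog.new.speak;
```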
Where possible, class names are treated polymorphically, just as method
names are. This powerful feature makes it possible to inherit systems
of classes in parallel. (These classes might be inner classes, or they
might be inner aliases to outer classes.) By making the class names
"virtual", the base classes can refer to the appropriate derived
classes without knowing their full name. That sounds complicated, but
it just means that if you do the normal thing, Perl will call the right
class instead of the one you thought it was going to call. :-)
(As in C++ culture, we use the term "virtual" to denote a method that is
dispatched based on the actual run-time type of the object rather than
the declared type of the variable. C++ classes have to declare their
methods to be virtual explicitly. All of Perl's public methods are
virtual implicitly.)
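A short sketch of what implicit virtual dispatch means in practice, written in present-day Raku with made-up class names: the variable is declared with the base type, yet the method chosen depends on the object actually stored in it.

```raku
class Shape {
    method name ()     { 'shape' }
    method describe () { 'I am a ' ~ self.name }   # self.name is looked up at run time
}

class Circle is Shape {
    method name () { 'circle' }
}

my Shape $s = Circle.new;      # declared type is Shape, actual type is Circle
say $s.describe;
```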
You may derive from any built-in class. For high-level object classes
such as "Int" or "Num" there are no restrictions on how you derive.
For low-level representational classes like "int" or "num", you may not
change the representation of the value; you may only add behaviors.
(If you want to change the representation, you should probably be using
composition instead of inheritance. Or define your own low-level
type.) Apart from this, you don't need to worry about the difference
between "int" and "Int", or "num" and "Num", since Perl 6 will do
autoboxing.
Declaration of Classes
Class declarations may be either file scoped or block scoped. A file-
scoped declaration must be the first thing in the file, and looks like
this:
class Dog is Mammal;
has Limb @.paws;
method walk () { .paws».move() }
That has the advantage of avoiding the use of one set of braces,
letting you put everything up against the left margin. It is otherwise
identical to a block-scoped class, which looks like this:
class Dog is Mammal {
    has Limb @.paws;
    method walk () { .paws».move() }
}
[Update: that ".paws" would have to be written "@.paws" these days unless
you put the invocant into $_.]
An incomplete class definition makes use of the "..." ("yada, yada,
yada") operator:
class Dog is Mammal {...}
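A minimal sketch of an incomplete definition in present-day Raku, with illustrative names. The stub lets other code mention "Dog" as a type before the full definition appears; in Rakudo the real definition is expected later in the same compilation unit.

```raku
class Dog {...}                 # stub: promises a definition later

class Kennel {
    # Dog can already be used as a type constraint here.
    method admit (Dog $d) { 'admitted ' ~ $d.name }
}

class Dog {                     # the full definition fulfils the stub
    has $.name;
}

my $receipt = Kennel.new.admit(Dog.new(name => 'Rex'));
say $receipt;
```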
The declaration of a class name introduces the name as a valid bare
identifier or name. In the absence of such a declaration, the name of
a class in an expression must be introduced with the "::" class sigil,
or it will be considered a bareword and rejected, since Perl 6 doesn't
allow barewords. Once the name is declared however, it may be used as
an ordinary term in an expression. Unlike in Perl 5, you should not
view it as a bareword string. Rather, you should view it as a
parameterless subroutine that returns a class object, which
conveniently stringifies to the name of the class for Perl 5
compatibility. But when you say
Dog.new()
the invocant of "new" is an object of type "Class", not a string as in
Perl 5.
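As a hedged illustration of the class name being a real object rather than a string, here is how this looks in present-day Raku, where the bare name evaluates to an undefined "type object" of the class itself. The names are illustrative and the outputs noted in comments are what Rakudo reports.

```raku
class Dog { has $.name = 'nameless'; }

my $rex = Dog.new(name => 'Rex');

say Dog.defined;                 # False: the class object is not a real Dog
say $rex.defined;                # True:  an instance is
say Dog.WHAT === $rex.WHAT;      # True:  both have the same type
say Dog.^name;                   # Dog:   the name is still easy to get at
```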
[Update: there is no "Class" class now. The "Dog" object is of type
"Dog", the same type as any of its instances. It's just not a real
"Dog". It's an abstract "Dog".]
Unmodified, a class declaration always declares a global name. But if
you prefix it with "our", you're defining an inner class:
class Cell {
    our class Golgi {...}
    ...
}
The full name of the inner class is "Cell::Golgi", and that name can be
used outside of "Cell", since "Golgi" is declared in the "Cell"
package. (Classes may be declared private, however. More later.)
[Update: bare class declarations now always default to "our". The top
level class ends up in the global space merely because files start out
being parsed in the "*" package (aka "GLOBAL::"), so that's where the
first class declaration puts its name.]
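A small sketch of an inner class in present-day Raku, following the updated rule that a bare declaration defaults to "our". The "pack" method is invented for the example.

```raku
class Cell {
    class Golgi {                        # package-scoped ("our") by default
        method pack () { 'vesicle' }
    }
}

# The inner class is reachable from outside through its full name.
say Cell::Golgi.new.pack;
say Cell::Golgi.^name;
```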
Class traits
A class declaration may apply various traits to the class. (A trait is
a property applied at compile time.) When you apply a trait, you're
accepting whatever it is that that trait does to your class, which
could be pretty much anything. Traits do things to classes. Do not
confuse traits with roles, which are sworn to play a subservient role
to the class. Traits can do whatever they jolly well please to your
class's metadata.
Now, the usual thing to do to a class's metadata is to insert another
class into its ISA metadata. So we use trait notation to install a
superclass:
class Dog is Mammal {...}
To specify multiple inheritance, just add another trait:
class Dog is Mammal is Pet {...}
But often you'll want a role instead, specified with "does":
class Dog is Mammal does Pet {...}
More on that later. But remember that traits are evil. You can have
traits like:
class Moose is Mammal is stuffed is really(Hatrack) is spy(Russian) {...}
So what if you actually want to derive from "stuffed"? That's a good
question, which we will answer later. (The short answer is, you
don't.)
Now as it happens, you can also use "is" from within the class. You
can also put the "does" inside to include various roles:
class Dog {
    is Mammal;
    does Pet;
    does Servant;
    does Best::Friend[Man];
    does Drool;
    ...
}
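For comparison, here is a sketch of the same idea in present-day Raku. Note one assumption that is not in the text above: Raku as implemented spells the in-body forms "also is" and "also does". The class, role and method names are illustrative.

```raku
class Mammal { method warm-blooded () { True } }
role  Pet    { method cuddle ()       { 'purr' } }

class Dog {
    also is Mammal;        # inheritance declared inside the body
    also does Pet;         # role composition declared inside the body
}

my $dog = Dog.new;
say $dog.warm-blooded;
say $dog.cuddle;
say Dog ~~ Mammal;
```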
In fact, there's no particular reason to put any of these outside the
braces except to make them more obvious to the casual reader. If we
take the view that inheritance is just one form of implementation, then
a simple
class Dog {...}
is sufficient to establish that there's a "Dog" class defined out there
somewhere. We shouldn't really care about the implementation of "Dog",
only its interface--which is usually pretty slobbery.
That being said, you can know more about the interface at compile time
once you know the inheritance, so it's good to have pulled in a
definition of the class as well as a declaration. Since this is
typically done with "use", the inheritance tree is generally available
even if you don't mark your class declaration externally with the
inheritance. (But in any event, the actual inheritance tree doesn't
have to be available till run time, since that's when methods are
dispatched. (Though as is often the case, certain optimizations work
better when you give them more data earlier...))
Use of Classes
A class is used directly by calling class methods, and indirectly by
calling methods of an object of that class (or of a derived class that
doesn't override the methods in question).
Classes may also be used as objects in their own right, as instances of
a metaclass, the class "MetaClass" by default. When you declare class
"Dog", you're actually calling a metaclass class method that constructs
a metaclass instance (i.e. the "Dog" class) and then calls the
associated closure (i.e. the body of the class) as a method on the
instance. (With a little grammatical magic thrown in so that "Dog"
isn't considered a bareword.)
The class "Dog" is an instance of the class "MetaClass", but it's also
an instance of the type "Class" when you're thinking of it as a
dispatcher. That is, a class object is really allomorphic. If you
treat one as an instance of Class, it behaves as if it were the user's
view of the class, and the user thinks the class is there only to
dispatch to the user's own class and instance methods. If, however,
you treat the object as an instance of "MetaClass", you get access to
all its metaclass methods rather than the user-defined methods.
Another way to look at it is that the metaclass object is a separate
object that manages the class object. In any event, you can get from
the ordinary class object to its corresponding metaclass object via the
".meta" method, which every object supports.
[Update: every "Dog" object (class or instance) just points to its
metaclass instance. The "Dog" class object isn't either a "Class" or a
"MetaClass" object. Perhaps it does a "Class" role though.]
By the way, a "Class" is a "Module" which in turn is a "Package" which
in turn is an "Object". Or something like that. So a class can always
be used as if it were a mere module or package. But modules and
packages don't have a ".dispatch" method...
[Update: the preceding paragraph is really only true if you take it as
talking about the metaclasses, not the classes. A "Dog" is not a
"Module". "Dog.meta" is a metaclass instance that can do modular and
package operations as well as class operations, and all these
metaobjects use packages as their global storage.]
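To give a feel for reaching the metaclass instance from an ordinary object, here is a sketch in present-day Raku. It assumes the spelling that Raku eventually adopted, ".HOW" with the ".^" shorthand, rather than the ".meta" name used in this document; the "Dog" class is illustrative.

```raku
class Dog {
    has $.name;
    method bark () { 'Woof' }
}

my $rex = Dog.new(name => 'Rex');

# The class object and its instances point at the same metaobject.
say Dog.HOW === $rex.HOW;

# Dog.^name is shorthand for asking the metaobject: Dog.HOW.name(Dog)
say Dog.^name;
say Dog.HOW.name(Dog);

# Introspection goes through the metaobject as well.
say so Dog.^can('bark');
say Dog.^attributes.map(*.name);
```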
By default, classes in Perl are left open. That is, you can add more
methods later. (However, an application may close them.) For
discussion of this, see the section on "Open vs Closed Classes".
Class Name Semantics
Class names (and module names) are just package names.
Unlike in Perl 5, when you mention a package name in Perl 6 it doesn't
always mean a global name, since Perl 6 knows about inner classes and
lexically scoped packages and such. As with other entities in Perl
such as variables and methods, a scan is made for who thinks they have
the best definition of the name, going out from lexical scopes to
package scope to global scope in the case of static class names, and
via method inheritance rules in the case of virtual class names.
Note that "::MyClass" and "MyClass" mean the same thing. In Perl 6, an
initial "::" is merely an optional sigil for when the name of the
package would be misconstrued as something else. It specifically does
not mean (as it does in Perl 5) that it is a top-level package. To
refer to the top-level package, you would need to say something like
"::*MyClass" (or just *MyClass in places where the "*" unary operator
would not be expected.) But also note that the "*" package in Perl is
not the "main" package in the Perl 5 sense.
Likewise, the presence of "::" within a package name like "Fish::Carp"
does not make it a global package name necessarily. Again, it scans
out through various scopes, and only if no local scopes define package
"Fish::Carp" do you get the global definition. And again, you can
force it by saying "::*Fish::Carp". (Or just *Fish::Carp in places
where the "*" unary operator is not expected.) "GLOBAL::Fish::Carp"
means the same thing.
You can interpolate a parenthesized expression within a package name
after any "::". So these are all legal package names (or module names,
or class names):
::($alice)
::($alice)::($bob)
::($alice::($bob))
::*::($alice)::Bob
::('*')::($alice ~ '_misc')::Bob
::(get_my_dir())
::(@multilevel)
And any of those package names could be part of a variable or sub name:
$::($alice)::name
@::($alice)::($bob)::elems[1,2,3]
%::*::($alice)::Bob::map{'xyz'}
&::('*')::($alice ~ '_misc')::Bob::doit(1,2,3)
$::(get_my_dir())::x
$::(@multilevel)
Note in the last example that the final element of @multilevel is taken
to be the variable name. This may be illegal under "use strict refs",
since it amounts to a symbolic reference. (Not that the others aren't
symbolic, but the rules may be looser for package names than for
variable names, depending on how strict our strictures get.)
[Update: There is no "strict refs" anymore, since we have a separate
syntax for when we explicitly want symbolic references.]
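A brief sketch of interpolated package names in present-day Raku, using only the simplest forms. The "Fish::Carp" class and its "swim" method are illustrative, and the exact lookup rules are those of Rakudo, which may differ in detail from the scan described above.

```raku
class Fish {
    class Carp { method swim () { 'glub' } }
}

my $alice = 'Fish';
my $bob   = 'Carp';

# Look a class up by a name computed at run time.
my $outer = ::($alice);
say $outer.^name;

# The interpolated string may itself contain "::".
my $inner = ::("{$alice}::{$bob}");
say $inner.new.swim;
```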
Private Classes
A class named with a single initial colon is a private class name:
class :MyPrivateClass {...}
It is completely ignored outside of the current class. Since the name
is useful only in the current package, it makes no sense to try to
qualify it with a package name. While it's an inner class of sorts, it
does not override any class name from any other class because it lives
in its own namespace (a subnamespace of the current package), and
there's no way to tell if the class you're deriving from declares its
own private class of the same name (apart from digging through the
reflection interfaces).
The colon is orthogonal to the scoping. What's actually going on in
this example is that the name is stored in the package with the leading
colon, because the colon is part of the name. But if you declared "my
class :Golgi" the private name would go into the lexical namespace
with the colon. The colon functions a bit like a "private" trait, but
isn't really a trait. Wherever you might use a private name, the colon
in the name effectively creates a private subspace of names, just as if
you'd prefixed it with "_" in the good old days.
But if it were only that, it would just be encapsulation by convention.
We're trying to do a little better than that. So the language needs to
actively prevent people from accessing that private subspace from
outside the class. You might think that that's going to slow down all
the dispatchers, but probably not. The ordinary dispatch of
"Class.method" and "$obj.method" don't have to worry about it, because
they use bare identifiers. It's only when people start doing
"::($class)" or "$obj.$method" that we have to trap illegal references
to colonic names.
Even though the initial colon isn't really a trait, if you interrogate
the ".private" property of the class, it will return true. You don't
have to parse the name to get that info.
We'll make more of this when we talk about private methods and
attributes. Speaking of methods...
[Update: Private attributes are now marked with "!" rather than ":",
but nowadays you should just use a lexically scoped class if you want a
private class.]
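A minimal sketch of the lexically scoped alternative mentioned in the update, in present-day Raku. The names are invented; the point is only that "Ribosome" is usable inside the enclosing block and has no package-qualified name outside it.

```raku
class Cell {
    my class Ribosome {                     # lexical: visible only in this block
        method build () { 'protein' }
    }

    method work () { Ribosome.new.build }   # the enclosing class can use it freely
}

say Cell.new.work;
# Outside the block there is no Cell::Ribosome to refer to.
```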
Methods
Methods are the actions that a class knows how to invoke on behalf of
an object of that type (or on behalf of itself, as a class object).
But you knew that already.
As in Perl 5, a method is still "just a funny subroutine", but in Perl
6 we use a different keyword to declare it, both because it's better
documentation, and because it captures the metadata for the class at
compile time. Ordinary methods may be declared only within the scope
of a class definition. (Multimethods are exempt from this restriction,
however.)
Declaration of Methods
To declare a method, use the "method" keyword just as you would use
"sub" for an ordinary subroutine. The declaration is otherwise almost
identical:
method doit ($a, $b, $c) { ... }
The one other difference is that a method has an invocant on behalf of
which the method is called. In the declaration above, that invocant is
implicit. (It is implicitly typed to be the same as the current
surrounding class definition.) You may, however, explicitly declare
the invocant as the first argument. The declaration knows you're doing
that because you put a colon between the invocant and the rest of the
arguments:
method doit ($self: $a, $b, $c) { ... }
In this case, we didn't specify the type of $self, so it's an untyped
variable. To make the exact equivalent of the implicit declaration,
type the invocant with the current class:
method doit (MyClass $self: $a, $b, $c) { ... }
or more generically using the "::_" "current class" pronoun:
method doit (::_ $self: $a, $b, $c) { ... }
[Update: The current lexical class is now named via "::?CLASS". To
capture the actual class of the invocant, use a type variable like
"::RealClass" in the declaration, with or without the $self.]
In any case, the method sets the current invocant as the topic, which
is also known as the $_ variable. However, the topic can change
depending on the code inside the method. So you might want to declare
an explicit invocant when the meaning of $_ might change. (For further
discussion of topics see Apocalypse 4. For a small writeup on sub
signatures see Apocalypse 6.)
[Update: methods do not set the topic now unless you declare the
invocant with the name $_.]
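The invocant declarations discussed above can be sketched in present-day Raku as follows. The "Account" and "Savings" classes are hypothetical; the example contrasts the lexically current class ("::?CLASS") with the actual class of the invocant.

```raku
class Account {
    has $.balance is rw = 0;

    # Explicit invocant: the colon separates it from the other parameters.
    method deposit ($self: $amount) { $self.balance += $amount; $self }

    method declared-in ()  { ::?CLASS.^name }   # the class this code was written in
    method actual-class () { self.^name }       # the run-time class of the invocant
}

class Savings is Account { }

my $s = Savings.new;
$s.deposit(10).deposit(5);       # deposit returns its invocant, so calls chain

say $s.balance;
say $s.declared-in;
say $s.actual-class;
```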
A private method is declared with a colon on the front:
method :think (Brain $self: $thought)
Private methods are callable only by the class itself, and by trusted
"friends". More about that when we talk about attributes.
[Update: private methods are marked with "!" now rather than ":".]
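With the "!" marking from the update, a private method might look like this in present-day Raku. The "Brain" class follows the example above, while "ponder" and the stored list of thoughts are invented for illustration.

```raku
class Brain {
    has @!thoughts;

    # Private: callable only from inside the class, as self!think(...)
    method !think ($thought) {
        @!thoughts.push($thought);
        @!thoughts.elems
    }

    # Public wrapper that delegates to the private method.
    method ponder ($thought) { self!think($thought) }
}

my $brain = Brain.new;
my $count = $brain.ponder('cogito');
say $count;

# An ordinary method call cannot see the private method.
my $reached = (try { $brain.think('ergo'); True }) // False;
say $reached;
```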
Use of Methods
As in Perl 5, there are two notations for calling ordinary methods.
They are called the "dot" notation and the "indirect object" notation.
The dot notation
Perl 6's "dot" notation is just the industry-standard way to call a
method these days. (This used to be "->" in Perl 5.)
$object.doit("a", "b", "c");
If the object in question is the current topic, $_, then you can use
the unary form of the dot operator:
for @objects {
    .doit("a", "b", "c");
}
A simple variable may be used for an indirectly named method:
my $dosomething = "doit";
$object.$dosomething("a", "b", "c");
As in Perl 5, if you want to do anything fancier, use a temporary
variable.
The parentheses may also be omitted when the following code is
unambiguously a term or operator, so you can write things like this:
@thumbs.each { .twiddle } # same as @thumbs.each({.twiddle})
$thumb.twiddle + 1 # same as $thumb.twiddle() + 1
.mode 1 # same as $_.mode(1)
[Update: This is retracted. Parens are always required if there are
arguments.]
[Update: As an alternative to parens, you can turn any otherwise legal
method call into a list operator by appending a ":".]
(Parens are always required around the argument list when a method call
with arguments is interpolated into a string.)
The parser will make use of whitespace at this point to decide some
things. For instance
$obj.method + 1
is obviously a method with no arguments, while
$obj.method +1
is obviously a method with an argument. However, the dwimmery only
goes as far as the typical person's visual intuition. Any construct
too ambiguous is simply rejected. So
$obj.method+1
produces a parse error.
[Update: The preceding is also retracted. None of those would be
interpreted as arguments now. However, most of the following is still
true.]
In particular, curlies, brackets, or parens would be interpreted as
postfix subscripts or argument lists if you leave out the space. In
other words, Perl 6 distinguishes:
$obj.method ($x + $y) + $z # means $obj.method(($x + $y) + $z)
[Update: that's illegal now.]
from
$obj.method($x + $y) + $z # means ($obj.method($x + $y)) + $z
Yes, this is different from Perl 5. And yes, I know certain people
hate it. They can write their own grammar.
While it's always possible to disambiguate with parentheses, sometimes
that is just too unsightly. Many methods want to be parsed as if they
were list operators. So as an alternative to parenthesizing the entire
argument list, you can disambiguate by putting a colon between the
method call and the argument list:
@thumbs.each: { .twiddle } # same as @thumbs.each({.twiddle})
$thumb.twiddle: + 1 # same as $thumb.twiddle(+ 1)
.mode: 1 # same as $_.mode(1)
$obj.for: 1,2,3 -> $i { ... }
[Update: There is no colon disambiguator any more. Use parens if there
are arguments. (However, you can pass an adverbial block using ":{}"
notation with a null key. That does not count as an ordinary
argument.)]
[Update: The colon disambiguator is back again. The problem with
precedence is now solved by requiring that a bare closure in the list
is always interpreted as the final argument unless followed by a comma
or comma surrogate.]
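Putting the surviving forms side by side, a small sketch in present-day Raku might look like the following. The "Greeter" class is hypothetical. Note one assumption that goes beyond the text above: in current Raku, a method named by a string is written with quotes around the variable, and the parentheses are then required.

```raku
# Hypothetical class, for illustration only.
class Greeter {
    method greet($who) { "Hello, $who!" }
}

my $g = Greeter.new;

say $g.greet("world");        # dot notation with parentheses
say $g.greet: "world";        # colon form: the rest of the list is the arguments

for $g {
    say .greet("topic");      # unary dot calls the method on $_
}

my $name = "greet";
say $g."$name"("again");      # method chosen by name at run time
```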
If a method is declared with the trait "is rw", it's an lvalue
method, and you can assign to it just as if it were an ordinary
variable:
method mystate is rw () { return $:secretstate }
$object.mystate = 42;
print $object.mystate; # prints 42
In fact, it's a general rule that you can use an argumentless "rw"
method call anywhere you might use a variable:
temp $state.pi = 3;
$tailref = \$fido.tail;
(Though occasionally you might need to supply parentheses to
disambiguate, since the compiler can't always know at compile time
whether the method has any arguments.)
[Update: the parens (or a colon) are required if there are arguments.]
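For comparison, here is a hedged sketch of an lvalue method as it can be written in present-day Raku. The class name "Machine" is made up, the private attribute uses the "!" twigil mentioned in later updates, and the trait goes after the signature rather than before it.

```raku
# Hypothetical class, for illustration only.
class Machine {
    has $!secretstate = 0;

    # The last expression (rather than "return") hands back the
    # container itself, which is what makes assignment possible.
    method mystate() is rw { $!secretstate }
}

my $m = Machine.new;
$m.mystate = 42;
say $m.mystate;
```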
Method calls on container objects are obviously directed to the
container object itself, not to the contents of the container:
$elems = @array.elems;
@keys = %hash.keys;
$sig = &sub.signature;
However, with scalar variables, methods are always directed to the
object pointed to by the reference contained in the scalar:
$scalar = @array; # (implied \ in scalar context)
$elems = $scalar.elems; # returns @array.elems
or for value types, the appropriate class is called as if the value
were a reference to a "real" object.
$scalar = "foo";
$chars = $scalar.chars; # calls Str::chars or some such
In order to talk to the scalar container itself, use the "tied()"
pseudo-function as you would in Perl 5:
if tied($scalar).constant {...}
(You may recall, however, that in Perl 6 it's illegal to tie any
variable without first declaring it as tieable, or (preferably) tying it
directly in the variable's declaration. Otherwise the optimizer would
have to assume that every variable has semantics that are unknowable in
advance, and we would have to call it a pessimizer rather than an
optimizer.)
[Update: the function is now called "variable()" instead of tied() to
avoid that confusion.]
The "indirect object" notation
The other form of method call is known as the "indirect object" syntax,
although it differs from Perl 5's syntax in that a colon is required
between the indirect object (the invocant) and its arguments:
doit $object: "a", "b", "c"
The colon may be omitted if there are no arguments (besides the
invocant):
twiddle $thumb;
$x = new X;
Note that indirect object calls may not be directly interpolated into a
string, since they don't start with a sigil. You can always use the
"$()" expression interpolater though:
say "$(greet $lang), world!";
[Update: The "$()" notation is gone. Use
say "{greet $lang}, world!";
instead.]
As in Perl 5, the indirect object syntax is valid only if you haven't
declared a subroutine locally that overrides the method lookup. That
was a bit of a problem in Perl 5 since, if there happened to be a "new"
constructor in your class, it would call that instead of dispatching to
the class you wanted it to. That's much less of a problem in Perl 6,
however, because Perl 6 cannot confuse a method declaration with a
subroutine declaration. (Which is yet another reason for giving
methods their own keyword.)
Another factor that makes indirect objects work better in Perl 6 is
that the class name in "new X" is a predeclared object, not a bare
identifier. (Perl 5 just had to guess when it saw two bare identifiers
in a row that you were trying to call a class method.)
The indirect object syntax may not be used with a variable for the
methodname. You must use dot notation for that.
Because of precedence, the indirect object notation may not be used as
an lvalue unless you parenthesize it:
(mystate $object) = 42;
(findtail Dog: "fido") = Wagging::on;
You may parenthesize an argumentless indirect object method to make it
look like a function:
mystate($object) = 42;
twiddle($thumb);
The dispatch rules for methods and global multi subs conspire to keep
these unambiguous, so the user really doesn't have to worry about
whether
close($handle);
is implemented as a global multi sub or a method on a $handle object.
In essence, the multimethod dispatching rules degenerate to ordinary
method dispatch when there are no extra arguments to consider (and
sometimes even when there are arguments). This is particularly
important because Perl uses these rules to tell the difference between
print "Howdy, world!\n"; # global multi sub
and
print $*OUT; # ordinary filehandle method
However, you must still put the colon after the invocant if there are
other arguments. The colon tells the parser whether to look for the
arguments inside:
doit($object: "a", "b", "c")
or outside:
doit($object): "a", "b", "c"
If you do say
doit($object, "a", "b", "c")
the first comma forces it to be interpreted as a sub call rather than a
method call.
(We could have decided to say that whenever Perl can't find a "doit()"
sub definition at run time, it should assume you meant the entire
parenthesized list to be the indirect object, which, since it's in
scalar context would automatically generate a list reference and call
"[$object,"a","b","c"].doit()", which is unlikely to be what you mean,
or even work. (Unless, of course, that's how you really meant it to
work.) But I think it's much more straightforward to simply disallow
comma lists at the top level of an indirect object. The old "if it
looks like a function" rule applies here. Oddly, though, function
syntax is how you call multisubs in Perl 6. And as it happens, the way
the multisub/multimethod dispatch rules are defined, it could still end
up calling "$object.doit("a", "b", "c")" if that is deemed to be the
best choice among all the candidates. But syntactically, it's not an
indirect object. More on dispatch rule interactions later.)
The comma still doesn't work if you go the other way and leave out the
parens entirely, since
doit $object, "a", "b", "c";
would always (in the absence of a prior sub declaration) be parsed as
(doit $object:), "a", "b", "c";
So a print with both an indirect object and arguments has to look like
one of these:
print $*OUT: "Howdy, world!\n";
print($*OUT: "Howdy, world!\n");
print($*OUT): "Howdy, world!\n";
Note that the old Perl 5 form using curlies:
print {some_hairy_expression()} "Howdy, world!\n";
should instead now be written with parentheses:
print (some_hairy_expression()): "Howdy, world!\n";
though, in fact, in this case the parens are unnecessary:
print some_hairy_expression(): "Howdy, world!\n";
You'd only need the parens if the invocant expression contained
operators lower in precedence than comma (comma itself not being
allowed). Basically, if it looks confusing to you, you can expect it
to look confusing to the compiler, and to make the compiler look
confused. But it's a feature for the compiler to look confused when it
actually is confused. (In Perl 5 this was not always so.)
Note that the disambiguating colon associates with the closest method
call, whether direct or indirect. So
print $obj.meth: "Howdy, world!\n";
passes "Howdy, world!\n" to "$obj.meth" rather than to "print".
That's a case where you ought to have parenthesized the indirect object
for clarity anyway:
print ($obj.meth): "Howdy, world!\n";
Calling private methods
A private method does not participate in normal method dispatch. It is
not listed in the class's public methods. The ".can" method does not
see it. Calling it via normal dispatch raises a "no such method"
exception. It is, in essence, invisible to the outside world. It does
not hide a base class's method of the same name--even in the current
class! It's fair to ask for warnings about name collisions, of course.
But we're not following the C++ approach of making private methods
visible but uncallable, because that would violate encapsulation, and
in particular, Liskov substitutability
<http://en.wikipedia.org/wiki/Liskov_substitution_principle>. Instead,
we separate the namespaces completely by distinguishing the public dot
operator from the private dot-colon operator. That is:
$mouth.say("Yes!") # always calls public .say method
.say("Yes!") # unary form
$brain.:think("No!") # always calls private :think method
.:think("No!") # unary form
[Update: we now use "!" instead of ".:" and there is no unary form.]
The inclusion of the colon prevents any kind of "virtual" behavior.
Calling a private method is illegal except under two very specific
conditions. You can call a private method ":think" on an object $brain
only if:
1. The class of $brain is explicitly declared, and the declared class
is either the class definition that we are in or a class that has
explicitly granted trust to our current class, and the declared
class contains a private ":think" method. Or...
2. The class of the $brain is not declared, and the current class
contains a private ":think" method.
The upshot of these rules is that a private method call is essentially
a subroutine call with a method-like syntax. But the private method
we're going to call can be determined at compile time, just like a
subroutine.
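To see the separate namespace in action, here is a small sketch using the "!" spelling from the update above, as it works in present-day Raku. The "Brain" class and its "ponder" method are illustrative only.

```raku
# Hypothetical class, for illustration only.
class Brain {
    method !think($thought) { "Hmm, $thought" }        # private
    method ponder($thought) { self!think($thought) }   # private call from inside
}

my $brain = Brain.new;
say $brain.ponder("No!");

# Ordinary dispatch never sees the private method, so this call fails
# at run time and "try" hands back an undefined value.
my $outside = try $brain.think("No!");
say $outside.defined ?? "reached the private method" !! "no such public method";
```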
Class Methods
Class methods are called on the class as a whole rather than on any
particular instance object of the class. They are distinguished from
ordinary methods only by the declared type of the invocant. Since an
implicit invocant would be typed as an object of the class and not as
the class itself, the invocant declaration is not optional in a class
method declaration if you wish to specify the type of the invocant.
(Untyped explicit invocants are allowed to "squint", however.)
[Update: squinting is no longer necessary now that we've defined class
objects to merely be uninstantiated prototypes of the same type as the
instances in the class.]
Class Invocant
To declare an ordinary class method, such as a constructor, you say
something like:
method new (Class $class: *@args) { ... }
Such a method may only be called with an invocant that "isa" "Class",
that is, an object of type "Class", or derived from type "Class".
[Update: no Class class exists.]
Class|object Invocant
It is possible to write a method that can be called with an invocant
that is either a "Class" or an object of the current class. You can
declare the method with a type junction:
method new (Class|Dog $classorobj: *@args) { ... }
Or to be completely non-specific, you can leave out the type entirely:
method new ($something: *@args) { ... }
That's not as dangerous as it looks, since almost by definition the
dispatcher only calls methods that are consistent with the inheritance
tree. You just can't say:
method new (*@args) { ... }
which would be the equivalent of
method new (Dog $_: *@args) { ... }
Well, actually, you could say that, but it would require that you have
an existing "Dog"-compatible object in order to create a new one. And
that could present a little bootstrapping problem...
[Update: we simply made "Dog" a "Dog"-compatible object.]
(Though it could certainly cure the boot chewing problem...)
But in fact, you'll rarely need to declare a "new" method at all, because
Perl supplies a default constructor to go with your class.
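Given the updates above (the class object is just an uninstantiated prototype of the same type), a single method can serve both kinds of invocant. A minimal sketch in present-day Raku, with an invented "describe" method, might be:

```raku
# Hypothetical class, for illustration only.
class Dog {
    has $.name = "unnamed";

    # Works on the class itself and on instances alike; only an
    # instance is defined, so we ask the invocant which one it is.
    method describe() {
        self.defined
            ?? "a Dog called " ~ self.name
            !! "the Dog class itself";
    }
}

say Dog.describe;
say Dog.new(name => "Fido").describe;
```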
Submethods
Some methods are intended to be inherited by derived classes. Others
are intended to be reimplemented in every class, or in every class that
doesn't want the default method. We call these "submethods", because
they work a little like subs, and a little like methods. (You can also
read the "sub" with the meaning it has in words like "subhuman".)
Typically these are (sub)methods related to the details of construction
and destruction of the object. So when you call a constructor, for
instance, it ends up calling the "BUILDALL" initialization routine for
the class, which ends up calling the "BUILD" submethod:
submethod BUILD ($a, $b, $c) {
    $.a = $a;
    $.b = $b;
    $.c = $c;
}
Since the submethod is doing things that make sense only in the context
of the current class (such as initializing attributes), it makes no
sense for "BUILD" to be inherited. Likewise "DESTROY" is also a
submethod.
Why not just make them ordinary subs, then? Ordinary subs can't be
called by method invocation, and we want to call these routines that
way. Furthermore, if your base class does define an ordinary method
named "BUILD" or "DESTROY", it can serve as the default "BUILD" or
"DESTROY" for all derived classes that don't declare their own
submethods. (All public methods are virtual in Perl, but some are more
virtual than others.)
You might be saying to yourself, "Wait, private methods aren't virtual.
Why not just use a private method for this?" It's true that private
methods aren't virtual, because they aren't in fact methods at all.
They're just ordinary subroutines in disguise. They have nothing to do
with inheritance. By contrast, submethods are all about presenting a
unified inherited interface with the option of either inheriting or not
inheriting the implementation of that interface, at the discretion of
the class doing the implementing.
So the bottom line is that submethods allow you to override an
inherited implementation for the current class without overriding the
default implementation for other classes. But in any case, it's still
using a public interface, called as an ordinary method call, from
anywhere in your program that has an object of your type.
Or a class of your type. The default "new" constructor is an ordinary
class method in class "Object", so it's inherited by all classes that
don't define their own "new". But when you write your own "new", you
need to decide whether your constructor should be inherited or not. If
so, that's good, and you should declare it as a method. But if not,
you should declare it as a submethod so that derived classes don't try
to use it erroneously instead of the default "Object.new()".
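The difference between a method and a submethod is easy to show in a few lines. This sketch is written in present-day Raku, and the class and routine names are hypothetical.

```raku
# Hypothetical classes, for illustration only.
class Base {
    method    shared()   { "inherited by subclasses" }
    submethod own-only() { "belongs to Base alone" }
}
class Derived is Base { }

say Derived.new.shared;       # ordinary methods are inherited
say Base.new.own-only;        # the submethod is callable on its own class

# On the derived class the submethod is simply not there, so the call
# fails at run time and "try" hands back an undefined value.
my $result = try Derived.new.own-only;
say $result.defined ?? "found" !! "not found";
```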
Attributes
In Perl 6, "attributes" are what we call the instance variables of an
object. (We used that word to mean something else in Perl 5--we're now
calling those things "traits" or "properties".)
As with classes and methods, attribute declarations are apparently
declarative. Underneath they actually call a method in the metaclass
to install the new definition. The Perl 6 implementation of attributes
is not based on a hash, but on something more like a symbol table.
Attributes are stored in an opaque datatype rather like a struct in C,
or an array in Perl 5--but you don't know that. The datatype is opaque
in the sense that you shouldn't care how it's laid out in memory
(unless you have to interface with an outside data structure--like a C
struct). Do not confuse opacity with encapsulation. Encapsulation
only hides the object's implementation from the outside world. But the
object's structure is opaque even to the class that defines it.
One of the large benefits of this is that you can actually take a C or
C++ data structure and wrap accessor methods around it without having
to copy anything into a different data structure. This should speed up
things like XML parsing.
Declaration of Attributes
In order to provide this opaque abstraction layer, attributes are not
declared as a part of any other data structure. Instead, they are
modeled on real variables, whose storage details are implicitly
delegated to the scope in which they are declared. So attributes are
declared as if they were normal variables, but with a strange scope and
lifetime that is neither "my" nor "our". (That scope is, of course,
the current object, and the variable lives as long as the object
lasts.) The class will implicitly store those attributes in a location
distinct from any other class's attributes of the same name, including
any base or derived class. To declare an attribute variable, declare
it within the class definition as you would a "my" variable, but use
the "has" declarator instead of "my":
class Dog is Mammal {
    has $.tail;
    has @.legs;
    ...
}
The "has" declarator was chosen to remind people that attributes are in
a "HASA" relationship to the object rather than an "ISA" relationship.
The other difference from normal variables is that attributes have a
secondary sigil that indicates that they are associated with methods.
When you declare an attribute like "$.tail", you're also implicitly
declaring an accessor method of the same name, only without the "$" on
the front. The dot is there to remind you that it's also a method
call.
As with other declarations, you may add various traits to an attribute:
has $.dogtag is rw;
If you want all your attributes to default to "rw", you can put the
trait on the class itself:
class Coordinates is rw {
    has int $.x;
    has int $.y;
    has int $.z;
}
Essentially, it's now a C-style struct, without having to introduce an
ugly word like "struct" into the language. Take that, C++. ":-)"
You can also assign to a declaration:
has $.master = "TheDamian";
Well, actually, this looks like an assignment, but it isn't. The
effect of this is to establish a default; it is not executed at run
time. (Or more precisely, it runs when the class closure is executed
by the metaclass, so it gets evaluated only once and the value is
stored for later use by real instances. More below.)
Use of Attributes
The attribute behaves just like an ordinary variable within the class's
instance methods. You can read and write the attributes just like
ordinary variables. (It is, however, illegal to refer to an instance
attribute variable (that is, a "has" variable) from within a class
method. Class methods may only access class attributes, not instance
attributes. See below.)
[Update: class methods are no longer distinguished from instance
methods by the compiler, except insofar as the compiler can tell that a
method refers to instance variables in the body. It's a run-time error
to ask for an attribute of a class object, but that's because the class
object doesn't have the attribute, not because it's a class object.]
Bare attributes are automatically hidden from the outside world because
their sigiled names cannot be seen outside the class's package. This
is how Perl 6 enforces encapsulation. Outside the class the only way
to talk about an attribute is through accessor methods. Since public
methods are always virtual in Perl, this makes attribute access virtual
outside the class. Always. (Unless you give the optimizer enough
hints to optimize the class to "final". More on that later.)
In other words, only the class itself is allowed to know whether this
attribute is, in fact, implemented by this class. The class may also
choose to ignore that fact, and call the abstract interface, that is,
the accessor method, in which case it might actually end up calling
some derived class's overriding method, which might in turn call back
to this class's accessor as a super method. (So in general, an
accessor method should always refer to its actual variable name rather
than the accessor method name to avoid infinite recursion.)
[Update: all public $.foo variables are now virtual, and the actual
private storage is represented by $!foo.]
You may write your own accessor methods around the bare attributes, but
if you don't, Perl will generate them for you based on the declaration
of the attribute variable. The traits of the generated method
correspond directly to the traits on the variable.
By default, a generated accessor is read-only (because by default any
method is read-only). If you mark an attribute with the trait "is rw",
though, the corresponding generated accessor will also be marked
"is rw", meaning that it can be used as an lvalue.
In any event, even without "is rw" the attribute variable is always
writable within the class itself (unless you apply the trait "is
constant" to it).
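A short sketch of those rules, written in present-day Raku, where (per the update above) the private storage behind "$.foo" is spelled "$!foo". The "Dog" class and its "dock" method are invented for the example.

```raku
# Hypothetical class, for illustration only.
class Dog {
    has $.tail = "long";      # generated accessor is read-only
    has $.dogtag is rw;       # generated accessor is an lvalue

    # Inside the class the underlying storage is always writable.
    method dock() { $!tail = "short" }
}

my $spot = Dog.new(dogtag => 7);
$spot.dogtag = 8;             # fine: the accessor is "is rw"
$spot.dock;                   # fine: the class changes its own attribute
say $spot.tail;
say $spot.dogtag;

# Assigning through the read-only accessor fails at run time.
try { $spot.tail = "curly" }
say $!.defined ?? "assignment refused" !! "assignment allowed";
```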
As with private classes and methods, attributes are declared private
using a colon on the front of their names. As with any private method,
a private accessor is completely ignored outside its class (or, by
extension, the classes trusted by this class).
To carry the separate namespace idea through, we incorporate the colon
as the secondary sigil in declarations of private attributes:
has $:x;
Then we can get rid of the verbose "is private" altogether. (Well, it's
still there as a trait, but the colon implies it, and is required
anyway.) And we basically force people to document the private/public
distinction every place they reference $:x instead of "$.x", or
"$obj.:meth" instead of "$obj.meth".
We've seen secondary sigils before in earlier Apocalypses. In each
case they're associated with a bizarre usage of some sort. So far we
have:
$*foo # a truly global global (in every package)
$?foo # a regex-scoped variable
$^foo # an autodeclared parameter variable
$.foo # a public attribute
$:foo # a private attribute
[Update: A regex-scoped variable now looks like "$<foo>" instead. A
"$?foo" variable is now a compiler variable, and a "$=foo" variable is
a POD variable. A private attribute is "$!foo". By the way, lately
we've been calling the secondary sigils "twigils".]
As a form of the dreaded "Hungarian notation", secondary sigils are not
introduced lightly. We define secondary sigils only where we deem
instant recognizability to be crucial for readability. Just as you
should never have to look at a variable and guess whether it's a true
global, you should never have to look at a method and guess which
variables are attributes and which ones are variables you just happen
to be in the lexical scope of. Or which attributes are public and
which are private. In Perl 6 it's always obvious--at the cost of a
secondary sigil.
We do hereby solemnly swear to never, never, ever add tertiary sigils.
You have been warned.
Default Values
You can set default values on attributes by pseudo-assignment to the
attribute declaration:
has Answer $.ans = 42;
These default values are associated as "build" traits of the
attribute declaration object. When the "BUILD" submethod is
initializing a new object, these prototype values are used for
uninitialized attributes. The expression on the right is evaluated
immediately at the point of declaration, but you can defer evaluation
by passing a closure, which will automatically be evaluated at the
actual initialization time. (Therefore, to initialize to a closure
value, you have to put a closure in a closure.)
Here's the difference between those three approaches. Suppose you say:
class Hitchhiker {
    my $defaultanswer = 0;
    has $.ans1 = $defaultanswer;
    has $.ans2 = { $defaultanswer };
    has $.ans3 = { { $defaultanswer } };
    $defaultanswer = 42;
    ...
}
When the object is eventually constructed, "$.ans1" will be initialized
to 0, while "$.ans2" will be initialized to 42. (That's because the
closure binds $defaultanswer to the current variable, which still
presumably has the value 42 by the time the "BUILD" routine initializes
the new object, even though the lexical variable "$defaultanswer" has
supposedly gone out of scope by the time the object is being
constructed. That's just how closures work.)
And "$.ans3" will be initialized not to 42, but to a closure that, if
you ever call it, will also return 42. So since the accessor
"$obj.ans3()" returns that closure, "$obj.ans3().()" will return 42.
The default value is actually stored under the "build" trait, so
this:
has $.x = calc($y);
is equivalent to this:
has $.x is build( calc($y) );
and this:
has $.x = { calc($y) };
is equivalent to either of these:
has $.x is build( { calc($y) } );
has $.x will build { calc($y) };
As with all closure-valued container traits, the container being
declared (the "$.x" variable in this case) is passed as the topic to
the closure (in addition to being the target that will be initialized
with the result of the closure, because that's what "build" does). In
addition to the magical topic, these build traits are also magically
passed the same named arguments that are passed to the "BUILD" routine.
So you could say
has $.x = { calc($^y) };
to do a calculation based on the ":y(582)" parameter originally passed
to the constructor. Or rather, that will be passed to the constructor
someday when the object is eventually constructed. Remember we're
really still at class construction time here.
As with other initializers, you can be more specific about the time at
which the default value is constructed, as long as that time is earlier
than class construction time:
has $.x = BEGIN { calc() }
has $.x = CHECK { calc() }
has $.x = INIT { calc() }
has $.x = FIRST { calc() }
has $.x = ENTER { calc() }
which are really just short for:
has $.x is build( BEGIN { calc() } )
has $.x is build( CHECK { calc() } )
has $.x is build( INIT { calc() } )
has $.x is build( FIRST { calc() } )
has $.x is build( ENTER { calc() } )
Class Attributes
In general, class attributes are just package or lexical variables. If
you define a package variable with a dot or colon, it autogenerates an
accessor for you just as it does for an ordinary attribute:
our $.count; # generates a public read-only .count accessor
our %:cache is rw; # generates a private read-write .:cache accessor
The implicit invocant of these implicit accessors has a "squinting"
type--it can either be the class or an object of the class. (Declare
your own accessors if you have a philosophical reason for forcing the
type one way or the other.)
[Update: squinting no longer necessary.]
The disadvantage of using "our" above is that both of these are
accessible from outside the class via their package name (though the
private one is Officially Ignored, and cannot be named simply by saying
%MyClass:::cache because that syntax is specifically disallowed).
If on the other hand you declare your class variables lexically:
my $.count; # generates a read-only .count accessor
my %:cache is rw; # generates a read-write .:cache accessor
then the same pair of accessors are generated, but the variables
themselves are visible only within the class block. If you reopen the
class in another block, you can only see the accessors, not the bare
variables. This is probably a feature.
Generally speaking, though, unless you want to provide public accessors
for your class attributes, it's best to just declare them as ordinary
variables (either "my" or "state" variables) to prevent confusion with
instance attributes. It's a good policy not to declare any public
accessors until you know you need them. They are, after all, part of
your contract with the outside world, and the outside world has a way
of holding you to your contracts.
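Following that advice, here is a minimal sketch in present-day Raku of class-level state kept in an ordinary lexical, with one hand-written accessor. The "Ticket" class and its names are hypothetical.

```raku
# Hypothetical class, for illustration only.
class Ticket {
    my $issued = 0;                          # class-level state, visible only in this block
    has $.id;

    submethod BUILD() { $!id = ++$issued }   # every new object bumps the shared counter

    # Written by hand, and only because we decided we want it public.
    method issued() { $issued }
}

my $first  = Ticket.new;
my $second = Ticket.new;
say $second.id;
say Ticket.issued;
```

Since the accessor never touches an instance attribute, it can be called on the class or on any of its objects.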
Object Construction
The basic idea here is to remove the drudgery of creating objects. In
addition we want object creation and cleanup to work right by default.
In Perl 5 it's possible to make recursive construction and destruction
work, but it's not the default, and it's not easy.
Perl 5 also confused the notions of constructor and initializer. A
constructor should create a new object once, then call all the
appropriate initializers in the inheritance tree without recreating the
object. The initializer for a base class should be called before the
initializer for any class derived from it.
The initializer for a class is always named "BUILD". It's in uppercase
because it's usually called automatically for you at construction time.
As with Perl 5, a constructor is only named "new" by convention, and
you can write a constructor with any name you like. However, in Perl
6, if you do not supply a "new" method, a generic one will be
provided (by inheritance from "Object", as it happens).
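The ordering of initializers can be observed directly. In this sketch (present-day Raku, with made-up class names and a log array that exists only for the demonstration) the inherited "new" creates one object and then runs each class's "BUILD" from the base class downward:

```raku
# Hypothetical classes and log, for illustration only.
my @order;

class Animal        { submethod BUILD() { @order.push: "Animal" } }
class Dog is Animal { submethod BUILD() { @order.push: "Dog" } }

my $dog = Dog.new;            # one object, two initializers
say @order.join(", ");
```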
The Default Constructor
The default "new" constructor looks like this:
multi method new (Class $class: *%_) {
    return $class.bless(0, *%_);
}
[Update: now that we have parametric types it's more like:
method new (::T: *%_) {
    return ::T.bless(*%_);
}
That binds the class type of the invocant regardless of how
instantiated the invocant is. Also, the candidate argument need not be
supplied.]
The arguments for the default constructor are always named arguments,
hence the *%_ declaration to collect all those pairs and pass them on
to bless.
You'll note also that "bless" is no longer a subroutine but a method
call, so it's now impossible to omit the class specification. This
makes it easier to inherit constructors. You can still bless any
reference you could bless in Perl 5, but where you previously used a
function to do that:
# Perl 5 code...
return bless( {attr => "hi"}, $class );
in Perl 6 you use a method call:
# Perl 6 code...
return $class.bless( {attr => "hi"} );
However, if what you pass as the first argument isn't a reference,
"bless" is going to construct an opaque object and initialize it. In a
sense, "bless" is the only real constructor in Perl 6. It first makes
sure the data structure is created. If you don't supply a reference to
bless, it calls "CREATE" to create the object. Then it calls
"BUILDALL" to call all the initializers.
The signature of "bless" is something like:
method bless ($class: $candidate, *%_)
The 0 candidate indicates the built-in opaque type. If you're really
strange in the head, you can think of the "0" as standing for
"0paque". Or it's the "zero" object, about which we know zip.
Whatever tilts your windmill...
In any event, strings are reserved for other object layouts. We could
conceivably have things like:
return $class.bless("Cstruct", *%_);
So as it happens, 0 is short for the layout "P6opaque".
[Update: There is no 0 argument. Just leave it out if you wish to
declare a "P6opaque" object. If you wish to pass some other
representation name to "CREATE", just call "CREATE" first and pass the
result of that as your candidate to "bless".]
Any additional arguments to ".bless" are automatically passed on to
"CREATE" and "BUILDALL". But note that these must be named arguments.
It could be argued that the only real purpose for writing a ".new"
constructor in Perl 6 is to translate different positional argument
signatures into a unified set of named arguments. Any other
initialization common to all constructors should be done within
"BUILD".
Oh, the invocant of ".bless" is either a class or an object of the
class, but if you use an object of the class, the contents of that
object are not automatically used to prototype the new object. If you
wish to do that, you have to do it explicitly by copying the
attributes:
$obj.bless(0, *%$obj)
(That is just a specific application of the general principle that if
you treat any object like a hash, it will behave like one, to the
extent that it can. That is, %$obj turns the attributes into key/value
pairs, and passes those as arguments to initialize the new object.
Note that %$obj includes the private attributes when used inside the
class, but not outside.)
Just because ".bless" allows an object to be used for a class doesn't
mean your "new" constructor has to do the same. Some folks have
philosophical issues with mixing up classes and objects, and it's fine
to disallow that on the constructor level. In fact, you'll note that
the default ".new" above requires a "Class" as its invocant. Unless
you override it, it doesn't allow an object for the constructor
invocant. Go thou and don't likewise.
[Update: Sorry, the new "::T" parametric type syntax doesn't care
whether the invocant is a class or an object. It does, however,
document that you're only interested in the type of the invocant.]
The Default Cloner
Another good reason not to overload ".new" to do cloning is that Perl
will also supply a default ".clone" routine that works something like
this:
multi method clone ($proto: *%_) {
return $proto.bless(0, *%_, *%$proto);
}
Note the order of the two hash arguments to "bless". This gives the
supplied attribute values precedence over the copied attribute values,
so that you can change some of the attributes en passant if you like.
That's because we're passing the two flattened hashes as arguments to
".bless" and Perl 6's named argument binding mechanism always picks the
first argument that matches, not the last. This is opposite of what
happens when you use the Perl 5 idiom:
%newvals = (%_, %$proto);
In that case, the last value (the one in %$proto) would "win".
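The two precedence rules can be modelled in a few lines. What follows is a rough Python sketch rather than Perl 6; "bind_first_wins" and "merge_last_wins" are hypothetical helper names invented for the illustration, and the flattened hashes are represented as lists of (key, value) pairs.

```python
def bind_first_wins(pairs):
    """Model of named-argument binding: the first pair for a key is kept."""
    bound = {}
    for key, value in pairs:
        bound.setdefault(key, value)   # later duplicates are ignored
    return bound


def merge_last_wins(pairs):
    """Model of Perl 5 list-to-hash assignment: the last pair for a key is kept."""
    merged = {}
    for key, value in pairs:
        merged[key] = value            # later duplicates overwrite
    return merged


supplied = [("tail", "long")]                  # plays the role of *%_
copied = [("tail", "short"), ("legs", 4)]      # plays the role of *%$proto

cloned = bind_first_wins(supplied + copied)        # supplied value survives
perl5_style = merge_last_wins(supplied + copied)   # copied value survives
```

With the same argument order, the binding model keeps the caller's "tail" while the hash-assignment model lets the prototype's value overwrite it.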
CREATE
submethod CREATE ($self: *%args) {...}
"CREATE" is called when you don't want to use an existing data
structure as the candidate for your object. In general you won't
define "CREATE" because the default "CREATE" does all the heavy magic
to bring an opaque object into existence. But if you don't want an
opaque object, and you don't care to write all your constructors to
create the data structure before calling ".bless", you can define your
own "CREATE" submethod, and it will override the standard one for all
constructors in the class.
[Update: the $self here should probably read "::T".]
BUILDALL
submethod BUILDALL ($self: *%args) {...}
[Update: but this one actually is the new self, the candidate created
by "CREATE". It's just not entirely instantiated until "BUILDALL" is
done.]
After the data structure is created, it must be populated by each of
the participating classes (and roles) in the proper order. The
"BUILDALL" method is called upon to do this. The default "BUILDALL" is
usually correct, so you don't generally have to override it. In
essence, it delegates the initialization of parent classes to the
"BUILDALL" of the parent classes, and then it calls "BUILD" on the
current class. In this way the pieces of the object are assembled in
the correct order, from least derived to most derived.
For each class "BUILDALL" calls on, if the arguments contain a pair
whose key is that class name, it passes the value of the pair as its
argument to that class's "BUILDALL". Otherwise it passes the entire
list. (There's not much ambiguity there--most classes and roles will
start with upper case, while most attribute names start with lower
case.)
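The ordering and the class-keyed argument rule can be sketched as follows. This is a Python model under stated assumptions, not the real mechanism: it flattens the recursive delegation into one walk over the reversed method resolution order, it treats a "BUILD" found in a class's own dictionary as the stand-in for a submethod, and the "buildall" helper and "built" log are invented for the illustration.

```python
class Animal:
    def BUILD(self, **args):
        self.built.append(("Animal", args))


class Dog(Animal):
    def BUILD(self, **args):
        self.built.append(("Dog", args))


def buildall(obj, **args):
    """Call each class's own BUILD, least derived first."""
    obj.built = []
    for cls in reversed(type(obj).__mro__):
        build = vars(cls).get("BUILD")      # not inherited: submethod-like
        if build is None:
            continue
        own = args.get(cls.__name__)        # pair keyed by the class name?
        build(obj, **(own if isinstance(own, dict) else args))
    return obj


spot = buildall(Dog(), name="Spot", Dog={"tail": "long"})
```

Here "Animal" receives the entire argument list, while "Dog" receives only the value of the pair keyed by its own name.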
BUILD
submethod BUILD ($self: *%args) {...}
That is the generic signature of "BUILD" from the viewpoint of the
caller, but the typical "BUILD" routine declares explicit parameters
named after the attributes:
submethod BUILD (+$tail, +@legs, *%extraargs) {
$.tail = $tail;
@:legs = @legs;
...
}
[Update: that would be "(:$tail, :@legs, *%extraargs)" now, but you
don't really need the colons since positional declarations can still be
set by name.]
That occurs so frequently that there's a shorthand available in the
signature declaration. You can put the attributes (distinguished by
those secondary sigils, you'll recall) right into the signature. The
following means essentially the same thing, without repeating the
names:
submethod BUILD (+$.tail, +@:legs, *%extraargs) {...}
[Update: @!legs now.]
It's actually unnecessary to declare the *%extraargs parameter. If you
leave it out, it will default to *%_ (but only on methods and
submethods--see the section on Interface Consistency later).
You may use this special syntax only for instance attributes, not class
attributes. Class attributes should generally not be reinitialized
every time you make a new object, after all.
If you do not declare a "BUILD" routine, a default routine will be
supplied that initializes any attributes whose names correspond to the
keys of the argument pairs passed to it, and leaves the other
attributes to default to whatever the class supplied as the default, or
"undef" otherwise.
In any event, the assignment of default attribute values happens
automatically. For any attribute that is not otherwise initialized,
the attribute declaration's "build" property is evaluated and the
resulting value copied in to the newly created attribute slot. This
happens logically at the end of the "BUILD" block, so we avoid running
initialization closures unnecessarily. This implicit initialization is
based not on whether the attribute is undefined, but on whether it was
initialized earlier in "BUILD". (Otherwise we could never explicitly
create an attribute with an undefined value.)
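The distinction between "undefined" and "never initialized" can be modelled with a sentinel. The following Python sketch is only an analogy: the "_UNSET" marker, the "defaults" table standing in for the "build" properties, and the "calls" log are all assumptions made for the illustration.

```python
_UNSET = object()    # "never initialized", which is distinct from None
calls = []           # records which default closures actually ran


def default_tail():
    calls.append("tail")
    return "wagging"


def default_legs():
    calls.append("legs")
    return 4


class Dog:
    # attribute name -> closure producing its default value
    defaults = {"tail": default_tail, "legs": default_legs}

    def __init__(self, **args):
        for name in self.defaults:
            setattr(self, name, _UNSET)
        self.BUILD(**args)
        # Logically the end of BUILD: fill only the untouched slots.
        for name, make_default in self.defaults.items():
            if getattr(self, name) is _UNSET:
                setattr(self, name, make_default())

    def BUILD(self, **args):
        for name, value in args.items():
            if name in self.defaults:
                setattr(self, name, value)


d = Dog(tail=None)   # explicitly initialized to an undefined value
```

The explicit undefined value survives, and the default closure for "tail" never runs, because the check is made on initialization rather than definedness.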
Eliminating Redundancy in Constructor Calls
If you say:
my Dog $spot = Dog.new(...)
you have to repeat the type. That's not a big deal for a small
typename, but sometimes typenames are a lot longer. Plus you'd like to
get rid of the redundancy, just because it's, like, redundant. So
there's a variant on the dot operator that looks a lot like a dot
assignment operator:
my Dog $spot .= new(...)
It doesn't really quite fit the assignment operator rule though. If it
did, it'd have to mean
my Dog $spot = $spot.new(...)
which doesn't quite work, because $spot is undefined. What probably
happens is that the "my" cheats and puts a version of "undef" in there
that knows it should dispatch to the "Dog" class if you call
".self:new()" on it. Anyway, we'll make it work one way or another, so
that it becomes the equivalent of:
my Dog $spot = Dog.new(...)
The alternative is to go the C++ route and make "new" a reserved word.
We're just not gonna do that.
Note that an attribute declaration of the form
has Tail $wagger .= new(...)
might not do what you want done when you want it done, if what you want
done is to create a new "Tail" object each time an object is built. For
that you'd have to say:
has Tail $wagger = { .new(...) }
or equivalently,
has Tail $wagger will build { .new(...) }
But leaving aside such timing issues, you should generally think of the
".=" operator more as a variant on "." than a variant on "+=". It can,
for instance, turn any non-mutating method call into a mutating method:
@array.=sort; # sort @array in place
.=lc; # lowercase $_ in place
This presumes, of course, that the method's invocant and return value
are of compatible types. Some classes will wish to define special in-
place mutators. The syntax for that is:
method self:sort (Array @a is rw) {...}
[Update: That's now "self:<sort>" instead.]
It is illegal to use "return" from such a routine, since the invocant
is automatically returned. If you do not declare the invocant, the
default invocant is automatically considered "rw". If you do not
supply a mutating version, one is autogenerated for you based on the
corresponding copy operator.
Object Deconstruction
Object destruction is no longer guaranteed to be "timely" in Perl 6.
It happens when the garbage collector gets around to it. (Though there
will be ways to emulate Perl 5 end-of-scope cleanup.)
As with object creation, object destruction is recursive. Unlike
creation, it must proceed in the opposite order.
DESTROYALL
The "DESTROYALL" routine is the counterpart to the "BUILDALL" routine.
Similarly, the default definition is normally sufficient for the needs
of most classes. "DESTROYALL" first calls "DESTROY" on the current
class, and then delegates to the "DESTROYALL" of any parent classes.
In this way the pieces of the object are disassembled in the correct
order, from most derived to least derived.
DESTROY
As with Perl 5, all the memory deallocation is done for you, so you
really only need to define "DESTROY" if you have to release external
resources such as files.
Since "DESTROY" is the opposite of "BUILD", if any attribute
declaration has a "destroy" property, that property (presumably a
closure) is evaluated before the main block of "DESTROY". This happens
even if you don't declare a "DESTROY".
(The "build" and "destroy" traits are the only way for roles to let
their preferences be made known at "BUILD" and "DESTROY" time. It
follows that any role that does not define an attribute cannot
participate in building and destroying except by defining a method that
"BUILD" or "DESTROY" might call. In other words, stateless roles
aren't allowed to muck around with the object's state. This is
construed as a feature.)
Dispatch Mechanisms
Perl 6 supports both single dispatch (traditional OO) and multiple
dispatch (also known as "multimethod dispatch", but we try to avoid
that term).
Single Dispatch
Single dispatch looks up which method to run solely on the basis of the
type of the first argument, the invocant. A single-dispatch call
distinguishes the invocant syntactically (unlike a multiple-dispatch
call, which looks like a subroutine call, or even an operator).
Basically, anything can be an invocant as long as it fills the
"Dispatch" role, which provides a ".dispatcher" method. This includes
ordinary objects, class objects, and (in some cases) even varieties of
"undef" that happen to know what class of thing they aren't (yet).
Simple single dispatch is specified with the dot operator, or its
indirect object equivalent:
$object.meth(@args)   # always calls public .meth
.meth(@args)          # unary form
meth $object: @args   # indirect object form
There are variants on the dot form indicated by the character after the
dot. (None of these variants allows indirect object syntax.) The
private dispatcher only ever dispatches to the current class or its
proxies, so it's really more like a subroutine call in disguise:
$object.:meth(@args) # always calls private :meth
.:meth(@args) # unary form
[Update: uses "!" now, no unary form.]
It is an error to use ".:" unless there is a correspondingly named
"colon" method in the appropriate class, just as it is an error to use
"." when no method can be found of that name. Unlike the ".:"
operator, which can have only one candidate method, the "." operator
potentially generates a list of candidates, and allows methods in that
candidate list to defer to subsequent methods in other classes until a
candidate has been found that is willing to handle the dispatch.
In addition to the ".:" and ".=" operators, there are three other dot
variants that can be used if it's not known how many methods are
willing to handle the dispatch:
$object.?meth(@args)  # calls method if there is one
.?meth(@args)         # unary form
$object.*meth(@args)  # calls all methods (0 or more)
.*meth(@args)         # unary form
$object.+meth(@args)  # calls all methods (1 or more)
.+meth(@args)         # unary form
The ".*" and ".+" versions are generally only useful for calling
submethods, or methods that are otherwise expected to work like
submethods. They return a list of all the successful return values.
The ".?" operator either returns the one successful result, or undef if
no appropriate method is found. Like the corresponding regex
modifiers, "?" means "0 or 1", while "*" means "0 or more", and "+"
means "1 or more". Ordinary "." means "exactly one". Here are some
sample implementations, though of course these are probably implemented
in C for maximum efficiency:
# Implements . (or .? if :maybe is set).
sub CALLONE ($obj, $methname, +$maybe, *%opt, *@args) {
    my $startclass = $obj.dispatcher() // fail "No dispatcher: $obj";
    METHOD:
    for WALKMETH($startclass, :method($methname), %opt) -> &meth {
        return meth($obj, @args);
    }
    fail qq(Can't locate method "$methname" via class "$startclass")
        unless $maybe;
    return;
}
With this dispatcher you can continue by saying "next METHOD". This
allows methods to "failover" to other methods if they choose not to
handle the request themselves.
# Implements .+ (or .* if :maybe is set).
# Add :force to redispatch in every class
sub CALLALL ($obj, $methname, +$maybe, +$force, *%opt, *@args) {
    my $startclass = $obj.dispatcher() // fail "No dispatcher: $obj";
    my @results = gather {
        if $force {
            METHOD:
            for WALKCLASS($startclass, %opt) -> $class {
                take $obj.::($class)::$methname(*@args) # redispatch
            }
        }
        else {
            METHOD:
            for WALKMETH($startclass, :method($methname), %opt) -> &meth {
                take meth($obj, *@args);
            }
        }
    }
    return @results if @results or $maybe;
    fail qq(Can't locate method "$methname" via class "$startclass");
}
This one you can quit early by saying "last METHOD". Notice that
both of these dispatchers cheat by calling a method as if it were a
sub. You may only do that by taking a reference to the method, and
calling it as a subroutine, passing the object as the first argument.
This is the only way to call a virtual method non-virtually in Perl.
If you try to call a method directly as a subroutine, Perl will ignore
the method, look for a subroutine of that name elsewhere, probably not
find it, and complain bitterly. (Or find the wrong subroutine, and
execute it, after which you will complain bitterly.)
We snuck in an example of the new "gather"/"take" construct. It is still
somewhat conjectural.
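For readers who want to poke at the four cardinalities, here is a small Python model of the two sample dispatchers. It is a sketch under assumptions, not a translation: "walk_methods" stands in for "WALKMETH" and simply follows Python's method resolution order, the function names are invented, and deferral with "next METHOD" or "last METHOD" is not modelled.

```python
def walk_methods(cls, name):
    """Yield each class's own definition of `name`, most derived first."""
    for klass in cls.__mro__:
        meth = vars(klass).get(name)
        if callable(meth):
            yield meth


def call_one(obj, name, *args, maybe=False):
    """Model of "." (or ".?" when maybe is true)."""
    for meth in walk_methods(type(obj), name):
        return meth(obj, *args)       # called as a plain function
    if maybe:
        return None
    raise AttributeError(f"Can't locate method {name!r}")


def call_all(obj, name, *args, maybe=False):
    """Model of ".+" (or ".*" when maybe is true)."""
    results = [meth(obj, *args) for meth in walk_methods(type(obj), name)]
    if results or maybe:
        return results
    raise AttributeError(f"Can't locate method {name!r}")


class Base:
    def speak(self):
        return "base"


class Derived(Base):
    def speak(self):
        return "derived"


d = Derived()
one = call_one(d, "speak")                     # exactly one
every = call_all(d, "speak")                   # one or more
missing_one = call_one(d, "fly", maybe=True)   # zero or one
missing_all = call_all(d, "fly", maybe=True)   # zero or more
```

Note that the model, like the samples above, calls each method as an ordinary function with the object as its first argument.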
Calling Superclasses, and Not-So-Superclasses
Perl 5 supplies a pseudoclass, "SUPER::", that redirects dispatch to a
parent class's method. That's often the wrong thing to do, though, in
part because under MI you may have more than one parent class, and also
because you might have sibling classes that also need to have the given
method triggered. Even if "SUPER" is smart enough to visit multiple
parent classes, and even if all your classes cooperate and call "SUPER"
at the right time, the depth first order of visitation might be the
wrong order, especially under diamond inheritance. Still, if you know
that your parent classes use "SUPER", or you're calling into a language
with "SUPER" semantics (such as Perl 5) then you should probably use
"SUPER" semantics too, or you'll end up calling your parent's parents
in duplicate. However, since use of "SUPER" is slightly discouraged,
we Huffman code it a bit longer in Perl 6. Remember the *%opt
parameters to the dispatchers above? That comes in as a parameterized
pseudoclass called "WALK".
$obj.*WALK[:super]::method(@args)
That limits the call to only those immediate super classes that define
the method. Note the star in the example. If you really want the Perl
5 semantics, leave the star out, and you'll only get the first existing
parent method of that name. (Why you'd want that is beyond me.)
Actually, we'll probably still allow "SUPER::" as a shorthand for
"WALK[:super]::", since people will just hack it in anyway if we don't
provide it...
If you think about it, every ordinary dispatch has an implicit "WALK"
modifier on the front that just happens to default to
"WALK[:canonical]". That is, the dispatcher looks for methods in the
canonical order. But you could say "WALK[:depth]" to get Perl 5's
order, or you could say "WALK[:descendant]" to get an order
approximating the order of construction, or "WALK[:ascendant]" to get
an order approximating the order of destruction. You could say
"WALK[:omit(SomeClass)]" to call all classes not equivalent to or
derived from "SomeClass". For instance, to call all super classes, and
not just your immediate parents, you could say "WALK[:omit(::_)]" to
skip the current lexical class or anything derived from it.
[Update: The lexical class is now named "::?CLASS".]
But again, that's not usually the right thing to do. If your base
classes are all willing to cooperate, it's much better to simply call
$obj.method(@args)
and then let each of the implementations of the method defer to the
next one when they're done with their part of it. If any method says
"next METHOD", it automatically iterates the loop of the dispatcher
and finds the next method to dispatch to, even if that method comes
from a sibling class rather than a parent class. The next method is
called with the same arguments as originally supplied.
That presupposes that the entire set of methods knows to call "next"
appropriately. This is not always the case. In fact, if they don't
all call next, it's likely that none of them does. And maybe just
knowing whether or not they do is considered a violation of
encapsulation. In any case, if you still want to call all the methods
without their active cooperation, then use the star form:
$obj.*method(@args)
Then the various methods don't have to do anything to call the next
method--it happens automatically by default. In this case a method has
to do something special if it wants to stop the dispatch. Naturally,
that something is to call "last METHOD", which terminates the
dispatch loop early.
Now, sometimes you want to call the next method, but you want to change
the arguments so that the next method doesn't get the original argument
list. This is done with deep magic. If you use the "call" keyword in
an ordinary (nonwrapper) method, it steals the rest of the dispatch
list from the outer loop and redispatches to the next method with the
new arguments:
@retvals = call(@newargs);
return @retvals;
And unlike with "next METHOD", control returns to this method
following the call. It returns the results of the subsequent method
calls, which you should return so that your outer dispatcher can add
them to the return values it already gathered.
Note that "next METHOD" and "last METHOD" can typically be spelt
"next" and "last" unless they are in an inner loop.
Parallel Dispatch
By default the various dot operators call a method on a single object,
even if it ends up calling multiple methods for that object. Since a
method call is essentially a unary postfix operator, however, you can
use it as a hyper operator on a list of objects:
@object».meth(@args)    # Call one for each or fail
@object».?meth(@args)   # Call one for each if available
@object».*meth(@args)   # Call all available for each
@object».+meth(@args)   # Call one or more for each
Note that with the last two, if a method uses "last METHOD", it
doesn't bomb out of the "hyper" loop, but just goes on to the next
entry. One can always bomb out of the hyperloop with a real exception,
of course. And maybe with "last HYPER", depending on how hyper's
implicit iteration is implemented.
If you want to use an array for serial rather than parallel method
calling, see Delegation, which lets you set up cascading handlers.
WALKCLASS and WALKMETH Caching
"WALKCLASS" generates a list of matching classes. "WALKMETH" generates
a list of method references from matching classes.
The "WALKCLASS" and "WALKMETH" routines used in the sample dispatch
code need to cache their results so that every dispatch doesn't have to
traverse the inheritance tree again, but just consult the
preconstructed list in order. However, if there are changes to any of
the classes involved, then someone needs to call the appropriate cache
clear method to make sure that the inheritance is recalculated.
"WALKCLASS"/"WALKMETH" options include some that specify ordering:
:canonical           # canonical dispatch order
:ascendant           # most-derived first, like destruction order
:descendant          # least-derived first, like construction order
:preorder            # like Perl 5 dispatch
:breadth             # like multimethod dispatch
and some that specify selection criteria:
:super               # only immediate parent classes
:method(Str)         # only classes containing method declaration
:omit(Selector)      # only classes that don't match selector
:include(Selector)   # only classes that match selector
Note that ":method(Str)" selects classes that merely have methods
declared, not necessarily defined. A declaration without a definition
probably implies that they intend to autoload a definition, so we
should call the stub anyway. In fact, Perl 6 differentiates an
"AUTOMETHDEF" from "AUTOLOAD". "AUTOLOAD" works as it does in Perl 5.
"AUTOMETHDEF" is never called unless there is already a declaration of
the stub (or equivalently, "AUTOMETH" faked a stub.)
It would be possible to just define everything in terms of "WALKCLASS",
but that would imply looking up each method name twice, once inside
"WALKCLASS" to see if the method exists in the current class, and once
again outside in order to call it. Even if "WALKCLASS" caches the
class list, it wouldn't cache the derived method list, so it's better
to have a separate cache for that, controlled by "WALKMETH", since
that's the common case and has to be fast.
(Again, this is all abstract, and is probably implemented in gloriously
grungy C code. Nevertheless, you can probably call "WALKCLASS" and
"WALKMETH" yourself if you feel like writing your own dispatcher.)
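The caching contract described above, including the need for an explicit cache clear when a class changes, might be sketched like this. It is a Python model only: "walkmeth", "clear_walk_cache" and the module-level dictionary are hypothetical names, and the walk order is simply Python's own method resolution order.

```python
_method_cache = {}   # (class, method name) -> preconstructed method list


def walkmeth(cls, name):
    """Return the cached method list, building it on first use."""
    key = (cls, name)
    if key not in _method_cache:
        _method_cache[key] = [
            vars(k)[name] for k in cls.__mro__ if name in vars(k)
        ]
    return _method_cache[key]


def clear_walk_cache():
    """Must be called after any change to the classes involved."""
    _method_cache.clear()


class Animal:
    def speak(self):
        return "animal"


class Dog(Animal):
    pass


before = [m(Dog()) for m in walkmeth(Dog, "speak")]
Dog.speak = lambda self: "dog"          # the class changes after caching
stale = [m(Dog()) for m in walkmeth(Dog, "speak")]
clear_walk_cache()
fresh = [m(Dog()) for m in walkmeth(Dog, "speak")]
```

Until the cache is cleared, the dispatcher keeps consulting the old list and never sees the newly added method.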
Multiple Dispatch
Multiple dispatch is based on the notion that methods often mediate the
relationships of multiple objects of diverse types, and therefore the
first object in the argument list should not be privileged over other
objects in the argument list when it comes to selecting which method to
run. In this view, methods aren't subservient to a particular class,
but are independent agents. A set of independent-minded, identically
named methods use the class hierarchy to do pattern matching on the
argument list and decide among themselves which method can best handle
the given set of arguments.
The Perl approach is, of course, that sometimes you want to distinguish
the first invocant, and sometimes you don't. The interaction of these
two approaches gets, um, interesting. But the basic notion is to let
the caller specify which approach is expected, and then, where it makes
sense, fall back on the other approach when the first one fails.
Underlying all this is the Principle of Least Surprise. Do not confuse
this with the Principle of Zero Surprise, which usually means you've
just swept the real surprises under someone else's carpet. (There's a
certain amount of surprise you can't go below--the Heisenberg
Uncertainty Principle applies to software too.)
With traditional multimethods, all methods live in the same global
namespace. Perl 6 takes a different approach--we still keep all the
traditional Perl namespaces (lexical, package, global) and we still
search for names the same way (outward through the lexical scopes, then
the current package, then the global "*" namespace; or upward in the
class hierarchy). Then we simply claim that, under multiple dispatch,
the "long name" of any multi routine includes its signature, and that
visibility is based on the long name. So an inner or derived multi
only hides an outer or base multi of the same name and the same
signature. (Routines not declared "multi" still hide everything in
the traditional fashion.)
To put it another way, the multiple dispatch always works when both the
caller and the callee agree that that's how it should work. (And in
some cases it also works when it ought to work, even if they don't
agree--sort of a "common law" multimethod, as it were...)
Declaration of Multiple Dispatch Routines
A callee agrees to the multiple dispatch "contract" by including the
word "multi" in the declaration of the routine in question. It
essentially says, "Ordinarily this would be a unique name, but it's
okay to have duplicates of this name (the short name) that are
differentiated by signatures (the long name)."
Looking at it from the other end, leaving the "multi" out says "I am
a perfect match for any signature--don't bother looking any further
outward or upward." In other words, the standard non-multi semantics.
You may not declare a multi in the same scope as a non-multi. However,
as long as they are in different scopes, you can have a single non-
multi inside a set of multis, or a set of multis inside a single non-
multi. You can even have a set of multis inside a non-multi inside a
set of multis. Indeed, this is how you hide all the outer multis so
that only the inner multi's long names are considered. (And if no long
name matches, you get the intermediate non-multi as a kind of
backstop.) The same policy applies to both nested lexical scopes and
derived subclasses.
[Update: now you can declare a "first multi" that is simultaneously an
ordinary sub and the final arbiter of subsequent multi declarations in
this scope. This is done with the "proto" keyword in place of the
"multi" keyword. Oh, and the "sub" is optional with either
"multi" or "proto".]
Actually, up till now we've been oversimplifying the concept of "long
name" slightly. The long name includes only that part of the signature
up to the first colon. If there is no colon, then the entire signature
is part of the long name. (You can have more colons, in which case the
additional arguments function as tie breakers if the original set of
long names is insufficient to prevent a tie.)
So sometimes we'll probably slip and say "signature" when we mean "long
name". We pray your indulgence.
multi sub
A multi sub in any scope hides any multi sub with the same "long name"
in any outer scope. It does not hide subs with the same short name but
a different signature. Er, long name, I mean...
multi sub * (traditional multimethods)
If you want a multi that is visible in all namespaces (that don't hide
the long name), then declare the name in the global name space,
indicated in Perl 6 with a "*". Most of the so-called "built-ins" are
declared this way:
multi sub *push (Array $array, *@args) {...}
multi sub *infix:+ (Num $x, Num $y) returns Num {...}
multi sub *infix:.. (Int $x, Int $y: Int ?$by) returns Ranger {...}
[Update: Those are now named "infix:<+>" and "infix:<..>".]
Note the use of colon in the last example to exclude $by as part of the
long name. The range operator is dispatched only on the types of its
two main arguments.
multi method
If you declare a method with "multi", then that method hides any base
class method with the same long name. It does not hide methods with
the same short name but a different signature when called as a
multimethod. (It does hide methods when called under single dispatch,
in which case the first invocant is treated as the only invocant
regardless of where you put the colon. Just because a method is
declared with "multi" doesn't make it invisible to single dispatch.)
Unlike a regular method declaration, there is no implied invocant in
the syntax of a multi method. A method declared as multi must declare
all its invocants so that there's no ambiguity as to the meaning of the
first colon. With a multi method, it always means the end of the long
name. (With a non-multi, it always means that the optional invocant
declaration is present.)
multi submethod
Submethods may be declared with "multi", in which case visibility works
the same as for ordinary methods. However, a submethod has the
additional constraint that the first invocant must be an exact class
match. Which effectively means that a submethod is first single
dispatched to the class, and then the appropriate submethod within that
class is selected, ignoring any other class's submethods of the same
name.
multi rule
Since rules are just methods in disguise, you can have multi rules as
well. (Of course, that doesn't do you a lot of good unless you have
rules with different signatures, which is unusual.)
multi submethod BUILD
It is not likely that Perl 6.0.0 will support multiple dispatch on
named arguments, but only on positional arguments. Since all the extra
arguments to a "BUILD" routine come in as named arguments, you probably
can't usefully multi a "BUILD" (yet). However, we should not do
anything that precludes multiple "BUILD" submethods in the future.
Which means we should probably enforce the presence of a colon before
the first named argument declaration in any multi signature, so that
the semantics don't suddenly change if and when we start supporting
multiple dispatch that includes named arguments as part of the long
name.
multi method constructors
To the extent that you declare constructors (such as ".new") with
positional arguments, you can use "multi" on them in 6.0.0.
Calling via Multiple Dispatch
As we mentioned, multiple dispatch is enabled by agreement of both
caller and callee. From the caller's point of view, you invoke
multiple dispatch simply by calling with subroutine call syntax instead
of method call syntax. It's then up to the dispatcher to figure out
which of the arguments are invocants and which ones are just options.
(In the case where the innermost visible subroutine is declared non-
multi, this degenerates to the Perl 5 semantics of subroutine calls.)
This approach lets you refactor a simple subroutine into a more nuanced
set of subroutines without changing how the subroutines are called at
all. That makes this sort of refactoring drop-dead simple. (Or at
least as simple as refactoring ever gets...)
It's a little harder to refactor between single dispatch and multiple
dispatch, but a good argument could be made that it should be harder to
do that, because you're going to have to think through a lot more
things in that case anyway.
Anyway, here's the basic relationship between single dispatch and
multiple dispatch. Single dispatch is more familiar, so we'll discuss
multiple dispatch first.
Multiple dispatch semantics
Whenever you make a call using subroutine call syntax, it's a candidate
for multiple dispatch. A search is made for an appropriate subroutine
declaration. As in Perl 5, this search goes outward through the
lexical scopes, then through the current package and on to the global
namespace (represented in Perl 6 with an initial * for the "wildcard"
package name). If the name found is not a multi, then it's a good old-
fashioned sub call, and no multiple dispatch is done. End of story.
[Update: a "proto" counts as a "sub" here, not as a "multi".]
However, if the first declaration we come to is a multi, then lots of
interesting stuff happens. (Fortunately for our performance, most of
this interesting stuff can happen at compile time, or upon first use.)
The basic idea is that we will collect a complete list of candidates
before we decide which one to call.
So the search continues outward, collecting all sub declarations with
the same short name but different long names. (We can ignore outer
declarations that are hidden by an inner declaration with the same long
name.) If we run into a scope with a non-multi declaration, then we're
done generating our candidate list, and we can skip the next paragraph.
After going all the way out to the global scope, we then examine the
type of the first argument as if we were about to do single dispatch on
it. We then visit any classes that would have been single dispatched,
in most-derived to least-derived order, and for each of those classes
we add into our candidate list any methods declared multi, plus all the
single invocant methods, whether or not they were declared multi! In
other words, we just add in all the methods declared in the class as a
subset of the candidates. (There are reasons for this that we'll
discuss below.) Anyway, just as with nested lexical scopes, if two
methods have the same long name, the more derived one hides the less
derived one. And if there's a class in which the method of the same
short name is not declared multi, it serves as a "stopper", just as a
non-multi sub does in a lexical scope. (Though that "stopper" method
can of course redispatch further up the inheritance tree, just as a
"stopper" lexical sub can always call further outward if it wants to.)
[Update: we now have the "proto" keyword to specifically mark such a
routine. It also lets the routine supply known argument names for any
multis found in the lexical scope, so the compiler can be smart about
mapping named args to positionals.]
Now we have our list of candidates, which may or may not include every
sub and method with the same short name, depending on whether we hit a
"stopper". Anyway, once we know the candidate list, it is sorted into
order of distance from the actual argument types. Any exact match on a
parameter type is distance 0. Any miss by a single level of derivation
counts as a distance of 1. Any violation of a hard constraint (such as
having too many arguments for the number of parameters, or violating a
subtype check on a type that does constraint checking, or missing the
exact type on a submethod) is effectively an infinite distance, and
disqualifies the candidate completely.
Once we have our list of candidates sorted, we simply call the first
one on the list, unless there's more than one "first one" on the list,
in which case we look to see if one of them is declared to be the
default. If so, we call it. If not, we die.
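The ranking and tie-breaking just described might be sketched like this, in Python as a stand-in. The names "type_distance" and "dispatch" are hypothetical, the distance is the summed per-parameter distance described above, and derivation depth is approximated by position in Python's method resolution order, which only matches "levels of derivation" under single inheritance.

```python
import math

def type_distance(arg_type, param_type):
    """Levels of derivation from arg_type up to param_type, or inf if unrelated."""
    try:
        return arg_type.__mro__.index(param_type)
    except ValueError:
        return math.inf

def dispatch(candidates, args):
    """candidates: list of (param_types, func, is_default) triples."""
    ranked = []
    for param_types, func, is_default in candidates:
        if len(param_types) != len(args):
            continue                      # hard constraint: disqualified
        dist = sum(type_distance(type(a), p) for a, p in zip(args, param_types))
        if dist != math.inf:
            ranked.append((dist, is_default, func))
    if not ranked:
        raise TypeError("no applicable candidate")
    best = min(dist for dist, _, _ in ranked)
    tied = [(is_default, func) for dist, is_default, func in ranked if dist == best]
    if len(tied) == 1:
        return tied[0][1](*args)
    defaults = [func for is_default, func in tied if is_default]
    if len(defaults) == 1:
        return defaults[0](*args)         # the default is in charge of a tie
    raise TypeError("ambiguous dispatch") # "If not, we die."

class BaseA: pass
class DerivedA(BaseA): pass
class BaseB: pass
class DerivedB(BaseB): pass

candidates = [
    ((DerivedA, BaseB), lambda a, b: "A-specific", False),
    ((BaseA, DerivedB), lambda a, b: "B-specific", False),
    ((BaseA, BaseB), lambda a, b: "generic", False),
]
```

Calling dispatch(candidates, (DerivedA(), DerivedB())) ties the first two candidates at distance 1 each, and since neither is marked as the default, the call dies.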
So if there's a tie, the default routine is in charge of subsequent
behavior:
# Pick next best at random...
multi sub foo (BaseA $a, BaseB $b) is default {
    next METHOD;
}
# Give up at first ambiguity...
multi sub bar (BaseA $a, BaseB $b) is default {
    last METHOD;
}
# Invoke my least-derived ancestor
multi sub baz (BaseA $a, BaseB $b) is default {
    my @ambiguities = WALKMETH($startclass, :method('baz'))
        or last METHOD;
    pop(@ambiguities).($a, $b);
}
# Invoke most generic candidate (often a good fall-back)...
multi sub baz (BaseA $a, BaseB $b) is default {
    my @ambiguities = @CALLER::methods or last METHOD;
    pop(@ambiguities).value.($a, $b);
}
In many cases, of course, the default routine won't redispatch, but
simply do something generically appropriate.
[Update: we probably aren't using the "Manhattan" distance indicated
above.]
Single dispatch semantics
If you use the dot notation, you are explicitly calling single
dispatch. By default, if single dispatch doesn't find a suitable
method, it does a "failsoft" to multiple dispatch, pretending that you
called a subroutine with the invocant passed as the first argument.
(Multiple dispatch doesn't need to failsoft to single dispatch since
all single dispatch methods are included as a subset of the multiple
dispatch candidates anyway.)
This failsoft behavior can be modified by a lexically scoped pragma. If
you say
use dispatch :failhard
then single dispatch will be totally unforgiving as it is in Perl 5.
Or you can tell single dispatch to go away:
use dispatch :multi
in which case all your dot notation is treated as a sub call. That is,
any
$obj.method(1,2,3)
in the lexical scope acts like you'd said:
method($obj,1,2,3)
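As a rough model of the failsoft and failhard behaviors (the ":multi" mode is left out), here is a Python sketch. "call_dot" is a hypothetical helper, and the "subs" table is only a stand-in for the real multiple-dispatch candidate list.

```python
def call_dot(obj, name, args, subs, failhard=False):
    """Model `$obj.name(args)`: try single dispatch, then fall back to a sub call.

    subs maps a sub name to a callable that takes the invocant as its
    first argument.
    """
    method = getattr(type(obj), name, None)
    if callable(method):
        return method(obj, *args)            # ordinary single dispatch
    if failhard:
        raise AttributeError(f"no method {name!r} (failhard)")
    if name in subs:
        return subs[name](obj, *args)        # failsoft: invocant becomes first argument
    raise AttributeError(f"no method or sub {name!r}")

class Dog:
    def bark(self):
        return "Woof"

subs = {"fetch": lambda dog, item: f"fetched {item}"}
print(call_dot(Dog(), "bark", (), subs))
print(call_dot(Dog(), "fetch", ("stick",), subs))
```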
If single dispatch locates a class that defines the method, but the
method in question turns out to be a set of one or more multi methods,
then the single dispatch fails immediately and a multiple dispatch is
done, with the additional constraint that only multis within that class
are considered. (If you wanted the first argument to do loose matching
as well, you should have called it as a multimethod in the first
place.)
Indirect objects
If you use indirect object syntax with an explicit colon, it is exactly
equivalent to dot notation in its semantics.
However, one-argument subs are inherently ambiguous, because Perl 6
does not require the colon on indirect objects without arguments. That
is, if you say:
print $fh
it's not clear whether you mean
$fh.print
or
print($fh)
As it happens, we've defined the semantics so that it doesn't matter.
Since all single invocant methods are included automatically in
multimethod dispatch, and since multiple dispatch degenerates to single
dispatch when there's only one invocant, it doesn't matter which way
you write it. The effect is the same either way. (Unless you've
defined your own non-multi print routine in a surrounding lexical
scope. But then, if you've done that, you probably did it on purpose
precisely because you wanted to disable the default dispatch
semantics.)
Meaning of "next METHOD"
Within the context of a multimethod dispatch, "next METHOD" means to
try the next best match, if unambiguous, or else the marked default
method. From within the default method it means just pick the next in
the list even if it's ambiguous. The dispatch list is actually kept in
@CALLER::methods, which is a list of pairs, the key of each indicating
the "distance" rating, and the value of each containing a reference to
the method to call (as a sub ref).
[Update: a proto routine can't use "next METHOD" because the original
dispatch list is exhausted. Maybe. If so, we can arrange for "call()"
to propagate the call outward, as if the proto routine is a wrapper
around the rest of the world.]
Making Fiends, er, Friends.
If you want to directly access the attributes of a class, your multi
must be declared within the scope of that class. Attributes are never
directly visible outside a class. This makes it difficult to write an
efficient multimethod that knows about the internals of two different
classes. However, it's possible for private accessors to be visible
outside your class under one condition. If your class declares that
another class is trusted, that other class can see the private
accessors of your class. If the other class declares that you are
trusted, then you can see its private accessor methods. The trust
relationship is not necessarily symmetrical. This lets you have an
architecture where classes by and large don't trust each other, but
they all trust a single well-guarded "multi-plexor" class that keeps
everyone else in line.
The syntax for trusting another class is simply:
class MyClass {
trusts Yourclass;
...
}
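The one-way nature of trust can be illustrated with a small Python model. Python cannot enforce privacy, so "TrustRegistry" is purely a hypothetical bookkeeping device and says nothing about how Perl 6 would implement the check.

```python
class TrustRegistry:
    def __init__(self):
        self._trusts = {}    # class name -> set of class names it trusts

    def trusts(self, owner, other):
        """Record that `owner` has declared `trusts other`."""
        self._trusts.setdefault(owner, set()).add(other)

    def may_see_private(self, caller, owner):
        """Can code in `caller` use the private accessors of `owner`?"""
        return caller == owner or caller in self._trusts.get(owner, set())

registry = TrustRegistry()
registry.trusts("MyClass", "Yourclass")

print(registry.may_see_private("Yourclass", "MyClass"))   # granted by MyClass
print(registry.may_see_private("MyClass", "Yourclass"))   # not granted in return
```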
It's not clear whether roles should be allowed to grant trust. In the
absence of evidence to the contrary, I'm inclined to say not. We can
always relax that later if, after many large, longitudinal, double-
blind studies, it turns out to be both safe and effective.
[Update: the intent of the previous paragraph was to describe whether a
role could grant trust on behalf of the composed class, but it was
ambiguously stated, and sometimes misinterpreted. It seems to me that
a role could grant trust to private routines in its own scope, however,
since they're basically just sub calls anyway. On the other hand, the
relationship of the role's package namespace to its eventual lexical
namespace is perhaps problematic. At some point the object has to keep
track of all its cloned role closures and provide the right one to
private sub callers.]
Overloading
In Perl 5 overloading was this big special deal that had to have
special hooks inserted all over the C code to catch various operations
on overloaded types and do something special with them. In Perl 6,
that just all falls out naturally from multiple dispatch. The only
other part of the trick is to consider operators to be function calls
in disguise. So in Perl 6 the real name of an operator is composed of
a grammatical context identifier, a colon, and then the name of the
operator as you usually see it. The common context identifiers are
"prefix", "infix", "postfix", "circumfix", and "term", but there are
others.
So when you say something like
$x = <$a++ * -@b.[...]>;
you're really saying something like this:
$x = circumfix:<>(
    infix:*(
        postfix:++($a),
        prefix:-(
            infix:.(
                @b,
                circumfix:[](
                    term:...()
                )
            )
        )
    )
)
[Update: All the operator names need to be quoted now as a hash
subscript or slice. And instead of breaking ".[]" into "." and "[]",
there is now a "postcircumfix" grammatical category. So we have:
$x = circumfix:«< >»(
    infix:<*>(
        postfix:<++>($a),
        prefix:<->(
            postcircumfix:<[ ]>(
                @b,
                term:<...>()
            )
        )
    )
)
Except that circumfix "<...>" is a quoting operator these days. Many
of the examples using French quotes in this Apocalypse are now written
with regular angles.]
Perl 5 had special key names representing stringification and
numification. In Perl 6 these naturally fall out if you define:
method prefix:+ () {...} # what we do in numeric context
method prefix:~ () {...} # what we do in string context
[Update: Here and below, it's "prefix:<+>" etc.]
Likewise you can define what to return in boolean context:
method prefix:? () {...} # what we do in boolean context
Integer context is, of course, just an ordinary method:
method int () {...} # what we do in integer context
These can be defined as normal methods since single-invocant multi subs
degenerate to standard methods anyway. C++ programmers will tend to
feel comfy defining these as methods. But others may prefer to declare
them as multi subs for consistency with binary operators. In which
case they'd look more like this:
multi sub *prefix:+ (Us $us) {...} # what we do in numeric context
multi sub *prefix:~ (Us $us) {...} # what we do in string context
multi sub *prefix:? (Us $us) {...} # what we do in boolean context
multi sub *prefix:int (Us $us) {...} # what we do in integer context
Coercions to other classes can also be defined:
multi sub *coerce:as (Us $us, Them ::to) { to.transmogrify($us) }
Such coercions allow both explicit conversion:
$them = $us as Them;
as well as implicit conversions:
my Them $them = $us;
Binary Ops
Binary operators should generally be defined as multi subs:
multi sub infix:+ (Us $us, Us $ustoo) {...}
multi sub infix:+ (Us $us, Them $them) is commutative {...}
[Update: And these are "infix:<+>" now.]
The "is commutative" trait installs an additional autogenerated sub
with the invocant arguments reversed, but with the same semantics
otherwise. So the declaration above effectively autogenerates this:
multi sub infix:+ (Them $them, Us $us) {...}
Of course, there's no need for that if the two arguments have the same
type. And there might not actually be an autogenerated other
subroutine in any case, if the implementation can be smart enough to
simply swap the two arguments when it needs to. However it gets
implemented, note that there's no need for Perl 5's "reversed arguments
flag" kludge, since we reverse the parameter name bindings along with
the types. Perl 5 couldn't do that because it had no control of the
signature from the compiler's point of view.
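To make the "reverse the bindings along with the types" point concrete, here is a Python sketch. The registry, the "multi" decorator and "infix_plus" are hypothetical, and matching is on exact types only, which is much cruder than real multiple dispatch.

```python
registry = {}

def multi(type_a, type_b, commutative=False):
    """Register a two-argument variant; optionally register the swapped form too."""
    def register(func):
        registry[(type_a, type_b)] = func
        if commutative and type_a is not type_b:
            # The swapped variant rebinds the arguments, so the body always
            # sees them under their declared names and needs no "reversed" flag.
            registry[(type_b, type_a)] = lambda b, a: func(a, b)
        return func
    return register

def infix_plus(x, y):
    return registry[(type(x), type(y))](x, y)

class Us:
    def __init__(self, n):
        self.n = n

class Them:
    def __init__(self, n):
        self.n = n

@multi(Us, Them, commutative=True)
def add_us_them(us, them):
    return f"us={us.n} them={them.n}"

print(infix_plus(Us(1), Them(2)))
print(infix_plus(Them(2), Us(1)))
```

Both calls reach the same body with $us and $them bound the same way, which is the effect the autogenerated sub is meant to have.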
See Apocalypse 6 for much more on the definition of user-defined
operators, their precedence, and their associativity. Some of it might
even still be accurate.
[Update: nowadays see Synopsis 6 instead.]
Class Composition with Roles
Objects have many kinds of relationships with other objects. One of
the pitfalls of the early OO movement was to encourage people to model
many relationships with inheritance that weren't really "isa"
relationships. Various languages have sought to redress this
deficiency in various ways, with varying degrees of success. With Perl
6 we'd like to back off a step and allow the user to define abstract
relationships between classes without committing to a particular
implementation.
More specifically, we buy the argument of the Traits paper (see
http://www.cse.ogi.edu/~black/publications/TR_CSE_02-012.pdf) that
classes should not be used both to manage objects and to manage code
reuse. It needs to be possible to separate those concerns. Since a
lot of the code that people want to reuse is that which manages non-isa
object relationships, that's what we should abstract out from classes.
That abstraction we are calling a role. Roles can encompass both
interface and implementation of object relationships. A role without
implementation degenerates to an interface. A role without interface
degenerates to privately instantiated generics. But the typical role
will provide both interface and at least a default implementation.
Unlike the Traits paper, we will allow state as part of our
implementation. This is necessary if we are to abstract out the
delegation decision. We feel that the decision to delegate rather than
compose a sub-object is a matter of implementation, and therefore that
decision should be encapsulated (or at least be allowed to be
encapsulated) in a role. This allows you to refactor a problem by
redefining one or more roles without having to doctor all the classes
that make use of those roles. This is a great way to turn your huge,
glorious "god object" into a cooperating set of objects that know how
to delegate to each other.
As in the Traits paper, roles are composed at class construction time,
and the class composer does some work to make sure the composed class
is not unintentionally ambiguous. If two methods of the same name are
composed into the same class, the ambiguity will be caught. The author
of the class has various remedies for dealing with this situation,
which we'll go into below.
From the standpoint of the typical user, a role just looks like a
"smart" include of a "partial class". They're smart in that roles have
to be well behaved in certain respects, but most of the time the naive
user can ignore the power of the abstraction.
Declaration of Roles
A role is declared much like a class, but with a "role" keyword
instead:
role Pet {
method feed ($food) {
$food.open_can();
$food.put_in_bowl();
.call();
}
}
[Update: that'd have to be "self.call()" or "$.call()" these days.]
A role may not inherit from a class. It may be composed of other
roles, however. In essence, a role doesn't know its own type yet,
because it will be composed into another type. So if you happen to
make any mention of its main type (available as "::_"), that mention is
in fact generic. Therefore the type of $self is generic. Likewise if
you refer to "SUPER", the role doesn't know what the parent classes are
yet, so that's also generic. The actual types are instantiated from
the generic types when the role is composed into the class. (You can
use the role name ("Pet") directly, but only in places where a role
name is allowed as a type constraint, not in places that declare the
type of an actual object.)
Just as the body of a class declaration is actually a method call on an
instance of the "MetaClass" class, so too the body of a role
declaration is actually a method call on an instance of the "MetaRole"
class, which is like the "MetaClass" class, with some tweaks to manage
"Role" objects instead of "Class" objects. For instance, a "Role"
object doesn't actually support a dispatcher like a "Class" object.
"MetaRole" and "MetaClass" do not inherit from each other. More likely
they both inherit from "MetaModule" or some such.
Parametric types
A role's main type is generic by default, but you can also parameterize
other types explicitly:
role Pet[Type $petfood = TableScraps] {
method feed (::($petfood) $food) {...}
}
[Update: with type sigils that would just be
role Pet[::Petfood = TableScraps] {
method feed (Petfood $food) {...}
}
now.]
Unlike certain other languages you may be altogether too familiar with,
Perl uses square brackets for parametric types rather than angles.
Within those square brackets it uses standard signature notation, so
you can also use the arguments to pass initial values, for instance.
Just bear in mind that by default any parameters to a role or class are
considered part of the name of the class when instantiated. Inasmuch
as instantiated type names are reminiscent of multimethod "long names",
you may use a colon to separate those arguments that are to be
considered part of the name from those that are just options.
Please note that these types can be as latent (or as non-latent) as you
like. Remember that what looks like compile time to you is actually
run time to the compiler, so it's free to bind types as early or late
as you tell it to, including not at all.
Interfaces
If a role merely declares methods without defining them, it degenerates
to an interface:
role Pet {
method feed ($food) {...}
method groom () {...}
method scratch (+$where) {...}
}
When such a role is included in a class, the methods then have to be
defined by the class that uses the role. Actually, each method is on
its own--a role is free to define default implementations for any
subset of the methods it declares.
Private interfaces
If a role declares private accessors, those accessors are private to
the class, not the role. The class must define any private
implementations that are not supplied by the role, just as with public
methods. But private method names are never visible outside the class
(except to its trusted proxy classes).
[Update: see S12 for refinements of this.]
Encapsulated Attributes
Unlike in the Traits paper, we allow roles to have state. Which is a
fancy way of saying that the role can define attributes, and methods
that act on those attributes, not just methods that act only on other
methods.
role Pet {
has $.collar = { Collar.new(Tag.new) };
method id () { return $.collar.tag }
method lose_collar () { undef $.collar }
}
By the way, I think that when "$.collar" is undefined, calling ".tag"
on it should merely return "undef" rather than throwing an exception
(in the same way that @foo[$x][$y][$z] returns "undef" when @foo[$x] is
undefined, and for the same reason). The "undef" object returned
should, of course, contain an unthrown exception documenting the
problem, so that if the "undef" is ever asked to provide a defined
value, it can explain why it can't do so. Or if the returned value is
tested by "//", it can participate in the resulting error message.
If you want to parameterize the initial value of a role attribute, be
sure to put a colon if you don't want the parameter to be considered
part of the long name:
role Pet[IDholder $id: $tag] {
has IDholder $.collar .= new($tag);
}
class Dog does Pet[Collar, DogLicense("fido")] {...}
class Pigeon does Pet[LegBand, RacerId()] {...}
my $dog = new Dog;
my $pigeon = new Pigeon;
In which case the long names of the roles in question are "Pet[Collar]"
and "Pet[LegBand]". So all of these are true:
$dog.does(Dog)
$dog.does(Pet)
$dog.does(Pet[Collar])
but this is false:
$dog.does(Pet[LegBand])
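A Python approximation of these long-name checks is sketched below. It models "does" with subclassing, which is an assumption of the sketch rather than how roles compose, and "pet" and "_role_cache" are hypothetical names. Only the argument before the colon takes part in the role's identity; the tag is passed separately when the attribute is initialized.

```python
_role_cache = {}

class Pet:
    """The generic, unparameterized role."""

def pet(id_holder):
    """Return the role whose long name is Pet[id_holder], creating it once."""
    name = f"Pet[{id_holder.__name__}]"
    if name not in _role_cache:
        _role_cache[name] = type(name, (Pet,), {"id_holder": id_holder})
    return _role_cache[name]

class Collar:
    def __init__(self, tag):
        self.tag = tag

class LegBand:
    def __init__(self, tag):
        self.tag = tag

class Dog(pet(Collar)):
    def __init__(self):
        # "fido" plays the part of the option after the colon: it
        # initializes the attribute but is not part of the role's name.
        self.collar = self.id_holder("fido")

dog = Dog()
print(isinstance(dog, Pet))            # models $dog.does(Pet)
print(isinstance(dog, pet(Collar)))    # models $dog.does(Pet[Collar])
print(isinstance(dog, pet(LegBand)))   # models $dog.does(Pet[LegBand])
```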
Anyway, where were we. Ah, yes, encapsulated attributes, which leads
us to...
Encapsulated private attributes
We can also have private attributes:
has Nose $:sniffer .= new();
[Update: that's "$!sniffer" now.]
And encapsulated private attributes lead us to...
Encapsulated delegation
A role can abstract the decision to delegate:
role Pet {
    has $:groomer handles «bathe groom trim» = hire_groomer();
}
Now when the "Dog" or "Cat" class incorporates the "Pet" role, it
doesn't even have to know that the ".groom" method is delegated to a
professional groomer. (See section on Delegation below.)
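The idea that the composing class need not know about the delegation can be sketched in Python with a class decorator that installs forwarding methods. "handles", "PetRole" and "Groomer" are hypothetical names, and a base class stands in for the role.

```python
def _forward(attr_name, method_name):
    def forwarder(self, *args, **kwargs):
        target = getattr(self, attr_name)
        return getattr(target, method_name)(*args, **kwargs)
    return forwarder

def handles(attr_name, method_names):
    """Class decorator: forward each named method to self.<attr_name>."""
    def decorate(cls):
        for name in method_names:
            setattr(cls, name, _forward(attr_name, name))
        return cls
    return decorate

class Groomer:
    def bathe(self):
        return "bathed"
    def groom(self):
        return "groomed"
    def trim(self):
        return "trimmed"

@handles("_groomer", ["bathe", "groom", "trim"])
class PetRole:
    def __init__(self):
        self._groomer = Groomer()    # stands in for hire_groomer()

class Dog(PetRole):
    pass

print(Dog().groom())
```

Swapping the groomer for an in-class implementation later would only touch PetRole, not Dog.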
Encapsulated Inheritance
It gets worse. Since you can specify inheritance with an "is"
declaration within a class, you can do the same with a role:
role Pet {
is Friend;
}
Note carefully that this is not claiming that a "Pet" ISA "Friend"
(though that might be true enough). Roles never inherit. So this is
only saying that whatever animal takes on the role of "Pet" gets some
methods from "Friend" that just happen to be implemented by inheritance
rather than by composition. Probably "Friend" should have been written
as a role, but it wasn't (perhaps because it was written in Some Other
Language that runs on Parrot), and now you want to pretend that it was
written as a role to get your project out the door. You don't want to
use delegation because there's only one animal involved, and
inheritance will work well enough till you can rewrite "Friend" in a
language that supports role playing.
Of course, the really funny thing is that if you go across a language
barrier like that, Perl might just decide to emulate the inheritance
with delegation anyway. But that should be transparent to you. And if
two languages manage to unify their object models within the Parrot
engine, you don't want to suddenly have to rewrite your roles and
classes.
And the really, really funny thing is that Parrot implements roles
internally with a funny form of multiple inheritance anyway...
Ain't abstraction wonderful.
Use of Roles at Compile Time
Roles are most useful at compile time, or more precisely, at class
composition time, the moment in which the "MetaClass" class is figuring
out how to put together your "Class" object. Essentially, that's while
the closure associated with your class is being executed, with a little
extra happening before and after.
A class incorporates a role with the verb "does", like this:
class Dog is Mammal does Pet does Sentry {...}
or equivalently, within the body of the class closure:
class Dog {
is Mammal;
does Pet;
does Sentry;
...
}
There is no ordering dependency among the roles, so it doesn't matter
above if "Sentry" comes before "Pet". That is because the class just
remembers all the roles and then meshes them after the closure is done
executing.
Each role's methods are incorporated into the class unless there is
already a method of that name defined in the class itself. A class's
method definition hides any role definition of the same name, so role
methods are second-class citizens. On the other hand, role methods are
still part of the class itself, so they hide any methods inherited from
other classes, which makes ordinary inherited methods third-class
citizens, as it were.
If there are no method name conflicts between roles (or with the
class), then each role's methods can be installed in the class, and
we're done. (Unless we wish to do further analysis of role
interrelationships to make sure that each role can find the methods it
depends on, in which case we can do that. But for 6.0.0 I'll be happy
if non-existent methods just fail at run time as they do now in Perl
5.)
If, however, two roles try to introduce a method of the same name (for
some definition of name), then the composition of the class fails, and
the compilation of the program blows sky high--we sincerely hope. It's
much better to catch this kind of error at compile time if you can.
And in this case, you can.
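The precedence and conflict rules above can be summarized in a short Python model. Method tables are plain dictionaries, implementations are represented by strings, and "compose" is a hypothetical function, not the real class composer.

```python
def compose(class_methods, roles, inherited):
    """Return the final name -> implementation table for a class.

    class_methods: methods written in the class body (first-class)
    roles:         role name -> that role's methods (second-class)
    inherited:     methods reachable through inheritance (third-class)
    """
    providers = {}    # method name -> names of roles that supply it
    for role_name, methods in roles.items():
        for name in methods:
            if name not in class_methods:        # the class's own method hides it
                providers.setdefault(name, []).append(role_name)
    conflicts = {n: rs for n, rs in providers.items() if len(rs) > 1}
    if conflicts:
        raise TypeError(f"conflicting role methods: {conflicts}")
    table = dict(inherited)
    for name, (role_name,) in providers.items():
        table[name] = roles[role_name][name]     # role methods hide inherited ones
    table.update(class_methods)
    return table

pet = {"feed": "Pet.feed", "shake": "Pet.shake"}
sentry = {"guard": "Sentry.guard", "shake": "Sentry.shake"}
mammal = {"feed": "Mammal.feed", "breathe": "Mammal.breathe"}

# Both roles supply "shake", so the class must define its own to compose cleanly.
dog = compose({"shake": "Dog.shake"}, {"Pet": pet, "Sentry": sentry}, mammal)
print(dog)
```

Without the class's own "shake", the same call raises an error at composition time instead of picking a winner.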
Conflict Resolution
There are several ways to solve conflicts. The first is simply to
write a class method that overrides the conflicting role methods,
perhaps figuring out which role method to call. It is allowed to use
the role name to select one of the hidden role methods:
method shake ($self: $arg) {
    given $arg {
        when Culprit { $self.Sentry::shake($arg) }
        when Paw { $self.Pet::shake($arg) }
    }
}
So even though the methods were not officially composed into the class,
they're still there--they're not thrown away.
That last example looks an awful lot like multiple dispatch, and in
fact, if you declare the roles' methods with "multi", they would be
treated as methods with different "long names", provided their
signatures were sufficiently different.
An interesting question, though, is whether the class can force two
role methods that weren't declared "multi" to behave as if they were.
Perhaps this can be forced if the class declares a signatureless multi
stub without defining it later in the class:
multi shake {...}
The Traits paper recommends providing ways of renaming or excluding one
or the other of the conflicting methods. We don't recommend that,
because it's better if you can keep both contracts through multiple
dispatch to the role methods. However, you can force renaming or
exclusion by pretending the role is a delegation:
does Pet handles [ :myshake«shake», Any ];
does Pet handles { $^name !~ "shake" };
Or something like that. (See the section on Delegation below.) If we can't
get that to work right, you can always say something like:
method shake { .Sentry::shake(@_) } # exclude Pet::shake
method handshake { .Pet::shake(@_) } # rename Pet::shake
In many ways that's clearer than trying to attach a selection syntax to
"does".
Use of Roles at Run Time (mixins)
While roles are at their most powerful at compile time, they can also
function as mixin classes at run time. The "does" binary operator
performs the feat of deriving a new class and binding the object to it:
$fido does Sentry
Actually, it only does this if $fido doesn't already do the "Sentry"
role. If it does already, this is basically a no-op. The "does"
operator works on the object in place. It would be illegal to say, for
instance,
0 does true
The "does" operator returns the object so you can nest mixins:
$fido does Sentry does Tricks does TailChasing does Scratch;
Unlike the compile-time role composition, each of these layers on a new
mixin with a new level of inheritance, creating a new anonymous class
for dear old Fido, so that a ".chase" method from "TailChasing" hides a
".chase" method from "Sentry".
(Do not confuse the binary "does" with the unary "does" that you use
inside a class definition to pull in a role.)
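The in-place layering of the binary "does" can be imitated in Python by rebinding an object to a freshly derived anonymous class. This is only an analogy, and "does" below is a hypothetical helper function.

```python
def does(obj, role):
    """Mix `role` into `obj` in place and return the same object."""
    if isinstance(obj, role):
        return obj                               # already does the role: no-op
    name = f"{type(obj).__name__}+{role.__name__}"
    # The role goes first among the bases, so its methods hide earlier layers.
    obj.__class__ = type(name, (role, type(obj)), {})
    return obj

class Dog:
    pass

class Sentry:
    def chase(self):
        return "chases intruders"

class TailChasing:
    def chase(self):
        return "chases own tail"

fido = Dog()
does(does(fido, Sentry), TailChasing)
print(fido.chase())
print(type(fido).__name__)
```

Each call adds one more level of inheritance above the previous anonymous class, which is why the later mixin's ".chase" wins.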
In contrast to "does", the "but" operator works on a copy. So you can
say:
0 but true
and you get a mixin based on a copy of 0, not the original 0, which
everyone shares. One other wrinkle is that "true" isn't, in fact, a
class name. It's an enumerated value of a bit class. So what we said
was a shorthand for something like:
0 but bit::true
[Update: that's "bool::True" these days.]
In earlier Apocalypses we talked about applying properties with "but".
This has now been unified with mixins, so any time you say:
$value but prop($x)
you're really doing something more like
$tmp = $value; # make a copy
$tmp does SomeRole; # guarantee there's a rw .prop method
$tmp.prop = $x; # set the prop method
And therefore a property is defined by a role like this:
role SomeRole {
has SomeType $.prop is rw = 1;
}
This means that when you mention "prop" in your program, something
has to know how to map that to the "SomeRole" role. That would often
be something like an enum declaration. It's illegal to use an
undeclared property. But sometimes you just want a random old property
for which the role has the same name as the property. You can declare
one with
my property answer;
and that essentially declares a role that looks something like
my role answer {
has $.answer is rw = 1;
}
Then you can say
$a = 0 but answer(42)
and you have an object of an anonymous type that "does" "answer", and
that includes a ".answer" accessor of the same name, so that if you call
"$a.answer", you'll get back 42. But $a itself has the value 0. Since
the accessor is "rw", you can also say
$a.answer = 43;
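A Python analogue of "0 but answer(42)" is sketched here. "but" is a hypothetical helper, and it assumes the value belongs to a built-in type such as int or str whose constructor accepts an existing value.

```python
def but(value, prop_name, prop_value):
    """Return a copy of `value` carrying a read/write property."""
    mixed = type(f"{type(value).__name__}+{prop_name}", (type(value),), {})
    copy = mixed(value)                  # the original value is left untouched
    setattr(copy, prop_name, prop_value)
    return copy

a = but(0, "answer", 42)
print(a == 0, a.answer)
a.answer = 43                            # the property is rw
print(a.answer)
```

The copy still compares equal to 0, while a plain 0 elsewhere in the program has no "answer" property at all.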
There's a corresponding assignment operator:
$a but= tainted;
That avoids copying $a before tainting it. It basically means the same
thing as
$a does taint::tainted
For more on enumerated types, see Enums below.
Traits
Here we're talking about Perl's traits (as in compile-time properties),
not Traits (as in the Traits paper).
Traits can be thought of as roles gone wrong. Like roles, they can
function as straightforward mixins on container objects at compile
time, but they can also cheat, and frequently do. Unlike roles, traits
are not constrained to play fair with each other. With traits, it's
both "first come, first served", and "he who laughs last laughs best".
Traits are applied one at a time to their container victim, er, object,
and an earlier trait can throw away information required by a later
trait. Contrariwise, a later trait can overrule anything done by an
earlier trait--except of course that it can't undestroy information
that has been totally forgotten by the earlier trait.
You might say that "role" is short for "role model", while "trait" is
short for "traitor". In a nutshell, roles are symbiotes, while traits
are parasites. Nevertheless, some parasites are symbiotic, and some
symbiotes are parasitic. Go figure...
All that being said, well-behaved traits are really just roles applied
to declared items like containers or classes. It's the declaration of
the item itself that makes traits seem more permanent than ordinary
properties. The only reason we call them "traits" rather than
"properties" is to continually remind people that they are, in fact,
applied at compile time. (Well, and so that we can make bad puns on
"traitor".)
Even ill-behaved traits should add an appropriately named role to the
container, however, in case someone wants to look at the metadata
properties of the container.
Traits are generally inflicted upon the "traitee" with the "is"
keyword, though other modalities are possible. When the compiler sees
words like "is" or "will" or "returns" or "handles", or special
constructs like signatures and body closures, it calls into an
associated trait handler which applies the role to the item as a mixin,
and also does any other traitorous magic that needs doing.
To define a trait handler for an "is xxx" trait, define one or more
multisubs into a property role like this:
role xxx {
has Int $.xxx;
multi sub trait_auxiliary:is(xxx $trait, Class $container: ?$arg) {...}
multi sub trait_auxiliary:is(xxx $trait, Any $container: ?$arg) {...}
}
[Update: That's "trait_auxiliary:<is>" now.]
Then it can function as a trait. A well-behaved trait handler will say
$container does xxx($arg);
somewhere inside to set the metadata on the container correctly. Then
not only can you say
class MyClass is xxx(123) {...}
but you'll also be able to say
if MyClass.meta.xxx == 123 {...}
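A loose Python analogy for a well-behaved "is xxx" trait is a class decorator that runs when the class is declared and records its argument as metadata. The "meta" dictionary and "is_xxx" are hypothetical stand-ins, not the MetaClass interface.

```python
def is_xxx(arg):
    """Build a handler that marks a class with the xxx trait."""
    def apply_trait(cls):
        meta = dict(getattr(cls, "meta", {}))   # do not share metadata with a parent
        meta["xxx"] = arg
        cls.meta = meta
        return cls
    return apply_trait

@is_xxx(123)
class MyClass:
    pass

if MyClass.meta["xxx"] == 123:
    print("MyClass is xxx(123)")
```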
Since a class can function as a role when it comes to parameter type
matching, you can also say:
class MyBase {
multi sub trait_auxiliary:is(MyBase $base, Class $class: ?$arg) {...}
multi sub trait_auxiliary:is(MyBase $tied, Any $container: ?$arg) {...}
}
These capture control if "MyBase" wants to capture control of how it
gets used by any class or container. But usually you can just let it
call the generic defaults:
multi sub *trait_auxiliary:is(Class $base, Class $class: ?$arg) {...}
which adds $base to the "isa" list of $class, or
multi sub *trait_auxiliary:is(Class $tied, Any $container: ?$arg) {...}
which sets the "tie" type of the container to the implementation type
in $tied.
In any event, if the trait supplies the optional argument, that comes
in as $arg. (It's probably something unimportant, like the function
body...) Note that unlike "pair options" such as ":wag", traits do
not necessarily default to the value 1 if you don't supply the
argument. This is consistent with the notion that traits don't
generally do something passive like setting a value somewhere, but
something active like totally screwing up the structure of your
container.
Most traits are introduced by use of a "helping verb", which could be
something like "is", or "will", or "can", or "might", or "should", or
"does". We call these helping verbs "trait auxiliaries". Here's "will",
which (being syntactic sugar) merely delegates back to "is":
multi sub *trait_auxiliary:will($trait, $container: &arg) {
trait_auxiliary:is($trait, $container, &arg);
}
Note the declaration of the argument as a non-optional reference to a
closure. This is what allows us to say:
my $dog will eat { anything() };
rather than having to use parens:
my $dog is eat({ anything() });
Other traits are applied with a single word, and we call one of those a
"trait verb". For instance, the "returns" trait described in
Apocalypse 6 is defined something like this:
role returns {
has ReturnType $.returns;
multi sub trait_verb:returns($container: ReturnType $arg) {
$container does returns($arg);
}
...
}
[Update: Make that "trait_verb:<returns>" now.]
Note that the argument is not optional on ""returns"".
Earlier we defined the "xxx" trait using multi sub definitions:
role xxx {
has Int $.xxx;
multi sub trait_auxiliary:is(xxx $trait, Class $container: ?$arg) {...}
multi sub trait_auxiliary:is(xxx $trait, Any $container: ?$arg) {...}
}
This is one of those situations in which you may really want single-
dispatch methods:
role xxx {
has Int $.xxx;
method trait_auxiliary:is(xxx $trait: Class $container, ?$arg) {...}
method trait_auxiliary:is(xxx $trait: Any $container, ?$arg) {...}
}
Some traits are control freaks, so they want to make sure that anything
mentioning them comes through their control. They don't want something
dispatching to another trait's "trait_auxiliary:is" method just because
someone introduced a cute new container type they don't know about.
That other trait would just mess things up.
Of course, if a trait is feeling magnanimous, it should just go ahead
and use multi subs. Since the multi-dispatcher takes into account
single-dispatch methods, and the distance of an exact match on the
first argument is 0, the dispatcher will generally respect the wishes
of both the paranoid and the carefree.
Note that we included "does" in our list of "helping verbs". Roles
actually implement themselves using the trait interface, but the
generic version of "trait_auxiliary:does" defaults to doing proper
roley things rather than proper classy things or improper traitorous
things. So yes, you could define your own "trait_auxiliary:does" and
turn your nice role traitorous. That would be...naughty.
But apart from how you typically invoke them, traits and roles are
really the same thing. Just like the roles on which they're based, you
may neither instantiate nor inherit from a trait. You may, however,
use their names as type constraints on multimethod signatures and such.
As with well-behaved roles, they should define attributes or methods
that show up as metadata properties where that's appropriate. Unlike
compile-time roles, which all flatten out in the same class, compile-
time traits are applied one at a time, like mixin roles. You can, in
fact, apply a trait to a container at run time, but if you do, it's
just an ordinary mixin role. You have to call the appropriate
"trait_auxiliary:is()" routine yourself if you want it to do any extra
shenanigans. The compiler won't call it for you at run time like it
would at compile time.
When you define a helping verb such as "is" or "does", it not only
makes it a postfix operator for declarations, but a unary operator
within class and role closures. Likewise, declarative closure blocks
like "BEGIN" and "INIT" are actually trait verbs, albeit ones that can
add multiple closures to a queue rather than adding a single property.
This implies that something like
sub foo {
LEAVE {...}
...
}
could (except for scoping issues) equivalently be written:
sub foo LEAVE {...} {
...
}
Though why you'd want to do that, I don't know. Hmm, if we really
generalize trait verbs like that, then you could also write things
like:
sub foo {
is signature ('int $x');
is cached;
returns Int;
...
}
That's gettin' a little out there. Maybe we won't generalize it quite
that far...
Delegation
Delegation is the art of letting someone else do your work for you.
The fact that you consider it "your" work implies that delegation is
actually a means of taking credit in advance for what someone else is
going to do. In terms of objects, it means pretending that some other
object's methods are your own. Now, as it happens, you can always do
that by hand simply by writing your own methods that call out to
another object's methods of the same name. So any shorthand for doing
that is pure syntactic sugar. That's what we're talking about here.
Delegation in this sugary sense always requires there to be an
attribute to keep a reference to the object we're delegating to. So
our syntactic relief will come in the form of annotations on a ""has""
declaration. We could have decided to instead attach annotations to
each method declaration associated with the attribute, but by the time
you do this, you've repeated so much information that you almost might
as well have written the non-sugary version yourself. I know that for
a fact, because that's how I originally proposed it. ":-)"
Delegation is specified by a "handles" trait verb with an argument
specifying one or more method names that the current object and the
delegated object will have in common:
has $:tail handles 'wag';
Since the method name (but nothing else) is known at class construction
time, the following ".wag" method is autogenerated for you:
method wag (*@args is context(Lazy)) { $:tail.wag(*@args) }
(It's necessary to specify a "Lazy" context for the arguments to such
a delegator method because the actual signature is supplied by the
tail's ".wag" method, not your method.)
[Update: that's more like:
method wag (\$args) { $!tail.wag(*$args) }
nowadays.]
So as you can see, the delegation syntax already cuts our typing in
half, not to mention the reading. The win is even greater when you
specify multiple methods to delegate:
has $:legs handles «walk run lope shake pee»;
Or equivalently:
has $:legs handles ['walk', 'run', 'lope', 'shake', 'pee'];
You can also say things like
my @legmethods := «walk run lope shake pee»;
has $:legs handles (@legmethods);
since the ""has"" declaration is evaluated at class construction time.
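As a rough model of what "handles" generates, here is a sketch in Python rather than Perl 6, since the syntax above is still provisional. The "handles" decorator, the "Legs" class and the "_legs" attribute are all hypothetical names invented for the illustration. The point is only that forwarding methods are installed once, at class construction time, and pass their arguments through unexamined:

```python
def handles(attr, *names):
    """Class decorator: forward each named method to the object held in `attr`."""
    def make_forwarder(name):
        def forwarder(self, *args, **kwargs):
            # Pass arguments through untouched; the delegatee's own
            # signature decides what is acceptable.
            return getattr(getattr(self, attr), name)(*args, **kwargs)
        forwarder.__name__ = name
        return forwarder

    def decorate(cls):
        # Runs once, when the class is built, not on every call.
        for name in names:
            setattr(cls, name, make_forwarder(name))
        return cls
    return decorate


class Legs:
    def walk(self):
        return "walking"

    def run(self, speed="fast"):
        return f"running {speed}"


@handles("_legs", "walk", "run")
class Dog:
    def __init__(self):
        self._legs = Legs()


dog = Dog()
print(dog.walk())
print(dog.run(speed="slowly"))
```

Because the method names are known up front, the forwarders are ordinary methods of "Dog" and need no lookup tricks at call time.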
Of course, it's illegal to call the outer method unless the attribute
has been initialized to an object of a type supporting the method. So
a declaration that makes a new delegatee at object build time might be
specified like this:
has $:tail handles 'wag' will build { Tail.new(*%_) };
or, equivalently,
has $:tail handles 'wag' = { Tail.new(*%_) };
This automatically performs
$:tail = Tail.new(*%_);
when "BUILD" is called on a new object of the current class (unless
"BUILD" initializes $:tail to some other value). Or, since you might
want to declare the type of the attribute without duplicating it in the
default value, you can also say
has Tail $:tail handles 'wag' = { .new(*%_) };
or
has Tail $:tail handles 'wag' will build { .new(*%_) };
Note that putting a "Tail" type on the attribute does not necessarily
mean that the method is always delegated to the "Tail" class. The
dispatch is still based on the run-time type of the object, not the
declared type. So
has Tail $:tail handles 'wag' = { LongTail.new(*%_) };
delegates to the "LongTail" class, not the "Tail" class. Of course,
you'll get an exception at build time if you try to say:
has Tail $:tail handles 'wag' = { Dog.new(*%_) };
since "Dog" is not derived from "Tail" (whether or not the tail can wag
the dog).
We declare $:tail as a private attribute here, but "$.tail" would have
worked just as well. A "Dog"'s tail does seem to be a public
interface, after all. Kind of a read-only accessor.
Wildcard Delegation
We've seen that the argument to ""handles"" can be a string or a list
of strings. But any argument or subargument that is not a string is
considered to be a smartmatch selector for methods. So you can say:
has $:fur handles /^get_/;
and then you can do the ".get_wet" or ".get_fleas" methods (presuming
there are such), but you can't call the ".shake" or ".roll_in_the_dirt"
methods. (Obviously you don't want to delegate the ".shake" method
since that means something else when applied to the "Dog" as a whole.)
If you say
has $:fur handles Groomable;
then you get only those methods available via the "Groomable" role or
class.
Wildcard matches are evaluated only after it has been determined that
there's no exact match to the method name. They therefore function as
a kind of autoloading in the overall pecking order. If the class also
has an "AUTOLOAD", it is called only if none of the wildcard
delegations match. (An "AUTOMETHDEF" is called much earlier, since it
knows from the stub declarations whether there is supposed to be a
method of that name. So you can think of explicit delegation as a kind
of autodefine, and wildcard delegation as a kind of autoload.)
When you have multiple wildcard delegations to different objects, it's
possible to have a conflict of method names. Wildcard method matches
are evaluated in order, so the earliest one wins. (Non-wildcard method
conflicts can be caught at class composition time.)
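The pecking order can be imitated in Python, whose "__getattr__" hook is likewise consulted only after ordinary method lookup has failed. This is a sketch under assumptions, not how Perl 6 implements it: the "Fur" and "Skin" classes and the "_wildcards" list are made up for the example, and a selector is taken to match only if the delegatee actually has the method.

```python
import re


class Fur:
    def get_wet(self):
        return "soggy fur"

    def get_fleas(self):
        return "itchy fur"

    def shake(self):
        return "fur flies everywhere"


class Skin:
    def get_wet(self):
        return "damp skin"

    def get_sunburn(self):
        return "red skin"


class Dog:
    def __init__(self):
        fur, skin = Fur(), Skin()
        # (selector, delegatee) pairs, kept in declaration order.
        self._wildcards = [(re.compile(r"^get_"), fur),
                           (re.compile(r"^get_"), skin)]

    def shake(self):
        return "the whole dog shakes"

    def __getattr__(self, name):
        # Reached only when there is no exact match on Dog itself.
        for pattern, target in self.__dict__.get("_wildcards", ()):
            if pattern.search(name) and hasattr(target, name):
                return getattr(target, name)  # earliest wildcard wins
        raise AttributeError(name)


dog = Dog()
print(dog.get_wet())      # both delegatees have it; Fur was declared first
print(dog.get_sunburn())  # only Skin has it
print(dog.shake())        # exact match on Dog, never delegated
```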
Renaming Delegated Methods
If, where you would ordinarily specify a string, you put a pair, then
the pair maps the method name in this class to the method name in the
other class. If you put a hash, each key/value pair is treated as such
a mapping. Such mappings are not considered wildcards.
has $:fur handles { :shakefur«shake» :scratch«get_fleas» };
Perhaps that reads better with the old pair notation:
has $:fur handles { shakefur => 'shake', scratch => 'get_fleas' };
You can do a wildcard renaming, but not with pairs. Instead do
smartmatch with a substitution:
has $:fur handles (s/^furget_/get_/);
As always, the left-to-right mapping is from this class to the other
one. The pattern matching is working on the method name passed to us,
and the substituted method name is used on the class we delegate to.
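Both kinds of renaming can be sketched in Python. The names here ("Fur", "_fur", "_make_renamed") are hypothetical. Explicit pairs become real methods installed on the class, since they are not wildcards, while the substitution is applied to the incoming method name only when nothing else matched:

```python
import re


class Fur:
    def shake(self):
        return "fur shakes"

    def get_fleas(self):
        return "fleas acquired"

    def get_wet(self):
        return "fur is wet"


class Dog:
    def __init__(self):
        self._fur = Fur()

    def __getattr__(self, name):
        # Wildcard renaming: rewrite the name we were called with,
        # then look the rewritten name up on the delegatee.
        new_name, count = re.subn(r"^furget_", "get_", name)
        fur = self.__dict__.get("_fur")
        if count and fur is not None and hasattr(fur, new_name):
            return getattr(fur, new_name)
        raise AttributeError(name)


def _make_renamed(their_name):
    def method(self, *args, **kwargs):
        return getattr(self._fur, their_name)(*args, **kwargs)
    return method


# Explicit pairs: name in this class => name in the other class.
for ours, theirs in {"shakefur": "shake", "scratch": "get_fleas"}.items():
    setattr(Dog, ours, _make_renamed(theirs))

dog = Dog()
print(dog.shakefur())
print(dog.scratch())
print(dog.furget_wet())
```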
Delegation without an Attribute
Ordinarily delegation is based on an attribute holding an object
reference, but there's no reason in principle why you have to use an
attribute. Suppose you had a "Dog" with two tails. You can delegate
based on a method call:
method select_tail handles «wag hang» {...}
The arguments are sent to both the delegator and delegatee method. So
when you call
$dog.wag(:fast)
you're actually calling
$dog.select_tail(:fast).wag(:fast)
If you use a wildcard delegation based on a method, you should be aware
that it has to call the method before it can even decide whether
there's a valid method call to the delegatee or not. So it behooves
you not to get too fancy with "select_tail()", since it might just have
to throw all that work away and go on to the next wildcard
specification.
Delegation of Handlers
If your delegation object happens to be an array:
has @:handlers handles 'foo';
then something cool happens. <cool rays> In this case Perl 6 assumes
that your array contains a list of potential handlers, and you just
want to call the first one that succeeds. This is not considered a
wildcard match unless the "handles" argument forces it to be.
Note that this is different from the semantics of a hyper method such
as "@objects».foo()", which will try to call the method on every object
in @objects. If you want to do that, you'll just have to write your
own method:
has @:ears;
method twitchears () { @:ears».twitch() }
Life is hard.
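The difference between the two behaviors is easy to see in a Python sketch. The text does not pin down what it means for a handler to "succeed", so this example makes an assumption: a handler that lacks the method, or that raises "NotImplementedError", is treated as declining. All class names are invented for the illustration.

```python
class Dog:
    def __init__(self, handlers, ears):
        self._handlers = handlers
        self._ears = ears

    def foo(self, *args, **kwargs):
        # Handler-style delegation: stop at the first handler that succeeds.
        for handler in self._handlers:
            method = getattr(handler, "foo", None)
            if method is None:
                continue              # assumption: no such method means "declined"
            try:
                return method(*args, **kwargs)
            except NotImplementedError:
                continue              # assumption: this also means "declined"
        raise AttributeError("no handler could handle foo")

    def twitchears(self):
        # Hyper-style call: invoke the method on every object.
        return [ear.twitch() for ear in self._ears]


class Mute:
    pass


class Declines:
    def foo(self):
        raise NotImplementedError


class Works:
    def foo(self):
        return "handled by Works"


class Late:
    def foo(self):
        return "never reached"


class Ear:
    def __init__(self, side):
        self.side = side

    def twitch(self):
        return f"{self.side} ear twitches"


dog = Dog([Mute(), Declines(), Works(), Late()], [Ear("left"), Ear("right")])
print(dog.foo())
print(dog.twitchears())
```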
Hash-Based Redispatch
If your delegation object happens to be a hash:
has %:objects handles 'foo';
then the hash provides a mapping from the string value of "self" to the
object that should be delegated to:
has %:barkers handles "bark" =
(Chihuahua => $yip,
Beagle => $yap,
Terrier => $arf,
StBernard => $woof,
);
method prefix:~ () { return "$.breed" }
[Update: That's "prefix:<~>" now.]
If the string is not found in the hash, a ""next METHOD"" is
automatically performed.
Again, this construct is not necessarily considered a wildcard. In the
example above we know for a fact that there's supposed to be a ".bark"
method somewhere, therefore a specific method can be autogenerated in
the current class.
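In Python terms, a hash-based redispatch looks something like the following sketch. The "Barker" and "Animal" classes are hypothetical, "__str__" plays the part of the stringification method, and "super()" stands in for "next METHOD" on the assumption that the next candidate is an inherited method:

```python
class Barker:
    def __init__(self, sound):
        self.sound = sound

    def bark(self):
        return self.sound


class Animal:
    def bark(self):
        return "generic noise"


class Dog(Animal):
    # String value of the invocant => object to delegate to.
    _barkers = {
        "Chihuahua": Barker("yip"),
        "Beagle": Barker("yap"),
        "Terrier": Barker("arf"),
        "StBernard": Barker("woof"),
    }

    def __init__(self, breed):
        self.breed = breed

    def __str__(self):
        return self.breed

    def bark(self):
        # A specific, statically known method: no wildcard involved.
        target = self._barkers.get(str(self))
        if target is None:
            return super().bark()   # stands in for "next METHOD"
        return target.bark()


print(Dog("Beagle").bark())
print(Dog("Poodle").bark())
```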
Relationship to Roles
Delegation is a means of including a set of methods into your class.
Roles can also include a set of methods in your class, but the
difference is that what a role includes happens at class composition
time, while delegation is much more dynamic, depending on the current
state of the delegating attribute (or method).
But there's no reason you can't have your cake and eat it too, because
roles are specifically designed to allow you to pull in delegations
without the class even being aware of the fact that it's delegating.
When you include a role, you're just signing up for a set of methods,
with maybe a little state thrown in. You don't care whether those
methods are defined directly, or indirectly. The role manages that.
In fact, this is one of the primary motivators for including roles in
the design of Perl 6. As a named abstraction, a role lets you refactor
all the classes using that role without changing any of the classes
involved. You can turn your single "god" object into a set of nicely
cooperating objects transparently. Well, you have to do the
composition using roles first, and that's not transparent.
Note that all statically named methods are dispatched before any
wildcard methods, regardless of whether the methods came from a role or
the class itself. (Inherited methods also come before wildcard methods
because we order all the cachable method dispatches before all the non-
cachable ones. But see below.) So the lookup order is:
1. This class's declared methods (including autodefs and delegations)
2. An included role's declared methods (including autodefs and
delegations)
3. Normal inherited methods (including autodefs and delegations of the
parent class)
4. Wildcard delegated methods in this class (or failing that, from any
inherited class that does wildcard delegations)
5. Methods autoloaded by an autoloader defined in this class (or
failing that, an autoloader from any inherited class)
Note that any method that is stubbed (declared but not yet defined) in
steps 1 or 2 skips straight to step 4, because it means this class
thinks it "owns" a method of that name. (At this point Perl 5 would
skip straight to step 5, but Perl 6 still wants to do wildcard
delegation before falling back on inherited autoloading.)
Anonymous Delegation for ISA Emulation
When you inherit from a class with a different layout policy, Perl has
to emulate inheritance via anonymous delegation. In this case it
installs a wildcard delegation for you. According to the list above,
this gives precedence to all methods with the same layout policy over
all methods with a different layout policy. This might be a feature,
especially when calling cross-language. Then again, maybe it isn't.
There is no ""has"" variable for such an anonymous delegation. Its
delegated object is stored as a property on the class's entry in the
ISA list, probably. (Or we could autogenerate an attribute whose name
is related to the class name, I suppose.)
Since one of the primary motivations for allowing this is to make it
possible to call back and forth between Perl 5 and Perl 6 objects, we
need to make that as transparent as possible. When a Perl 6 object
inherits from a Perl 5 object, it is emulated with delegation. The
invocant passed into the Perl 5 (Ponie) object looks like a Perl 5
object to Perl 5. However, if the Perl 5 object passes that as an
invocant back into Perl 6, it has to go back to looking like a Perl 6
object to Perl 6, or our emulation of inheritance is suboptimal. When
a Ponie object accesses its attributes through what it thinks is a hash
reference, it really has to call the appropriate Perl 6 accessor
function if the object comes from Perl 6. Likewise, when Perl 6 calls
an accessor on a Perl 5 object, it has to translate that method call
into a hash lookup--presuming that the Perl 5 object is implemented as
a blessed hash.
Other language boundaries may or may not do similar tricks. Python's
attributes suffer from the same misdesign as Perl 5's attributes. (My
fault for copying Python's object model. ":-)" So that'd be a good
place for a similar policy.
So we can almost certainly emulate inheritance with delegation, albeit
with some possible misordering of classes if there are duplicate method
names. However, the hard part is constructing objects. Perl 5 doesn't
enforce a policy of named arguments for its constructors, so it is
difficult for a Perl 6 "BUILDALL" routine to have any automatic way to
call a Perl 5 constructor. It's tempting to install glue code into the
Perl 6 class that will do the translation, but that's really not a good
idea, because someday the Perl 5 class may eventually get translated to
a Perl 6 class, and your glue code will be useless, or worse.
So the right place to put the glue is actually back into the Perl 5
class. If a Perl 5 class defines a "BUILD" subroutine, it will be
assumed that it properly handles named pairs in Perl 5's even/odd list
format. That will be used in lieu of any predefined constructor named
""new"" or anything else.
If there is no "BUILD" routine in the Perl 5 package, but there is a
""use fields"" declaration, then we can autogenerate a rudimentary
"BUILD" routine that should suffice for most scalar attributes.
Types and Subtypes
I've always really liked the Ada distinction between types and
subtypes. A type is something that adds capabilities, while a subtype
is something that takes away capabilities. Classes and roles generally
function as types in Perl 6. In general you don't want to make a
subclass that, say, restricts your integers to only even numbers,
because then you've violated Liskov substitutability. In the same way
that we force role composition to be "before" classes, we will force
subtyping constraints to be "after" classes. In both cases we force it
by a declarator change so that you are unlikely to confuse a role with
a class, or a class with a subtype. And just as you aren't allowed to
derive a role from a class, you aren't allowed to derive a class from a
constrained type.
On the other hand, a bit confusingly, it looks like subtyping will be
done with the "type" keyword, since we aren't using that word yet.
To remind people that a subtype of a class is just a constrained alias
for the class, we avoid the "is" word and declare a type using a "::="
compile-time alias, like this:
type Str_not2b ::= Str where /^[isnt|arent|amnot|aint]$/;
The "::=" doesn't create the type, nor in fact does the "type" keyword.
It's actually the "where" that creates the type. The "type" keyword
just marks the name as "not really a classname" so that you don't
accidentally try to derive from it.
[Update: I decided I don't like the forced use of "::=", nor do I like
the confusion engendered by use of the word "type" to mean "subtype", so
the syntax is now any of:
my subset Str_not2b of Str where /^[isnt|arent|amnot|aint]$/;
my Str subset Str_not2b where /^[isnt|arent|amnot|aint]$/;
.]
Since a type is "post-class-ical", there's really no such thing as an
object blessed into a type. If you try it, you'll just end up with an
object blessed into whatever the underlying unconstrained class is, as
far as inheritance is concerned. A type is not a subclass. A type is
primarily a handy way of sneaking smartmatching into multiple dispatch.
Just as a role allows you to specify something more general than a
class, a type allows you to specify something more specific than a
class.
While types are primarily intended for restricting parameter types for
multiple dispatch, they also let you impose preconditions on
assignment. Basically, if you declare any container with a subtype,
Perl will check the constraint against any value you might try to bind
or assign to the container.
type Str_not2b ::= Str where /^[isnt|arent|amnot|aint]$/;
type EvenNum ::= Num where { $^n % 2 == 0 };
my Str_not2b $hamlet;
$hamlet = 'isnt'; # Okay because 'isnt' ~~ /^[isnt|arent|amnot|aint]$/
$hamlet = 'amnt'; # Bzzzzzzzt! 'amnt' !~ /^[isnt|arent|amnot|aint]$/
my EvenNum $n;
$n = 2; # Okay
$n = -2; # Okay
$n = 0; # Okay
$n = 3; # Bzzzzzzzt
It's perfectly legal to base one subtype on another. It merely adds an
additional constraint.
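The checking behavior can be modelled in Python with a small descriptor. This is only a sketch of the idea: "Subset", "Constrained", "Play" and "try_assign" are names made up for the example, and "Num" is approximated as Python's "int" or "float".

```python
import re


class Subset:
    """A base class plus a predicate; adds no capabilities, only a constraint."""
    def __init__(self, base, where):
        self.base, self.where = base, where

    def accepts(self, value):
        return isinstance(value, self.base) and bool(self.where(value))

    def refine(self, where):
        # Basing one subtype on another merely adds a constraint.
        return Subset(self.base, lambda v: self.where(v) and where(v))


class Constrained:
    """Descriptor that checks the constraint on every assignment."""
    def __init__(self, subset):
        self.subset = subset

    def __set_name__(self, owner, name):
        self.slot = "_" + name

    def __get__(self, obj, owner=None):
        return self if obj is None else getattr(obj, self.slot)

    def __set__(self, obj, value):
        if not self.subset.accepts(value):
            raise TypeError(f"{value!r} fails the constraint")
        setattr(obj, self.slot, value)


Str_not2b = Subset(str, lambda s: re.fullmatch(r"isnt|arent|amnot|aint", s))
EvenNum = Subset((int, float), lambda n: n % 2 == 0)
PositiveEven = EvenNum.refine(lambda n: n > 0)


class Play:
    hamlet = Constrained(Str_not2b)
    n = Constrained(EvenNum)


def try_assign(obj, attr, value):
    try:
        setattr(obj, attr, value)
        return True
    except TypeError:
        return False


p = Play()
print(try_assign(p, "hamlet", "isnt"), try_assign(p, "hamlet", "amnt"))
print(try_assign(p, "n", -2), try_assign(p, "n", 3))
```

Note that the stored value is still a plain string or number. Nothing is ever "blessed into" the subset, which exists only as a check.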
It's possible to use an anonymous subtype in a signature:
use Rules::Common :profanity;
multi sub mesg (Str where /<profanity>/ $mesg is copy) {
$mesg ~~ s:g/<profanity>/[expletive deleted]/;
print $MESG_LOG: $mesg;
}
multi sub mesg (Str $mesg) {
print $MESG_LOG: $mesg;
}
Given a set of multimethods that would "tie" on the actual classes of
the arguments, a multimethod with a matching constraint will be
preferred over an equivalent one with no constraint. So the first
"mesg" above is preferred if the constraint matches, and otherwise the
second is preferred. However, if two multis with constraints match
(and are otherwise equivalent), it's just as if you'd called any other
set of ambiguous multimethods, and one of them had better be marked as
the default, or you die.
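The tie-breaking rule can be written down as a toy dispatcher in Python. It is a sketch under a simplifying assumption: every candidate already ties on the class of its single argument, so only the constraints and the default marker matter. "Candidate", "dispatch" and the "PROFANITY" pattern are invented for the illustration.

```python
import re
from typing import Callable, NamedTuple, Optional

PROFANITY = re.compile(r"\b(?:darn|heck)\b")  # hypothetical stand-in for <profanity>


class Candidate(NamedTuple):
    type_: type
    where: Optional[Callable[[object], bool]]  # None means "no constraint"
    body: Callable[[object], object]
    is_default: bool = False


def dispatch(candidates, arg):
    matching = [c for c in candidates
                if isinstance(arg, c.type_) and (c.where is None or c.where(arg))]
    # A candidate whose constraint matches beats an unconstrained one.
    constrained = [c for c in matching if c.where is not None]
    pool = constrained or matching
    if not pool:
        raise TypeError("no candidate accepts this argument")
    if len(pool) > 1:
        # Still tied: exactly one must be marked as the default, or we die.
        pool = [c for c in pool if c.is_default]
        if len(pool) != 1:
            raise TypeError("ambiguous dispatch")
    return pool[0].body(arg)


mesg_candidates = [
    Candidate(str, lambda s: PROFANITY.search(s) is not None,
              lambda s: PROFANITY.sub("[expletive deleted]", s)),
    Candidate(str, None, lambda s: s),
]


def mesg(text):
    return dispatch(mesg_candidates, text)


print(mesg("oh heck"))
print(mesg("hello"))
```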
We say that types are "post-class-ical", but since you can base them
off of any class including "Any", they are actually rather orthogonal
to the class system.
[Update: Everywhere the preceding section says "type", change to
"subtype", except for the keyword, which changes to "subset". And
change "::=" to "of".]
Enums
An enum functions as a subtype that is constrained to a single value.
(When a subtype is constrained to a single value, it can be used for
that value.) But rather than declaring it as:
type DayOfWeek ::= Int where 0..6;
type DayOfWeek::Sunday ::= DayOfWeek where 0;
type DayOfWeek::Monday ::= DayOfWeek where 1;
type DayOfWeek::Tuesday ::= DayOfWeek where 2;
type DayOfWeek::Wednesday ::= DayOfWeek where 3;
type DayOfWeek::Thursday ::= DayOfWeek where 4;
type DayOfWeek::Friday ::= DayOfWeek where 5;
type DayOfWeek::Saturday ::= DayOfWeek where 6;
we allow a shorthand:
type DayOfWeek ::= int enum
«Sunday Monday Tuesday Wednesday Thursday Friday Saturday»;
[Update: The syntax is now more like existing declarations:
our int enum DayOfWeek
<Sunday Monday Tuesday Wednesday Thursday Friday Saturday>;
where "int" is usually omitted.]
Type "int" is the default enum type, so that can be:
type DayOfWeek ::= enum
«Sunday Monday Tuesday Wednesday Thursday Friday Saturday»;
[Update: Now just "enum DayOfWeek <...>".]
The enum installer inspects the strings you give it for things that
look like pairs, so to number your days from 1 to 7, you can say:
type DayOfWeek ::= enum
«:Sunday(1) Monday Tuesday Wednesday Thursday Friday Saturday»;
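For comparison, Python's standard "enum" module offers a similar shorthand, with an explicit "start" argument where Perl 6 uses a pair on the first name. This is only an analogy for the numbering behavior (the "NAMES" and "DayFromOne" names are made up here), not a model of enums as single-value subtypes:

```python
from enum import IntEnum

NAMES = "Sunday Monday Tuesday Wednesday Thursday Friday Saturday"

# Numbered from 0, like the default enum above.
DayOfWeek = IntEnum("DayOfWeek", NAMES, start=0)

# Numbered from 1, like the :Sunday(1) form.
DayFromOne = IntEnum("DayFromOne", NAMES, start=1)

print(int(DayOfWeek.Sunday), int(DayOfWeek.Saturday))
print(int(DayFromOne.Sunday), int(DayFromOne.Saturday))
print(DayOfWeek(3).name)
```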
You can import individual enums into your scope where they will
function like argumentless constant subs. However, if there is a name
collision with a sub or other enum, you'll have to disambiguate.
Unambiguous enums may be used as a property on the right side of a
"but", and the enum type can be intuited from it to make sure the
object in question has the right semantics mixed in. Two builtin enums
are:
type bool ::= bit enum «false true»;
type taint ::= bit enum «untainted tainted»;
[Update: Now just:
our bit enum *bool is <False True>;
our bit enum *taint is <Untainted Tainted>;
.]
Open vs Closed Classes
By default, classes in Perl are left open. That is, you can add more
methods to them, though you have to be explicit that that is what
you're doing:
class Object is extended {
method wow () { say "Wow, I'm an object." }
}
Otherwise you'll get a class redefinition error.
Likewise, a "final" class (to use the Java term) is one that you know
will never be derived from, let alone mucked with internally.
Now, it so happens that leaving all your classes open is not terribly
conducive to certain kinds of optimization (let alone encapsulation).
From the standpoint of the compiler, you'd like to be able to say, "I
know this class will never be derived from or modified, so I can do
things like access my attributes directly without going through virtual
accessors." We were, in fact, tempted to make closed classes the
default. But this breaks in frameworks like mod_perl where you cannot
predict in advance which classes will want to be extended or derived
from.
Some languages solve this (or think they solve it) by letting classes
declare themselves to be closed and/or final. But that's actually a
bad violation of OO principles. It should be the users of a class that
decide such things--and decide it for themselves, not for others. As
such, there has to be a consensus among all users of a class to close
or finalize it. And as we all know, consensus is difficult to achieve.
Nevertheless, the Perl 6 approach is to give the top-level application
the right to close (and finalize) classes. But we don't do this by
simply listing the classes we want to close. Instead, we use the
sneaky strategy of switching the default to closed and then listing the
classes we want to stay open.
The benefit of this is that modules other than the top level can simply
list all the classes that they know should stay open. In an open
framework, these are, at worst, no-ops, and they don't cause classes to
close that other modules might want to remain open. If any module
requests a class to stay open, it stays open. If any module requests
that a class remain available as a base class, it remains available.
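The decision rule is simple enough to state as a function. Here is a Python sketch in which the function name, the request format and the module names are all hypothetical. The main program supplies the defaults, and any single request from any module is enough to keep a class open or available as a base:

```python
def decide_class_policies(classes, default_closed, default_final, requests):
    """requests: iterable of (module, class_name, "open" or "base") tuples."""
    keep_open = {cls for _, cls, what in requests if what == "open"}
    keep_base = {cls for _, cls, what in requests if what == "base"}
    return {
        cls: {
            # Closed only if the default says so AND nobody objected.
            "closed": default_closed and cls not in keep_open,
            "final": default_final and cls not in keep_base,
        }
        for cls in classes
    }


requests = [
    ("Zoo::Keeper", "Mammal", "open"),
    ("Zoo::Breeder", "Mammal", "base"),
    ("Bug::Spray", "Insect", "open"),
]
classes = ["Mammal", "Insect", "Rock"]

closed_app = decide_class_policies(classes, True, True, requests)
open_app = decide_class_policies(classes, False, False, requests)
print(closed_app)
print(open_app)
```

In the open framework every request is a no-op, which is why a module can list its needs without knowing what kind of application will load it.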
It has been speculated that optimizer technology in Parrot will develop
such that a class can conjecturally be compiled as closed, and then
recompiled as open should the need arise. (This is just a specific
case of the more general problem of what you do whenever the
assumptions of the optimizer are violated.) If we get such an on-the-
fly optimizer/pessimizer, then our open class declarations are still
not wasted--they will tell the optimizer which classes not to bother
trying to close or finalize in the first place. Setting the default
the other way wouldn't have the same benefit.
Syntax? You want syntax? Hmm.
use classes :closed :open«Mammal Insect»;
[Update: Now more like ""use opt :classes(:close :finalize);"", since
it's direct instruction to the optimizer. It doesn't directly change
the meaning of the "class" keyword, so it shouldn't use "class" as the
pragma name.]
Or some such. Maybe certain kinds of class reference automatically
request the class to be open without a special pragma. A module could
request open classes without attempting to close everything with just:
use classes :open«Mammal Insect»;
[Update: Just ""class"" to match the keyword.]
On the other hand, maybe that's another one of those inside-out
interfaces, and it should just be options on the classes whose
declarations you have to include anyway:
use classes :closed;
class Mammal is open {...}
class Insect is open {...}
Similarly, we can finalize classes by default and then "take it back"
for certain classes:
use classes :final;
class Mammal is base {...}
class Insect is base {...}
In any event, even though the default is expressed at the top of the
main application, the final decision on each class is not made by the
compiler until "CHECK" time, when all the compiled code has had a
chance to stake its claims. (A JIT compiler might well wait even
longer, in case run-time evaluated code wishes to express an opinion.)
Interface Consistency
In theory, a subclass should always act as a more specialized version
of a superclass. In terms of design-by-contract theory, a subclass
should OR in its preconditions and AND in its postconditions. In terms
of Liskov substitutability, you should always be able to substitute a
derived class object in where a base class object is expected, and not
have it blow up. In terms of Internet policy, a derived class
(compared to its base class) should be at least as lenient in what it
accepts, and at least as strict in what it emits.
So, while it would be lovely in a way to require that derived methods
of the same name as a base method must use the same signature, in
practice that doesn't work out. A derived class often has to be able
to add arguments to the signature of a method so that it can "be more
lenient" in what it accepts as input.
But this poses a problem, insofar as the user of the derived object
does not know whether all the methods of a given name support the same
interface. Under "SUPER" semantics, one can at least assume that the
derived class will "weed out" any arguments that would be detrimental
to its superclass. But as we have already pointed out, there isn't a
single superclass under MI, and each superclass might need to have
different "detrimental arguments" weeded out. One could say that in
that case, you don't call "SUPER" but rather call out to each
superclass explicitly. But then you're back to the problem that
"SUPER" was designed to solve. And you haven't solved "SUPER"'s
problem either.
Under "NEXT" semantics, we assume that we are dispatching to a set of
methods with the same name, but potentially different signatures.
(Perl 6's "SUPER" implementation is really a limited form of "NEXT",
insofar as "SUPER" indicates a set of parent methods, unlike in Perl 5
where it picks one.) We need a way of satisfying different signatures
with the same set of arguments.
There are, in fact, two ways to approach this. One way is to say, okay
everything is a multimethod, and we just won't call anything whose
signature is irreconcilably inconsistent with the arguments presented.
Plus there are varying degrees of consistency within the set of
"consistent" interfaces, so we try them in decreasing order of
consistency. A more consistent multi is allowed to fall back to a less
consistent multi with ""next METHOD"".
But as a variant of the "pick one" mentality, that still doesn't help
the situation where you want to send a message to all your ancestor
classes (like "Please Mr. Base Class, help me initialize this
object."), but you want to be more specific with some classes than
others ("Please Miss Derived Class, set your "$.prim" attribute to
1."). So the other approach is to use named arguments that can be
ignored by any classes that don't grok the argument.
So what this essentially comes down to is the fact that all methods and
submethods of classes that might be derived from (which is essentially
all classes, but see the previous section) must have a "*%" parameter,
either explicitly or implicitly, to collect up and render harmless any
unrecognized option pairs in the argument list. So the ruling is that
all methods and submethods that do not declare an explicit "*%"
parameter will get an implicit *%_ parameter declared for them whether
they like it or not. (Subroutines are not granted this "favor".)
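Python programmers may recognize this as the cooperative "**kwargs" idiom. The sketch below uses made-up class names and makes the catch-all parameter explicit, where Perl 6 would supply *%_ implicitly. It also differs in that each Python method forwards the leftovers by hand rather than having the dispatcher present the same arguments to every class. Still, it shows how option pairs that a class doesn't grok are rendered harmless:

```python
class Base:
    def __init__(self, **_):
        # The catch-all swallows whatever nobody else claimed.
        self.ready = True


class Striped(Base):
    def __init__(self, stripes=0, **rest):
        self.stripes = stripes
        super().__init__(**rest)


class Primed(Base):
    def __init__(self, prim=0, **rest):
        self.prim = prim
        super().__init__(**rest)


class Zebra(Striped, Primed):
    def __init__(self, name="", **rest):
        self.name = name
        super().__init__(**rest)


# Each class picks out the named arguments it understands.  The
# unrecognized "tpyo" is absorbed without complaint.
z = Zebra(name="Marty", stripes=42, prim=1, tpyo=True)
print(z.name, z.stripes, z.prim, z.ready)
```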
It might be objected that this will slow down the parameter binding
algorithm for all methods favored with an implicit *%_, but I would
argue that the binding code doesn't have to do anything till it sees a
named parameter it doesn't recognize, and then it can figure out
whether the method even references %_, and if not, simply throw the
unrecognized argument away instead of constructing a %_ that won't be
used. And most of this "figuring out" can be done at compile time.
Another counterargument is that this prevents a class from recognizing
typos in argument names. That's true. It might be possible to ask for
a warning that checks globally (at class-finalization time in the
optimizer?) to see if there is any method of that name anywhere that is
interested in a parameter of that name. But any class that gets its
parameters out of a "*%" hash at run time would cause false positives,
unless we assume that any "*%" hash makes any argument name legal, in
which case we're pretty much back to where we started, unless we do
analysis of the usage of all "*%" hash in those methods, and count
things like %_«prim» as proper parameter declarations. And that can
still be spoofed in any number of ways. Plus it's not a trivial
warning to calculate, so it probably wouldn't be the default in a load-
and-go interpreter.
So I think we basically have to live with possible typos to get proper
polymorphic dispatch. If something is frequently misspelled, then you
could always put in an explicit test against %_ for that argument:
warn "Didn't you mean :the(%_«teh»)?" if %_«teh»;
And perhaps we could have a pragma:
use signatures :exact;
But it's possible that the correct solution is to differentiate two
kinds of "isa", one that derives from "nextish" classes, and one that
derives from "superish" classes. A ""next METHOD"" traversal would
assume that any delegation to a super class would be handled explicitly
by the current class's methods. That is, a "superish" inheritance
hides the base class from ".*" and ".+", as well as ""next METHOD"".
On the other hand, if we marked the super class itself, we could
refrain from generating "*%" parameters for its methods. Any "next"
dispatcher would then have to "look ahead" to see if the next class was
a "superish" class, and bypass it. I haven't a clue what the syntax
should be though. We could mark the class with a "superish" trait,
which wouldn't be inheritable. Or we could mark it with a Superish
role, which would be inheritable, and a base class would have to
override it to impose a Nextish role instead. (But then what if one
parent class is Superish and one is Nextish?) Or we could even have
two different metaclasses, if we decide the two kinds of classes are
fundamentally different beasts. In that case we'd declare them
differently using "class" and some other keyword. Of course, people
will want to use "class" for the type they prefer, and the other
keyword for the type they don't prefer. :-)
But since we're attempting to bias things in favor of nextish
semantics, that would be a "class", and the superish semantics might be
a "guthlophikralique" or some such. ":-)"
Seriously, if we mark the class, ""is hidden"" can hide the current
class from ""next METHOD"" semantics. The problem with that is, how do
you apply the trait to a class in a different language? That argues
for marking the "isa" instead. So as usual when we can't make up our
minds, we'll just have it both ways. To mark the class itself, use
""is hidden"". To mark the "isa", use ""hides Base"" instead of ""is
Base"". In neither case will ""next METHOD"" traverse to such a class.
(And no *%_ will be autogenerated.)
For example, here are two base classes that know about ""next METHOD"":
class Nextish1 { method dostuff() {...; next;} }
class Nextish2 { method dostuff() {...; next;} }
class MyClass is Nextish1 is Nextish2 {
    method dostuff () {...; next;}
}
Since all the base classes are "next-aware", "MyClass" knows it can
just defer to "next" and both parent classes' "dostuff" methods will be
called. However, suppose one of our base classes is old-fashioned and
thinks it should call things with "SUPER::" instead. (Or it's a class
off in Python or Ruby.) Then we have to write our classes more like
this:
class Superish { method dostuff() {...; .*SUPER::dostuff();} }
class Nextish { method dostuff() {...; next;} }
class MyClass hides Superish is Nextish {
method dostuff () {
.Superish::dostuff(); # do Superish::dostuff()
next; # do Nextish::dostuff()
}
}
Here, "MyClass" knows that it has two very different base classes.
"Nextish" knows about ""next"", and "Superish" doesn't. So it
delegates to "Superish::dostuff()" differently than it delegates to
"Nextish::dostuff()". The fact that it declared ""hides Superish""
prevents "next" from visiting the Superish class.
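The traversal rule can be modelled outside Perl 6. The following
Python sketch is only an analogy, not a proposed implementation: the
"hides" and "is_hidden" class attributes and the "next_chain" helper
are hypothetical names invented here, and Python's own method
resolution order stands in for whatever order Perl 6 ends up using. It
computes which base classes a "next METHOD" call would visit.

```python
def next_chain(cls, method):
    """Return the base classes a `next` call from cls.method would visit."""
    mro = cls.__mro__
    # Collect every class named in a `hides` declaration along the chain.
    hidden = set()
    for c in mro:
        hidden.update(vars(c).get("hides", ()))
    return [
        c for c in mro[1:]
        if method in vars(c)                     # class defines the method itself
        and c not in hidden                      # not hidden by a `hides Base`
        and not vars(c).get("is_hidden", False)  # not marked `is hidden` itself
    ]


class Superish:
    def dostuff(self): ...


class Nextish:
    def dostuff(self): ...


class MyClass(Superish, Nextish):
    hides = (Superish,)   # models: class MyClass hides Superish is Nextish

    def dostuff(self): ...


class AllNext(Superish, Nextish):
    def dostuff(self): ...
```

Under these assumptions, "next_chain(MyClass, "dostuff")" visits only
"Nextish", while the same call on "AllNext" visits both parents.
Because "is_hidden" is read from the class's own dictionary, it is not
inherited, which matches the non-inheritable trait discussed above.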
Collections of Classes
In Classes
We'd like to be able to support virtual inner classes. You can't have
virtual inner classes unless you have a way to dispatch to the actual
class of the invocant. That says to me that the solution is bound up
intimately with the method dispatcher, and the syntax of naming an
inner class has to know about the invocant in whose context we have to
start searching for the inner class. So we could have an explicit
syntax like:
class Base {
our class Inner { ... }
has Inner $inner;
submethod BUILD { .makeinner; }
method makeinner ($self:){
my Inner $thing .= $self.Inner.new();
return $thing;
}
}
class Middle is Base {
our class Inner is Base::Inner { ... }
}
class Derived is Middle {
}
When you say "Derived.new()", it creates a "Derived" object, calls
"Derived::BUILDALL", which eventually calls "Base::BUILD", which makes
a "Middle::Inner" object (because that's what the virtual method
"$self.Inner" returns) and puts it in a variable of the "Base::Inner"
type (which is fine, since "Middle::Inner" ISA "Base::Inner"). Whew!
The only extra magic here is that an inner class would have to
autogenerate an accessor method (of the same name) that returns the
class. A class could then choose to access an inner class name
directly, in which case it would get its own inner class of that name,
much like "$.foo" always gets you your own attribute. But if you
called the inner class name as a method, it would automatically
virtualize the name, and you'd get the most derived existing version of
the class.
This would give us most of what RFC 254 is asking for, at the expense
of one more autogenerated method. Use of such inner classes would take
the connivance of a base class that doesn't mind if derived classes
redefine its inner class. Unfortunately, it would have to express that
approval by calling "$self.Inner" explicitly. So this solution does
not go as far as letting you change classes that didn't expect to be
changed.
It would be possible to take it further, and I think we should. If we
say that whenever you use any global class, it makes an inner class on
your behalf that is merely an alias to the global class, creating the
accessor method as if it were an inner class, then it's possible to
virtualize the name of any class, as long as you're in a context that
has an appropriate invocant. Then we'd make any class name lookup
assume "$self." on the front, basically.
This may seem like a wild idea, but interestingly, we're already
proposing to do a similar aliasing in order to have multiple versions
of a module running simultaneously. In the case of classes, it seems
perfectly natural that a new version might derive from an older version
rather than redefining everything.
The one fly in the ointment that I can see is that we might not always
have an appropriate invocant--for instance, outside any method body,
when we're declaring attributes. I guess when there's no dynamic
context indicating what an "inner" classname should mean, it should
default to the ordinary meaning in the current lexical and/or package
context. Within a class definition, for instance, the invocant is the
metaclass, which is unhelpful. So generally that means that a declared
attribute type will turn out to be a superclass of the actual attribute
type at run time. But that's fine, ain't it? You can always store a
"Beagle" in a "Dog" attribute.
So in essence, it boils down to this. Within a method, the invocant is
allowed to have opinions about the meanings of any class names and,
when there are multiple possible meanings, to pick the most appropriate
one, where that amounts to the name you'd find if the class name were a
virtual method name.
Here's the example from RFC 254, translated to Perl 6 (with "Frog" made
into an explicit inner class for clarity (though it should work with
any class by the aliasing rule above)):
class Forest {
our class Frog {
method speak () { say "ribbit ribbit"; }
method jump () {...}
method croak () {...} # ;-)
}
has Frog $.frog;
method new ($class) {
my Frog $frog .= new; # MAGIC
return $class.bless( frog => $frog );
}
method make_noise {
.frog.speak; # prints "ribbit ribbit"
}
}
Now we derive from "Forest", producing "Forest::Japanese", with its own
kind of frogs:
class Forest::Japanese is Forest {
our class Frog is Forest::Frog {
method speak () { say "kerokero"; }
}
}
And finally, we make a forest of that type, and tell it to make a
noise:
$forest = new Forest::Japanese;
$forest.make_noise(); # prints "kerokero"
In the Perl 5 equivalent, that would have printed "ribbit ribbit"
instead. How did it do the right thing in Perl 6?
The difference is on the line marked ""MAGIC"". Because "Frog" was
mentioned in a method, and the invocant was of type "Forest::Japanese"
rather than of type "Forest", the word ""Frog"" figured out that it was
supposed to mean a "Forest::Japanese::Frog" rather than a
"Forest::Frog". The name was "virtual". So we ended up creating a
forest with a frog of the appropriate type, even though it might not
have occurred to the writer of "Forest" that a subclass would override
the meaning of "Frog".
So one object can think that its "Frog" is Japanese, while another
thinks it's Russian, or Mexican, or even Antarctican (if you can find
any forests there). Base methods that talk about "Frog" will
automatically find the "Frog" appropriate to the current invocant.
This works even if "Frog" is an outer class rather than an inner class,
because any outer class referenced by a base class is automatically
aliased into the class as a fake inner class. And the derived class
doesn't have to redefine its "Frog" by declaring an inner class either.
It can just alias (or use) a different outer "Frog" class in as its
fake inner class. Or even a different version of the same "Frog"
class, if there are multiple versions of it in the library.
And it just works.
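For comparison, here is a rough Python analogue of the frog example.
It is a sketch only: Python has no implicit virtual class names, so the
base class must look the name up through the instance ("self.Frog"),
which corresponds to the explicit "$self.Inner" form described earlier
rather than to the bare-name magic proposed here. The class name
"JapaneseForest" is just a Python-friendly spelling of
"Forest::Japanese".

```python
class Forest:
    class Frog:
        def speak(self):
            return "ribbit ribbit"

    def __init__(self):
        # Look the class name up through the instance, so that a
        # subclass's own Frog (if any) is the one that gets built.
        self.frog = self.Frog()

    def make_noise(self):
        return self.frog.speak()


class JapaneseForest(Forest):
    class Frog(Forest.Frog):
        def speak(self):
            return "kerokero"
```

Writing "Forest.Frog()" in the constructor instead would always build
the base frog, which is the Perl 5 behaviour the text contrasts with.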
In Modules
It's also possible to put a collection of classes into a module, but
that doesn't buy you much except the ability to pull them all in with
one "use", and manage them all with one version number. Which has a
lot to be said for it--in the next section.
Versioning
Way back at the beginning, we claimed that a file-scoped class
declaration:
class Dog;
...
is equivalent to the corresponding block-scoped declaration:
class Dog {...}
While that's true, it isn't the whole truth. A file-scoped class (or
module, or package) is the carrier of more metadata than a block-scoped
declaration. Perl 6 supports a notion of versions that is file based.
But even a class name plus a version is not sufficient to name a
module--there also has to be a naming authority, which could be a URI
or a CPAN id. This will be discussed more fully in Apocalypse 11, but
for now we can make some predictions.
The extra metadata has to be associated with the file somehow. It may
be implicit in the filename, or in the directory path leading to the
file. If so, then Perl 6 has to collect up this information as modules
are loaded and associate it with the top level class or module as a set
of properties.
It's also possible that a module could declare properties explicitly to
define these and other bits of metadata:
author       http://www.some.com/~jrandom
version      1.2.1
creator      Joe Random
description  This class implements camera obscura.
subject      optics, boxes
language     ja_JP
licensed     Artistic|GPL
Modules posted to CPAN or entered into any standard Perl 6 library are
required to declare some set of these properties so that installations
can know where to keep them, such that multiple versions by different
authors can coexist, all of them available to any installed version of
Perl. (This is a requirement for any Perl 6 installation. We're tired
of having to reinstall half of CPAN every time we patch Perl. We also
want to be able to run different versions of the "Frog" module
simultaneously when the "Frog" requirements of the modules we use are
contradictory.)
It's possible that the metadata is supplied by both the declarations
and by the file's name or location in the library, but if so, it's a
fatal error to use a module for which those two sources contradict each
other as to author or version. (In theory, it could also be a fatal
error to use modules with incompatible licensing, but a kind warning
might be more appreciated.) Likely there will also be some kind of
automatic checksumming going on to prevent fraudulent distributions of
code.
It might simplify things if we make an "identifier" metadatum that
incorporates all of naming authority, package name, and version. But
the individual parts still have to be accessible, if only as components
of "identifier". However we structure it, we should make the
"identifier" the actual declared full name of the class, yet another
one of those "long names" that include extra parameters.
Version Declarations
The syntax of a versioned class declaration looks like this:
class Dog-1.2.1-cpan:JRANDOM;
class Dog-1.2.1-http://www.some.com/~jrandom;
class Dog-1.2.1-mailto:jrandom@some.com;
Perhaps those could also have short forms, presuming we can distinguish
CPAN ids, web pages, and email addresses by their internal forms.
class Dog-1.2.1-JRANDOM;
class Dog-1.2.1-www.some.com/~jrandom;
class Dog-1.2.1-jrandom@some.com;
Or maybe using email addresses is a bad idea now in the modern Spam
Age. Or maybe Spam Ages should be plural, like the Dark Ages...
In any event, such a declaration automatically aliases the full name of
the class (or module) to the short name. So for the rest of the scope,
"Dog" refers to the longer name.
(Though if you refer to "Dog" within a method, it's considered a
virtual class name, so Perl will search any derived classes for a
redefined inner "Dog" class (or alias) before it settles on the least-
derived aliased "Dog" class.)
We lied slightly when we said earlier that only the file-scoped class
carries extra metadata. In fact, all of the classes (or modules, or
packages) defined within your file carry metadata, but it so happens
that the version and author of all your extra classes (or modules, or
packages) are forced to be the same as the file's version and author.
This happens automatically, and you may not override the generation of
these long names, because if you did, different file versions could and
would have version collisions of their interior components, and that
would be catastrophic. In general you can ignore this, however, since
the long names of your extra classes are always automatically aliased
back down to the short names you thought you gave them in the first
place. The extra bookkeeping is in there only so that Perl can keep
your classes straight when multiple versions are running at the same
time. Just don't be surprised when you ask for the name of the class
and it tells you more than you expected.
Use of Version and Author Wildcards
Since these long names are the actual names of the classes, when you
say:
use Dog;
you're really asking for something like:
use Dog-(Any)-(Any);
And when you say:
use Dog-1.2.1;
you're really asking for:
use Dog-1.2.1-(Any);
Note that the 1.2.1 specifies an exact match on the version number.
You might think that it should specify a minimum version. However,
people who want stable software will specify an exact version and stick
with it. They don't want 1.2.1 to mean a minimum version. They know
1.2.1 works, so they want that version nailed down forever--at least
for now.
To match more than one version, put a range operator in parens:
use Dog-(1.2.1..1.2.3);
use Dog-(1.2.1..^1.3);
use Dog-(1.2.1...);
What goes inside the parens is in fact any valid smartmatch selector:
use Dog-(1.2.1 | 1.3.4)-(/:i jrandom/);
use Dog-(Any)-(/^cpan\:/);
And in fact they could be closures too. These mean the same thing:
use Dog-{$^ver ~~ 1.2.1 | 1.3.4}-{$^auth ~~ /:i jrandom/};
use Dog-{$^ver ~~ Any}-{$^auth ~~ /^cpan\:/};
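The selection rule can be sketched in Python as a filter over installed
modules. Everything named here is an assumption made for illustration:
the "ANY" sentinel, the "smartmatch" and "candidates" helpers, the
tuple encoding of versions, and the sample "INSTALLED" list are not
part of any Perl 6 design. A set stands in for a junction such as
"1.2.1 | 1.3.4", and a function stands in for a closure or a range.

```python
import re

ANY = object()  # stands in for the (Any) wildcard


def smartmatch(value, selector):
    """Very rough analogue of `value ~~ selector`, for this sketch only."""
    if selector is ANY:
        return True
    if hasattr(selector, "search"):              # compiled regex
        return selector.search(value) is not None
    if callable(selector):                       # closure or range test
        return bool(selector(value))
    if isinstance(selector, (set, frozenset)):   # junction of alternatives
        return value in selector
    return value == selector                     # exact match


def candidates(installed, name, version=ANY, authority=ANY):
    """Return the (name, version, authority) triples that satisfy a `use`."""
    return [
        mod for mod in installed
        if mod[0] == name
        and smartmatch(mod[1], version)
        and smartmatch(mod[2], authority)
    ]


INSTALLED = [
    ("Dog", (1, 2, 1), "cpan:JRANDOM"),
    ("Dog", (1, 3, 4), "cpan:JRANDOM"),
    ("Dog", (1, 2, 2), "http://www.some.com/~jrandom"),
    ("Cat", (1, 2, 1), "cpan:JRANDOM"),
]
```

With this model, "candidates(INSTALLED, "Dog", (1, 2, 1))" is the exact
match of "use Dog-1.2.1", and passing
"lambda v: (1, 2, 1) <= v < (1, 3)" as the version plays the role of
"use Dog-(1.2.1..^1.3)".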
In any event, however you select the module, its full name is
automatically aliased to the short name for the rest of your lexical
scope. So you can just say
my Dog $spot .= new("woof");
and it knows (even if you don't) that you mean
my Dog-1.3.4-cpan:JRANDOM $spot .= new("woof");
(Again, if you refer to "Dog" within a method, it's a virtual class
name, so Perl will search any derived classes for a redefined "Dog"
class before it settles on the outermost aliased "Dog" class.)
[Update: you can also prefix the module name with the language to
borrow it from, like ""use perl5:DBI;"".]
Introspection
It's easy to specify what Perl 6 will provide for introspection: the
union of what Perl 6 needs and whatever Parrot provides for other
languages. ";-)"
In the particular case of class metadata, the interface should
generally be via the class's metaclass instance--the object of type
"MetaClass" that was in charge of building the class in the first
place. The metamethods are in the metaobject, not in the class object.
(Well, actually, those are the same object, but a class object ignores
the fact that it's also a metaobject, and dispatches by default to its
own methods, not the ones defined by the metaclass.)
[Update: It's not true that the class object is the same as the
metaobject.]
To get to the metamethods of an ordinary class object you have to use
the ".meta" method:
MyClass.getmethods() # call MyClass's .getmethods method
MyClass.meta.getmethods() # get the method list of MyClass
Unless "MyClass" has defined or inherited a ".getmethods" method, the
first call is an error. The second is guaranteed to work for Perl 6's
standard "MetaClass" objects. You can also call ".meta" on any
ordinary object:
$obj.meta.getmethods();
That's equivalent to
$obj.dispatcher.meta.getmethods();
[Update: There's no ".dispatcher" since all objects have a ".meta" now,
and all objects of a class including class objects are of the same
type.]
As for which parts of a class are considered metadata--they all are, if
you scratch hard enough. Everything that is not stored directly as a
trait or property really ought to have some kind of trait-like method
to access it. Even the method body closures have to be accessible as
traits, since the ".wrap" method needs to have something to put its
wrapper around.
Minimally, we'll have user-specified class traits that look like this:
identifier   Dog-1.2.1-http://www.some.com/~jrandom
name         Dog
version      1.2.1
authority    http://www.some.com/~jrandom
author       Joe Random
description  This class implements camera obscura.
subject      optics, boxes
language     ja_JP
licensed     Artistic|GPL
And there may be internal traits like these:
isa       list of parent classes
roles     list of roles
disambig  how to deal with ambiguous method names from roles
layout    P6opaque, P6hash, P5hash, P5array, PyDict, Cstruct, etc.
The "layout" determines whether one class can actually derive from
another or has to fake it. Any P6opaque class can compatibly inherit
from any other P6opaque class, but if it inherits from any P5 class, it
must use some form of delegation to another invocant. (Hopefully with
a smart enough invocant reference that, if the delegated object
unknowingly calls back into our layout system, we can recover the
original object reference and maintain some kind of compositional
integrity.)
The metaclass's ".getmethods" method returns method-descriptor objects
with at least the following properties:
name       the name of the method
signature  the parameters of the method
returns    the return type of the method
multi      whether duplicate names are allowed
do         the method body
The ".getmethods" method has a selector parameter that lets you specify
whether you want to see a flattened or hierarchical view, whether
you're interested in private methods, and so forth. If you want a
hierarchical view, you only get the methods actually defined in the
class proper. To get at the others, you follow the "isa" trait to find
your parent classes' methods, and you follow the "roles" trait to get
to role methods, and from parents or roles you may also find links to
further parents or roles.
The ".getattributes" method returns a list of attribute descriptors
that have traits like these:
name
type
scope
rw
private
accessor
build
Additionally they can have any other variable traits that can
reasonably be applied to object attributes, such as "constant".
Strictly speaking, metamethods like ".isa()", ".does()", and ".can()"
should be called through the meta object:
$obj.meta.can("bark")
$obj.meta.does(Dog)
$obj.meta.isa(Mammal)
And they can always be called that way. For convenience you can often
omit the ".meta" call because the base "Object" type translates any
unrecognized ".foo()" into ".meta.foo()" if the meta class has a method
of that name. But if a derived class overrides such a metamethod, you
have to go through the ".meta" call explicitly to get the original
call.
In previous Apocalypses we said that:
$obj ~~ Dog
calls:
$obj.isa(Dog)
That is no longer the case--you're actually calling:
$obj.meta.does(Dog)
which is true if $obj either "does" or "isa" "Dog" (or "isa" something
that "does" "Dog"). That is, it asks if $obj is likely to satisfy the
interface that comes from the "Dog" role or class. The ".isa" method,
by contrast, is strictly asking if $obj inherits from the "Dog" class.
It's erroneous to call it on a role. Well, okay, it's not strictly
erroneous. It will just never return true. The optimizer will love
you, and remove half your code.
Note that either of ".does" or ".isa" can lie, insofar as you might
include an interface that you later override parts of. When in doubt,
rely on ".can" instead. Better yet, rely on your dispatcher to pick
the right method without trying to second guess it. (And then be
prepared to catch the exception if the dispatcher throws up its hands
in disgust...)
By the way, unlike in Perl 5 where ".can" returns a single routine
reference, Perl 6's version of ".meta.can" returns a "WALK" iterator
for a set of routines that match the name. When dereferenced, the
iterator gets fed to a dispatcher as if the method had been called in
the first place. Note that any wildcard methods (via delegation or
"AUTOLOAD") are included by default in this list of potential handlers,
so there is no reason for subclasses to have to redefine ".can" to
reflect the new names. This does potentially weaken the meaning of
".can" from "definitely has a method of this name" to "definitely has
one or more methods in one or more classes that will try to handle
this." But that's probably closer to what you want, and the best we
can do when people start fooling around with wildcard methods under MI.
However, that being said, many classes may wish to dynamically specify
at the last moment which methods they can or cannot handle. That is,
they want a hook to allow a class to declare names even while the
".can" candidate list is being built. By default ".meta.can" includes
all wildcard delegations and autoloads at the end of the list.
However, it will exclude from the list of candidates any class that
defines its own "AUTOMETH" method, on the assumption that each such
"AUTOMETH" method has already had its chance to add any callable names
to the list. If the class's "AUTOMETH" wishes to supply a method, it
should return a reference to that method.
Do not confuse "AUTOMETH" with "AUTOMETHDEF". The former is equivalent
to declaring a stub declaration. The latter is equivalent to supplying
a body for an existing stub. Whether "AUTOMETH" actually creates a
stub, or "AUTOMETHDEF" actually creates a body, is entirely up to those
routines. If they wish to cache their results, of course, then they
should create the stub or body.
There are corresponding "AUTOSUB" and "AUTOSUBDEF" hooks. And
"AUTOVAR" and "AUTOVARDEF" hooks. These all pretty much make
"AUTOLOAD" obsolete. But "AUTOLOAD" is still there for old times'
sake.
Other Non-OO Decisions
A lot of time went by while I was in the hospital last year, so we
ended up polishing up the design of Perl 6 in a number of areas not
directly related to OO. Since I've already got your attention (and
we're already 90% of the way through this Apocalypse), I might as well
list these decisions here.
Exportation
The trait we'll use for exportation (typically from modules but also
from classes pretending to be modules) is "export":
# Tagset...
sub foo is export(:DEFAULT) {...} # :DEFAULT, :ALL
sub bar is export(:DEFAULT :others) {...} # :DEFAULT, :ALL, :others
sub baz is export(:MANDATORY) {...} # (always exported)
sub bop is export {...} # :ALL
sub qux is export(:others) {...} # :ALL, :others
Compared to Perl 5, we've basically made it easier to mark something as
exportable, but more difficult to export something by default. You no
longer have to declare your tagsets separately, since ":foo" parameters
are self-declaring, and the module will automatically build the tagsets
for you from the export trait arguments.
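How the tagsets might be built from the trait arguments can be
sketched in Python. This is an illustration under stated assumptions,
not the module loader's actual algorithm: "build_tagsets", "imported",
and the "EXPORTS" table are hypothetical names, and keeping
":MANDATORY" subs out of ":ALL" (because they are exported regardless)
is a guess at bookkeeping the text does not spell out.

```python
def build_tagsets(exports):
    """Map each tag to the set of sub names that declared it."""
    tagsets = {"ALL": set(), "MANDATORY": set()}
    for name, tags in exports.items():
        if "MANDATORY" in tags:
            tagsets["MANDATORY"].add(name)
            continue
        tagsets["ALL"].add(name)          # every exportable sub joins :ALL
        for tag in tags:                  # tags are self-declaring
            tagsets.setdefault(tag, set()).add(name)
    return tagsets


def imported(tagsets, tag):
    """Names an importer would get for one tag, plus the mandatory ones."""
    return tagsets.get(tag, set()) | tagsets["MANDATORY"]


# The five declarations from the example above, as name -> trait arguments.
EXPORTS = {
    "foo": ["DEFAULT"],
    "bar": ["DEFAULT", "others"],
    "baz": ["MANDATORY"],
    "bop": [],
    "qux": ["others"],
}
```

Note that no tagset is declared up front apart from the two built-in
ones; ":others" comes into existence simply because "bar" and "qux"
mention it.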
The gather/take Construct
We used one example of the conjectural gather/take construct. A gather
executes a closure, returning a list of all the values returned by
"take" within its lexical scope. In a lazy context it might run as a
coroutine. There probably ought to be a dynamically scoped variant.
Unless it should be dynamic by default, in which case there probably
ought to be a lexically scoped variant...
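A loose Python analogy may help picture the construct. In this sketch,
which is not a specification of gather/take, the "gather" helper hands
a "take" function to the block explicitly (roughly the lexically scoped
flavour), and a generator shows the lazy, coroutine-like flavour, with
each "yield" playing the part of a "take". The function names are
invented for the example.

```python
from itertools import count, islice


def gather(block):
    """Run block eagerly, collecting every value passed to take()."""
    taken = []
    block(taken.append)   # `take` is handed to the block explicitly
    return taken


def squares_of_evens(take):
    for n in range(6):
        if n % 2 == 0:
            take(n * n)


def lazy_evens():
    """Generator analogue: values are produced only when asked for."""
    for n in count():
        if n % 2 == 0:
            yield n
```

Here "gather(squares_of_evens)" collects its values immediately, while
"islice(lazy_evens(), 3)" pulls just three values out of an otherwise
endless sequence.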
:foo() Adverbs
There's a new pair syntax that is more conducive to use as option
arguments. This syntax is reminiscent of both the Unix command line
syntax and the I/O layers syntax of Perl 5. But unlike Unix command-
line options, we use colon to introduce the option rather than the
overly negative minus sign. And unlike Perl 5's layers options, you
can use these outside of a string.
We haven't discarded the old pair syntax. It's still more readable for
certain uses, and it allows the key to be a non-identifier. Plus we
can define the new syntax in terms of it:
Old                         New
---                         ---
foo => $bar                 :foo($bar)
foo => [1,2,3,@many]        :foo[1,2,3,@many]
foo => «alice bob charles»  :foo«alice bob charles»
foo => 'alice'              :foo«alice»
foo => { a => 1, b => 2 }   :foo{ a => 1, b => 2 }
foo => { dostuff() }        :foo{ dostuff() }
foo => 0                    :foo(0)
foo => 1                    :foo
It's that last one that's the real winner for passing boolean options.
One other nice thing is that if you have several options in a row you
don't have to put commas between:
$handle = open $file, :chomp :encoding«guess» :ungzip or die "Oops";
It might be argued that this conflicts with the :foo notation for private
methods. I don't think it's a problem because method names never occur
in isolation.
Oh, one other feature of option pairs is that certain operations can
use them as adverbs. For instance, you often want to tell the range
operator how much to skip on each iteration. That looks like this:
1..100 :by(3)
Note that this only works where an operator is expected rather than a
term. So there's no confusion between:
randomlistop 1..100 :by(3)
and
randomlistop 1..100, :by(3)
In the latter case, the option is being passed to "randomlistop()"
rather than the "infix:.." operator.
[Update: That's "infix:<..>" now.]
Special Quoting of Identifiers Inside Curlies Going Away!
Novice Perl 5 programmers are continually getting trapped by subscripts
that autoquote unexpectedly. So in Perl 6, we'll remove that special
case. %hash{shift} now always calls the shift function, because the
inside of curlies is always an expression. Instead, if you want to
subscript a hash with a constant string, or a slice of constant
strings, use the new French qw//-ish brackets like this:
%hash«alice»              # same as %hash{'alice'}
%hash«alice bob charlie»  # same as %hash{'alice','bob','charlie'}
Note in particular that, since slices in Perl 6 are determined by the
subscript only, not the sigil, this:
%hash«alice» = @x;
evaluates the right side in scalar context, while
%hash«alice bob charlie» = @x;
evaluates the right side in list context. As with all other uses of
the French quotes in Perl 6, you can always use:
%hash<<alice>> = @x;
if you can't figure out how to type "^K < <" or "^K > >" in vim.
On the other hand, if you've got a fully Unicode aware editor, you
could probably write some macros to use the big double angles from
Asian languages:
%hash《alice》 = @x;
But by default we only provide the Latin-1 compatible versions. It
would be easy to overuse Unicode in Perl 6, so we're trying to underuse
Unicode for small values of 6. (Not to be confused with X, or X.)
[Update: We switched to bare angles for this. Double angles now do
shell-like interpolation, so double angles relate to single angles much
like double quotes relate to single quotes.]
Vector Operators Renamed Back to "hyper" Operators
The mathematicians got confused when we started talking about "vector"
operators, so these dimensionally dwimming versions of scalar operators
are now called hyper operators (again). Some folks see operations like
@a »*« @b
as totally useless, and maybe they are--to a mathematician. But to
someone simply trying to calculate a bunch of things in parallel (think
cellular automata, or aerodynamic simulations, for instance), they make
a lot of sense. And don't restrict your thinking to math operators.
How about appending a newline to every string before printing it out:
print @strings »~« "\n";
Of course,
for @strings {say}
is a shorter way to do the same thing. (""say"" is just Perl 6's
version of a printline function.)
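For readers who think better in another language, the basic idea can
be modelled in Python. This is a minimal sketch, assuming only flat
lists and a single scalar-broadcast rule; the "hyper" helper is a
made-up name, and raising an error on mismatched lengths is an
assumption of the sketch rather than anything decided here.

```python
import operator


def hyper(op, left, right):
    """Apply a binary op elementwise; a non-list side is reused as is."""
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if left_is_list and right_is_list:
        if len(left) != len(right):
            raise ValueError("list lengths differ")  # assumption of this sketch
        return [op(a, b) for a, b in zip(left, right)]
    if left_is_list:
        return [op(a, right) for a in left]
    if right_is_list:
        return [op(left, b) for b in right]
    return op(left, right)
```

So "hyper(operator.mul, [1, 2, 3], [4, 5, 6])" multiplies pairwise, and
"hyper(operator.add, strings, "\n")" appends a newline to every string
in a list, as in the printing example above.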
Unary Hyper Operators Now Use One Quote Rather Than Two
Unary operators read better if they only "hyper" on the side where
there's an actual argument:
@neg = -« @pos;
@indexes = @x »++;
And in particular, I consider a method spec like ".bletch(1,2,3)" to be
a unary postfix operator, and it would be really ugly to say:
@objects».bletch(1,2,3)«
So that's just:
@objects».bletch(1,2,3)
In general, binary operators still take "hypers" on both sides,
indicating that both sides participate in the dwimmery.
@a »+« @a
To indicate that one side or the other should be evaluated as a scalar
before participating in the hyperoperator, you can always put in a
context specifier:
@a »+« +@a
"$thumb.twiddle" No Longer Requires Parens When Interpolated
In Apocalypse 2 we said that any method interpolated into a double-
quoted string has to have parentheses. We're throwing out that special
rule in the interests of consistency. Now if you want to interpolate a
variable followed by an "accidental" dot, use one of these:
$($var).twiddle
$var\.twiddle
Yes, that will make it a little harder to translate Perl 5 to Perl 6.
(Parentheses are still required if there are any arguments, however.)
[Update: We're back to requiring parens on methods. In fact, we've
gone the other way--we now require square brackets on arrays and
curlies on hashes. And a bare closure also interpolates. The only
interpolator that doesn't require some kind of bracketing terminator is
a simple scalar. See S2.]
The =:= Identity Operator
There is a new "=:=" identity operator, which tests to see if two
objects are the same object. The association with the ":=" binding
operator should be obvious. (Some classes such as integers may
consider all objects of the same value to be a single object, in a
Platonic sense.)
Hmm? No, there is no associated assignment operator. And if there
were, I wouldn't tell you about it. Sheesh, some people...
But there is, of course, a hyper version:
@a »=:=« @b
New Grammatical Categories
The current set of grammatical categories for operator names is:
Category                            Example of use
--------                            --------------
coerce:as                           123 as BigInt, BigInt(123)
self:sort                           @array.=sort
term:...                            $x = {...}
prefix:+                            +$x
infix:+                             $x + $y
postfix:++                          $x++
circumfix:[]                        [ @x ]
postcircumfix:[]                    $x[$y] or $x .[$y]
rule_modifier:p5                    m:p5//
trait_verb:handles                  has $.tail handles «wag»
trait_auxiliary:shall               my $x shall conform«TR123»
scope_declarator:has                has $.x;
statement_control:if                if $condition {...} else {...}
infix_postfix_meta_operator:=       $x += 2;
postfix_prefix_meta_operator:»      @array »++
prefix_postfix_meta_operator:«      -« @magnitudes
infix_circumfix_meta_operator:»«    @a »+« @b
Now, you may be thinking that some of these have long, unwieldy names.
You'd be right. The longer the name, the longer you should think
before adding a new operator of that category. (And the length of time
you should think probably scales exponentially with the length of the
name.)
[Update: The actual operator name must be quoted like a hash subscript:
coerce:<as>                               123 as BigInt, BigInt(123)
self:<sort>                               @array.=sort
term:<...>                                $x = {...}
prefix:<+>                                +$x
infix:<+>                                 $x + $y
postfix:<++>                              $x++
circumfix:<[ ]>                           [ @x ]
postcircumfix:<[ ]>                       $x[$y] or $x .[$y]
rule_modifier:<p5>                        m:p5//
trait_verb:<handles>                      has $.tail handles <wag>
trait_auxiliary:<shall>                   my $x shall conform<TR123>
scope_declarator:<has>                    has $.x;
statement_control:<if>                    if $condition {...} else {...}
statement_modifier:<if>                   ... if $condition
infix_postfix_meta_operator:<=>           $x += 2;
postfix_prefix_meta_operator:{'»'}        @array »++
prefix_postfix_meta_operator:{'«'}        -« @magnitudes
infix_circumfix_meta_operator:{'»','«'}   @a »+« @b
Please note that the "hole" in circumfixes is now specified by slice
notation. There is no longer any special split-down-the-middle rule.]
Assignment to "state" variable declaration now does "first" semantics.
As we talked about earlier, assignment to a "has" variable is really
pseudo-assignment representing a call to the "build" trait. In the
same way, assignment to "state" variables (Perl's version of lexically
scoped "static" variables) is taken as pseudo-assignment representing
a call to the "first" trait. The first time through a piece of code
is when state variables typically like to be initialized. So saying:
state $pc = $startpc;
is equivalent to
state $pc is first( $startpc );
which means that it will pay attention to the $startpc variable only
the first time this block is ever executed. Note that any side effects
within the expression will only happen the first time through. If you
say
state $x = $y++;
then that statement will only ever increment $y once. If that's not
what you want, then use a real assignment as a separate statement:
state $x;
$x = $y++;
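The once-only behavior described above can be modeled outside Perl 6. What follows is a minimal Python sketch, not the actual mechanism: the names make_block and post_increment are hypothetical, a dictionary stands in for the state variable, and next(y) stands in for $y++.

```python
import itertools


def make_block(init_expr):
    """Return a callable modelling a block that contains `state $x = <init_expr>`."""
    state = {}  # persists across calls, like a state variable

    def block():
        if "x" not in state:          # only the first time through
            state["x"] = init_expr()  # side effects of the initializer run once
        return state["x"]

    return block


y = itertools.count(0)  # stands in for $y
calls = []              # records every evaluation of the initializer


def post_increment():
    """Model `$y++`: return the old value and advance the counter."""
    value = next(y)
    calls.append(value)
    return value


block = make_block(post_increment)
results = [block() for _ in range(3)]
```

Calling the block three times evaluates the initializer once, which is the difference between the pseudo-assignment and a real assignment placed in a separate statement.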
The ":=" and ".=" operators also attempt to do what you mean, which in
the case of:
state $x := $y++;
still probably doesn't do what you want. :-)
In general, any "preset" trait is smart about when to apply its value
to the container it's being applied to, such that the value is set
statically if that's possible, and if that's not possible, it is set
dynamically at the "correct" moment.
For ordinary assignment to a "my" variable, that correct moment just
happens to be every time it is executed, so "=" represents ordinary
assignment. If you want to force an initial value at execution time
that was calculated earlier, however, then just use ordinary assignment
to assign the results of a precalculated block:
my @canines = INIT { split slurp "%ENV«HOME»/.canines" };
It's only the "has" and "state" declarators that redefine assignment to
set defaults with traits. (For "has", that's because the actual
attribute variable won't exist until the object is created. For
"state", that's because we want the default to be "first time
through".) But you can use any of the traits on any variable for which
it makes sense. For instance, just because we invented the "first"
initializer for state variables:
state $lexstate is first(0);
doesn't mean you can't use it to initialize any variable only the first
time through a block of code:
my $foo is first(0);
However, it probably doesn't make a lot of sense on a "my" variable,
unless you really want it to be undefined the second time through. It
does make a little more sense on an "our" variable that will hang
onto its value like a state variable:
our $counter is first(0);
An assignment would often be wrong in this case. But generally, the
naive user can simply use assignment, and it will usually do what they
want (if occasionally more often than they want). But it does exactly
what they want on "has" and "state" variables--presuming they are savvy
enough to want what it actually does... :-)
So as with "has" variables, "state" variables can be initialized with
precomputed values:
state $x = BEGIN { calc() }
state $x = CHECK { calc() }
state $x = INIT { calc() }
state $x = FIRST { calc() }
state $x = ENTER { calc() }
which mean something like:
state $x is first( BEGIN { calc() } )
state $x is first( CHECK { calc() } )
state $x is first( INIT { calc() } )
state $x is first( FIRST { calc() } )
state $x is first( ENTER { calc() } )
Note, however, that the last one doesn't in fact make much sense, since
"ENTER" happens more frequently than "FIRST". Come to think of it,
doing "FIRST" inside a "first" doesn't buy you much either...
The length() function is gone
In Perl 6 you're not going to see
my $sizeofstring = length($string);
That's because "length" has been deemed to be an insufficiently
specified concept, because it doesn't specify the units. Instead, if
you want the length of something in characters you use
my $sizeinchars = chars($string);
and if you want the size in elements, you use
my $sizeinelems = elems(@array);
This is more orthogonal in some ways, insofar as you can now ask for
the size in chars of an array, and it will add up all the lengths of
the strings in it for you:
my $sizeinchars = chars(@array);
And if you ask for the number of elems of a scalar, it knows to
dereference it:
my $ref = [1,2,3];
my $sizeinelems = elems($ref);
These are, in fact, just generic object methods:
@array.elems
$string.chars
@array.chars
$ref.elems
And the functional forms are just multimethod calls. (Unless they're
indirect object calls...who knows?)
You can also use "%hash.elems", which returns the number of pairs in
the hash. I don't think "%hash.chars" is terribly useful, but it will
tell you how many characters total there are in the values. (The key
lengths are ignored, just like the integer "keys" of an ordinary
array.)
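The counting rules for elems and chars can be sketched in a few lines. This is a Python model under stated assumptions, not the Perl 6 implementation: the functions elems and chars here are hypothetical stand-ins, and every leaf value is assumed to be a string.

```python
def elems(container):
    """Number of elements in a list, or number of pairs in a dict."""
    return len(container)


def chars(value):
    """Size in characters: a string counts itself, a container sums its values."""
    if isinstance(value, str):
        return len(value)
    if isinstance(value, dict):
        # Only the values are counted; the keys are ignored.
        return sum(chars(v) for v in value.values())
    return sum(chars(v) for v in value)


dogs = ["fido", "rex", "spot"]
owners = {"fido": "larry", "rex": "damian"}

array_elems = elems(dogs)
array_chars = chars(dogs)
hash_elems = elems(owners)
hash_chars = chars(owners)
```

Here the array has 3 elems and 11 chars, and the hash has 2 elems and 11 chars, since only "larry" and "damian" are counted.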
Actually, the meaning of ".chars" varies depending on your current
level of Unicode support. To be more specific, there's also:
$string.bytes
$string.codepoints
$string.graphemes
$string.letters
[Update: Those are shortened to ".codes", ".graphs", and ".langs" now.]
...none of which should be confused with:
$string.columns
or its evil twin:
$string.pixels
Those last two require knowledge of the current font and rendering
engine, in fact. Though ".columns" is likely to be pretty much the
same for most Unicode fonts that restrict themselves to single and
double-wide characters.
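To see why the units matter, here is a small Python sketch that measures one string three ways. The function names are hypothetical, UTF-8 is assumed for the byte count, and the grapheme count is only an approximation that treats every non-combining codepoint as the start of a new grapheme.

```python
import unicodedata


def count_bytes(text):
    """Size in bytes, assuming a UTF-8 encoding."""
    return len(text.encode("utf-8"))


def count_codepoints(text):
    """Size in Unicode codepoints (what Python's len() reports)."""
    return len(text)


def count_graphemes(text):
    """Approximate size in graphemes: skip combining marks."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


word = "cafe\u0301"  # "cafe" followed by a combining acute accent
sizes = (count_bytes(word), count_codepoints(word), count_graphemes(word))
```

The same five-codepoint string is 6 bytes and 4 graphemes, so a bare "length" really is underspecified.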
String positions
A corollary to the preceding is that string positions are not numbers.
If you say either
$pos = index($string, "foo");
or
$string ~~ /foo/; $pos = $string.pos;
then $pos points to that location in that string. If you ask for the
numeric value of $pos, you'll get a number, but which number you get
can vary depending on whether you're currently treating characters as
bytes, codepoints, graphemes, or letters. When you pass a $pos to
"substr($string, $pos, 3)", you'll get back "foo", but not because it
counted over some number of characters. If you use $pos on some other
string, then it has to interpret the value numerically in the current
view of what "character" means. In a boolean context, a position is
true if the position is defined, even if that position would evaluate
to 0 numerically. ("index" and "rindex" return undef when they "run
out".)
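A position that numifies differently depending on the view can be modeled roughly as follows. This is a Python sketch under assumptions of my own: the class StrPos and the function index are hypothetical, the position is stored as a UTF-8 byte offset, and only the byte and codepoint views are shown.

```python
class StrPos:
    """A position inside one particular string, stored as a UTF-8 byte offset."""

    def __init__(self, text, byte_offset):
        self.text = text
        self.byte_offset = byte_offset

    def __bool__(self):
        # A defined position is true even if it numifies to 0.
        return True

    def as_bytes(self):
        return self.byte_offset

    def as_codepoints(self):
        prefix = self.text.encode("utf-8")[: self.byte_offset]
        return len(prefix.decode("utf-8"))


def index(text, needle):
    """Return a StrPos for the first occurrence of needle, or None."""
    found = text.find(needle)  # codepoint offset, or -1
    if found < 0:
        return None            # models undef when the search runs out
    return StrPos(text, len(text[:found].encode("utf-8")))


pos = index("été foo", "foo")
at_start = index("foo", "foo")
missing = index("bar", "foo")
```

The position of "foo" is 6 in the byte view and 4 in the codepoint view, a match at the very start is still true, and a failed search gives back None rather than a false-looking 0.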
And, in fact, when you say "$len = .chars", you're really getting back
the position of the end of the string, which just happens to numerify
to the number of characters in the string in the current view. A
consequence of the preceding rules is that the expression "".chars is
true, but +"".chars is false. So Perl 5 code that says "length($string)" needs
to be translated to "+chars($string)" if used in a boolean context.
Routines like "substr" and "index" take either positions or integers
for arguments. Integers will automatically be turned into positions in
the current view. This may involve traversing the string for variable-
width representations, especially when working with combining
characters as parts of graphemes. Once you're working with abstract
positions, however, they are efficient. So
while $pos = index($string, "fido", $pos + 1) {...}
never has to rescan the string.
The other point of all this is that you can pass $pos or $len to
another module, and it doesn't matter if you're doing offsets in
graphemes and they are doing offsets in codepoints. They get the
correct position by their lights, even though the number of characters
looks different. The main constraint on this is that if you pass a
position from a lower Unicode support level to a higher Unicode support
level, you can end up with a position that is inside what you think of
as a unitary character, whether that's a byte within a codepoint, or a
codepoint within a grapheme or letter. If you deref such a position, an
exception is thrown. But generally high-level routines call into low-
level routines, so the issue shouldn't arise all that often in
practice. However, low-level routines that want to be called from
high-level routines should strive not to return positions inside high-
level characters--the fly in the ointment being that the low-level
routine doesn't necessarily know the Unicode level expected by the
calling routine. But we have a solution for that...
High-level routines that suspect they may have a "partial position" can
call "$pos.snap" (or "$pos.=snap") to round up to the next integral
position in the current view, or (much less commonly) "$pos.snapback"
(or "$pos.=snapback") to round down to the previous integral position in
the current view. This only biases the position rightward or leftward.
It doesn't actually do any repositioning unless we're about to throw an
exception. So this allows the low-level routine to return "$pos.snap"
without knowing at the time how far forward to snap. The actual
snapping is done later when the high-level routine tries to use the
position, and at that point we know which semantics to snap forward
under.
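The rounding itself is easy to picture for the simplest case, a byte offset that lands inside a multi-byte UTF-8 codepoint. The Python sketch below is only an illustration: snap and snapback are hypothetical free functions, and they reposition eagerly, whereas the text above describes a bias that is applied lazily when the position is finally used.

```python
def is_continuation(byte):
    """True for a UTF-8 continuation byte (bit pattern 10xxxxxx)."""
    return (byte & 0xC0) == 0x80


def snap(data, offset):
    """Round a byte offset up to the next codepoint boundary."""
    while offset < len(data) and is_continuation(data[offset]):
        offset += 1
    return offset


def snapback(data, offset):
    """Round a byte offset down to the previous codepoint boundary."""
    while 0 < offset < len(data) and is_continuation(data[offset]):
        offset -= 1
    return offset


data = "aéb".encode("utf-8")  # b'a\xc3\xa9b': the "é" occupies bytes 1 and 2
partial = 2                   # points inside the "é"

forward = snap(data, partial)
backward = snapback(data, partial)
unchanged = snap(data, 1)     # already on a boundary
```

The partial position 2 snaps forward to 3 or back to 1, and a position already on a boundary is left alone.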
By the way, if you bind to a position rather than assign, it tracks the
string in question:
my $string = "xyz";
my $endpos := $string.chars; # $endpos == 3
substr($string,0,0,"abc"); # $endpos == 6, $string = "abcxyz"
Deletions of string around a position cause the position to be reduced
to the beginning of the deletion. Insertions at a position are assumed
to be after that position. That is, the position stays pointing to the
beginning of the newly inserted string, like this:
my $string = "xyz";
my $endpos := $string.chars; # $endpos == 3
substr($string,2,1,"abc"); # $endpos == 2, $string = "xyabc"
Hence concatenation never updates any positions. Which means that
sometimes you just have to call ".chars" again... (Perhaps we'll
provide a way to optionally insert before any matching position.)
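The two examples above suggest a simple update rule for bound positions, sketched here in Python. This is my reading of the rules rather than a specification: the classes Pos and TrackedString are hypothetical, offsets are plain codepoint counts, and substr is modeled as a four-argument replace.

```python
class Pos:
    """A bound position: just a mutable offset in this sketch."""

    def __init__(self, offset):
        self.offset = offset


class TrackedString:
    """A string that keeps its bound positions up to date."""

    def __init__(self, text):
        self.text = text
        self.bound = []

    def end_pos(self):
        pos = Pos(len(self.text))
        self.bound.append(pos)
        return pos

    def substr(self, start, length, replacement):
        end = start + length
        delta = len(replacement) - length
        for pos in self.bound:
            if pos.offset > end:
                pos.offset += delta   # strictly after the edit: shift
            elif pos.offset > start:
                pos.offset = start    # inside or at the end of a deletion
            # At or before the start: unchanged, so an insertion exactly
            # at a position lands after that position.
        self.text = self.text[:start] + replacement + self.text[end:]


first = TrackedString("xyz")
first_end = first.end_pos()
first.substr(0, 0, "abc")    # insertion before the position

second = TrackedString("xyz")
second_end = second.end_pos()
second.substr(2, 1, "abc")   # deletion touching the position

third = TrackedString("xyz")
third_end = third.end_pos()
third.substr(3, 0, "!")      # concatenation at the position
```

Under this rule the first position moves to 6, the second drops to 2, and the third stays at 3, which is why concatenation never updates anything.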
Note that positions try very hard not to get demoted to integers. In
particular, position objects overload addition and subtraction such
that
$string.chars - 1
index($string, "foo") + 2
are still position objects with an implicit reference into the string.
(Subtracting one position object from another results in an integer,
however.)
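The arithmetic rules can be sketched with operator overloading. The Python class below is a hypothetical stand-in for a position object: it holds a reference to its string plus a plain offset, and it ignores the question of Unicode views entirely.

```python
class StrPos:
    """A position that keeps a reference to the string it points into."""

    def __init__(self, text, offset):
        self.text = text
        self.offset = offset

    def __add__(self, count):
        # Position plus integer is still a position in the same string.
        return StrPos(self.text, self.offset + count)

    def __sub__(self, other):
        if isinstance(other, StrPos):
            # Position minus position is a plain integer distance.
            return self.offset - other.offset
        return StrPos(self.text, self.offset - other)


string = "foobar"
end = StrPos(string, len(string))  # models $string.chars
last = end - 1                     # still a position
after_foo = StrPos(string, string.find("foo")) + 2
distance = end - last              # an integer
```

Only the difference of two positions falls back to an integer; everything else keeps its reference to the string.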
The New "&" Separator in Regexen
Analogous to the disjunctional "|" separator, we're also putting a
conjunctional "&" separator into our regex syntax:
"DOG" ~~ /D [ <vowel>+ & <upper>+ ] G/
The semantics of it are pretty straightforward, as long as you realize
that all of the ANDed assertions have to match with the same length.
More precisely, they have to start and stop matching at the same
location. So the following is always going to be false:
/ . & .. /
It would be possible to have the other semantics, where the legs of the
conjunction may match different lengths as long as whatever follows the
conjunction can still match after each of them. But then tell me whether $1 should
return "O" or "G" after this:
"DOG" ~~ /^[. & ..] (.)/
Besides, it's easy enough to get the other semantics with lookahead
assertions. Autoanchoring all the legs of a conjunction to the same
spot adds much more value to it by differentiating it from lookahead.
You have to work pretty hard to make separate lookaheads match the same
length. Plus doing that turns what should be a symmetric operator into
a non-symmetrical one, where the final lookahead can't be a lookahead
because someone has to "eat" the characters that all the assertions
have agreed on are the right number to eat. So for all these reasons
it's better to have a conjunction operator with complicated enough
start/stop semantics to be useful.
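The same-start, same-stop rule can be emulated in a regex engine that lacks "&". The Python sketch below is only an illustration using the standard re module: conjunction_ends and match_dog are hypothetical helpers, the character classes standing in for the vowel and upper assertions are assumptions, and the match is anchored at the start of the string rather than searched for.

```python
import re

VOWEL = r"[AEIOUaeiou]+"  # assumed stand-in for <vowel>+
UPPER = r"[A-Z]+"         # assumed stand-in for <upper>+


def conjunction_ends(legs, text, start):
    """End offsets at which every leg matches text[start:end] exactly."""
    return [
        end
        for end in range(start, len(text) + 1)
        if all(re.fullmatch(leg, text[start:end]) for leg in legs)
    ]


def match_dog(text):
    """Model /D [ <vowel>+ & <upper>+ ] G/ anchored at offset 0."""
    if not text.startswith("D"):
        return False
    ends = conjunction_ends([VOWEL, UPPER], text, 1)
    return any(text.startswith("G", end) for end in ends)


dog_ends = conjunction_ends([VOWEL, UPPER], "DOG", 1)
never = conjunction_ends([r".", r".."], "DOG", 0)
```

Because both legs must agree on where they stop, the legs "." and ".." never share an end offset, while the vowel and uppercase legs agree on exactly one span in "DOG".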
Actually, this operator was originally suggested to me by a biologist.
Which leads us to our...
Optional Mandatory Cross-disciplinary Joke for People Tired of Dogs
Biologist: What's worse than being chased by a Velociraptor?
Physicist: Obviously, being chased by an Acceloraptor.
Future Directions
Away from Acceloraptors, obviously.
References...er, Reference...
Nathanael Schaerli, Stephane Ducasse, Oscar Nierstrasz and Andrew
Black. Traits: Composable Units of Behavior. European Conference on
Object-Oriented Programming (ECOOP), July 2003. Springer LNCS 2743,
Ed. Luca Cardelli.
perl v5.14.0 2006-02-28 Perl6::Bible::A12(3)