PERL PERL6 INTERNALS 9 RETURNING VARYING NUMBERS OF RESULTS FROM A TAIL CALL
Date: Sat, 19 Feb 2005 17:28:42 -0500

Subject: Returning varying numbers of results from a tail call
From: rogers-perl6@no-spam (Bob Rogers)

In situations where A calls B and B tail-calls C, and C produces some arbitrary number of return values, I would like to be able to generate code for B without having to care how many values A expects, how many C produces, or even whether these numbers are fixed at compile-time. I can sort of do this now via tail-call optimization, e.g.:

.sub _B non_prototyped .param pmc function .local pmc argv argv = foldup 1

print "[doing _B]\n"
$I33 = defined function if $I33 goto doit bad_func:
printerr "_B: Bad function.\n"
die doit:
.pcc_begin prototyped .flatten_arg argv .pcc_call function .pcc_end .pcc_begin_return .pcc_end_return .end
This works, but I have to jump through numerous hoops:

1. I must resort to this lame idiom that appears to accept and return no values.

2. The code must be compiled with "-Oc", or that is in fact what it does.

3. I must live with the fact that I won't ever see B in a backtrace;
I get tail-merging semantics whether I like it or not.

4. The return must be just before .end; it doesn't work if I invert the sense of the test and swap the "bad_func" and "doit" blocks. (Is this a bug in tail-call optimization?)

The number of hoops is surprising, given that perl5 has this "call and return all values" semantics. (But perl6 seems to return everything in an array, so that was no help. Whatever.)

The key problem is that this code works only if compiled with tail-merging (tail-call optimization) enabled, which strikes me as wrong; correct code should not have to depend on an optimization,
certainly not a non-default one.

So I would like to be able to tell IMCC explicitly that I am doing a tail call which should pass all returned values back to my caller. This should be regardless of whether tail-merging is enabled, but should definitely be tail-merged if so.

I thought I might be able to do this via .pcc_call with an explicit continuation, but I don't see how to get to the continuation that returncc would use. Is this possible?

Better still would be an explicit way to say "call and return all values." Here's one possible syntax for the example above:

doit:
.pcc_begin prototyped .flatten_arg argv .pcc_tail_call function .pcc_end .end
The ".pcc_tail_call function" would be an alternative to the "pcc_call opt_label pcc_results" sequence in the "pcc_sub_call" production. It seems to me that this could generate a normal call followed by returncc (without changing I0-4) if tail-call optimization is off.

But then, I'm pretty wet behind the ears when it comes to hacking Parrot, so I couldn't produce a patch without help, or at least some direction. Not to mention the possibility that there may be something I've missed . . .

TIA,

-- Bob Rogers http://rgrjr.dyndns.org/


Subject: Re: Returning varying numbers of results from a tail call
Date: Mon, 21 Feb 2005 11:40:49 +0100

From: lt@no-spam (Leopold Toetsch)
Bob Rogers <rogers-perl6@no-spam> wrote:
> In situations where A calls B and B tail-calls C, and C produces some > arbitrary number of return values, I would like to be able to generate > code for B without having to care how many values A expects, how many C > produces, or even whether these numbers are fixed at compile-time.

I'd pass one argument array around. With C<foldup/.flatten_arg) you are doing that anyway.

> ... I > can sort of do this now via tail-call optimization, e.g.:

Well, when return counts differ, it's not supposed to be optimzable.
Your code is faking zero return values.

> 1. I must resort to this lame idiom that appears to accept and > return no values.

Yep.

> 2. The code must be compiled with "-Oc", or that is in fact what it > does.

Automatic tailcall code generation is considered as an optimization, so it's not done per default.

> 3. I must live with the fact that I won't ever see B in a backtrace;
> I get tail-merging semantics whether I like it or not.

Yes. C is reusing the call frame of B in that case.

> 4. The return must be just before .end; it doesn't work if I invert > the sense of the test and swap the "bad_func" and "doit" blocks. (Is > this a bug in tail-call optimization?)

You could call it a bug, yes. This is the only case, where tailcall optimization is done. But you can always branch to the end of the function and have just one return statement at the very end.

> The key problem is that this code works only if compiled with > tail-merging (tail-call optimization) enabled, which strikes me as > wrong; correct code should not have to depend on an optimization,
> certainly not a non-default one.

Your code is lying about the count of returned values. You can't expect that to work.

> So I would like to be able to tell IMCC explicitly that I am doing a > tail call which should pass all returned values back to my caller. This > should be regardless of whether tail-merging is enabled, but should > definitely be tail-merged if so.

As said, I'd just pass exactly one array around.

> I thought I might be able to do this via .pcc_call with an explicit > continuation, but I don't see how to get to the continuation that > returncc would use. Is this possible?

include "interpinfo.pasm"
the_cont = interpinfo .INTERPINFO_CURRENT_CONT
> Better still would be an explicit way to say "call and return all > values." Here's one possible syntax for the example above:

> doit:
> .pcc_begin prototyped > .flatten_arg argv > .pcc_tail_call function > .pcc_end > .end
> The ".pcc_tail_call function" would be an alternative to the "pcc_call > opt_label pcc_results" sequence in the "pcc_sub_call" production. It > seems to me that this could generate a normal call followed by returncc > (without changing I0-4) if tail-call optimization is off.

You could write in in PASM.

But yep. Such a construct should be there. And additionally maybe
.tail_call (rets) = func(...)

> But then, I'm pretty wet behind the ears when it comes to hacking > Parrot, so I couldn't produce a patch without help, or at least some > direction.

The lexer and parser parts should be quite simple:

1) perl Configure.pl --maintainer # ... your options
Enable flex/bison
2) hack imcc/imcc.l and imcc/imcc.y
The parser can reuse the existing function call rules, it probably has just to set a TAIL_CALL bit in the pcc_sub structure.

3) grep for C<tail_call> in imcc/pbc.c for the code generation part.

> TIA,

Hope that helps and welcome,

leo

Date: Mon, 21 Feb 2005 13:45:29 -0500

Subject: Re: Returning varying numbers of results from a tail call
From: rogers-perl6@no-spam (Bob Rogers)
From: Leopold Toetsch <lt@no-spam>
Date: Mon, 21 Feb 2005 11:40:49 +0100

Bob Rogers <rogers-perl6@no-spam> wrote:
> In situations where A calls B and B tail-calls C, and C produces some > arbitrary number of return values, I would like to be able to generate > code for B without having to care how many values A expects, how many C > produces, or even whether these numbers are fixed at compile-time.

I'd pass one argument array around. With C<foldup/.flatten_arg) you are doing that anyway.

Well, the example I showed you was a bit contrived. I'm working on implementing Common Lisp call/return semantics, where functions often take variable numbers of arguments, and can return zero or more values,
and it's the caller's responsibility to sort out what it got back. In fact, CL multiple return values are quite similar to optional arguments by design. Furthermore, there are numerous places in the spec where it says that "expression X returns all the values of its final subform" --
hence my interest in tail calling. (Just like Perl5, as a matter of fact, but without context propagation.)

So the Parrot calling convention seemed marvelously well suited to all that, especially given the symmetry between call and return, and it would be a shame to throw that all away by passing/returning arrays,
never mind the extra overhead.

. . .

> 4. The return must be just before .end; it doesn't work if I invert > the sense of the test and swap the "bad_func" and "doit" blocks. (Is > this a bug in tail-call optimization?)

You could call it a bug, yes. This is the only case, where tailcall optimization is done.

It had seemed otherwise, looking at imcc/pcc.c yesterday, but I must not have been looking closely enough. Is that what the "&& !tmp->next" in line 583 does (CVS 1.86)? Is there a rationale behind limiting it this way?

But you can always branch to the end of the function and have just one return statement at the very end.

That would be awkward to do in general. The default assumption in Common Lisp is to pass back all values of a call that is in tail position (whether or not tail-call optimization is done). So any function of any size is likely to have numerous such calls, each potentially calling different functions with different arguments, and having different numbers of return values.

> The key problem is that this code works only if compiled with > tail-merging (tail-call optimization) enabled, which strikes me as > wrong; correct code should not have to depend on an optimization,
> certainly not a non-default one.

Your code is lying about the count of returned values. You can't expect that to work.

Exactly; I'd much rather emit "honest" code. ;-}

> So I would like to be able to tell IMCC explicitly that I am doing a > tail call which should pass all returned values back to my caller. This > should be regardless of whether tail-merging is enabled, but should > definitely be tail-merged if so.

As said, I'd just pass exactly one array around.

If this is going to be the dominant idiom for compilers targeting Parrot, then I may need to get on the bandwagon some day, if only for the sake of interoperability. Still, it seems like a waste of such an elegant mechanism . . .

> I thought I might be able to do this via .pcc_call with an explicit > continuation, but I don't see how to get to the continuation that > returncc would use. Is this possible?

include "interpinfo.pasm"
the_cont = interpinfo .INTERPINFO_CURRENT_CONT
Great; thank you.

> Better still would be an explicit way to say "call and return all > values." Here's one possible syntax for the example above:

> doit:
> .pcc_begin prototyped > .flatten_arg argv > .pcc_tail_call function > .pcc_end > .end
> The ".pcc_tail_call function" would be an alternative to the "pcc_call > opt_label pcc_results" sequence in the "pcc_sub_call" production. It > seems to me that this could generate a normal call followed by returncc > (without changing I0-4) if tail-call optimization is off.

You could write in in PASM.

I was hoping to avoid that; like I say, I'll need it frequently . . .

But yep. Such a construct should be there. And additionally maybe
.tail_call (rets) = func(...)

Actually, if that *is* a tail call, then you can't possibly be interested in "(rets)". But this syntax is clearly much easier to use for cases that don't require .flatten_arg or other .pcc_begin/.pcc_end magic. So how about
.tail_call func(...)

for something that *must* use the tailcall op, and
.return_call func(...)

as a shorthand for other calls in tail position that return all values?
Having two separate constructions helps to avoid confusion between required semantics and permitted optimization. For consistency, there must be two .pcc_* constructs: .pcc_tail_call and .pcc_return_call,
with corresponding semantics.

The extra syntax also provides detailed control over tail-call optimization. If you restrict the optimization only to .tail_call and .pcc_tail_call constructs, then the upstream compiler can specify whether tail-calling may, must, or must not be done, for each call individually. (And check_tail_call won't have to work so hard.)

. . .

Hope that helps and welcome,

leo
It does indeed; thanks particularly for the design help. I don't know when (or even if) I'll be able to get around to it -- C is not my strong suit -- but this has the feel of The Right Thing, so there's hope.

-- Bob

Subject: Re: Returning varying numbers of results from a tail call
Date: Tue, 22 Feb 2005 10:40:40 +0100

From: lt@no-spam (Leopold Toetsch)
Bob Rogers <rogers-perl6@no-spam> wrote:
> From: Leopold Toetsch <lt@no-spam>

> I'd pass one argument array around. With C<foldup/.flatten_arg) you are > doing that anyway.

> Well, the example I showed you was a bit contrived. I'm working on > implementing Common Lisp call/return semantics, where functions often > take variable numbers of arguments, and can return zero or more values,
> and it's the caller's responsibility to sort out what it got back. In > fact, CL multiple return values are quite similar to optional arguments > by design. Furthermore, there are numerous places in the spec where it > says that "expression X returns all the values of its final subform" --
> hence my interest in tail calling. (Just like Perl5, as a matter of > fact, but without context propagation.)

I see.

> It had seemed otherwise, looking at imcc/pcc.c yesterday, but I must not > have been looking closely enough. Is that what the "&& !tmp->next" in > line 583 does (CVS 1.86)? Is there a rationale behind limiting it this > way?

Yep - No, I don't see a rationale behind it. Just remove the test and try it ;)

> But yep. Such a construct should be there. And additionally maybe
> .tail_call (rets) = func(...)

> Actually, if that *is* a tail call, then you can't possibly be > interested in "(rets)".

Yep. I've just used the full call syntax, because there's a parser rule for it.

> ... But this syntax is clearly much easier to use > for cases that don't require .flatten_arg or other .pcc_begin/.pcc_end > magic. So how about
> .tail_call func(...)

Ok.

> for something that *must* use the tailcall op, and
> .return_call func(...)

> as a shorthand for other calls in tail position that return all values?
> Having two separate constructions helps to avoid confusion between > required semantics and permitted optimization.

I don't see much need for the latter. If the compiler wants tail calls it just emits the former. If not, the plain call syntax does it, e.g. for debugging.

> ... For consistency, there > must be two .pcc_* constructs: .pcc_tail_call and .pcc_return_call,
> with corresponding semantics.

Just .pcc_tail_call.

> The extra syntax also provides detailed control over tail-call > optimization.

I'd say: .tail_call always does a tail call. The check_tail_call() can be dropped then.

leo

Date: Sat, 26 Feb 2005 13:13:58 -0500

Subject: PATCH: Make check_tail_call work in two more situations.
From: rogers-perl6@no-spam (Bob Rogers)
From: Leopold Toetsch <lt@no-spam>
Date: Tue, 22 Feb 2005 10:40:40 +0100

Bob Rogers <rogers-perl6@no-spam> wrote:
> From: Leopold Toetsch <lt@no-spam>

> It had seemed otherwise, looking at imcc/pcc.c yesterday, but I must not > have been looking closely enough. Is that what the "&& !tmp->next" in > line 583 does (CVS 1.86)? Is there a rationale behind limiting it this > way?

Yep - No, I don't see a rationale behind it. Just remove the test and try it ;)

It works, with no apparent change to other tests (because none of them specify "-Oc", no doubt).

. . .

> ... For consistency, there > must be two .pcc_* constructs: .pcc_tail_call and .pcc_return_call,
> with corresponding semantics.

Just .pcc_tail_call.

> The extra syntax also provides detailed control over tail-call > optimization.

I'd say: .tail_call always does a tail call. The check_tail_call() can be dropped then.

leo
That takes IMCC out of the loop when it comes to tail-call optimization.
But IMCC seems like the natural place for doing these low-level optimizations . . .

In any case, the patch in the first attachment below does three things:

1. Removes the arbitrary check_tail_call() restriction to the final position.

2. Fixes a bug in testing for the implicit "null I0; null I3;
returncc" sequence -- it's "returncc", not "invoke". Those instructions are still emitted after the "tailcall"; I wasn't sure how to remove them without leaking memory.

3. Additionally, fixes a subtle typo in ABI_CHANGES (look closely!)

And the test file in the second attachment adds three tests to cover 1 and 2, albeit with calls in "lying to IMCC" style. I put this in imcc/t/syn/tail.t, but there may be a better place.

Also, I notice that none of this tailcall optimization behavior is documented . . . is there an appropriate place for it?

So now I'm motivated to "do it right." My contribution to the Parrot code so far is the deletion of five tokens, and the replacement of one other -- and several posts explaining why this isn't adequate. I don't dare leave it at just that. ;-}

-- Bob Rogers http://rgrjr.dyndns.org/

Index: ABI_CHANGES ===================================================================
RCS file: /cvs/public/parrot/ABI_CHANGES,v retrieving revision 1.2
diff -u -r1.2 ABI_CHANGES --- ABI_CHANGES 27 Nov 2004 13:19:21 -0000 1.2
+++ ABI_CHANGES 26 Feb 2005 17:27:15 -0000
@@no-spam -41,4 +41,4 @@no-spam 2004.11.15 leo The variables P0, P1, P2, S0 aren't visible in the called subroutine -anymore. See docs/ppds/pdd03_calling_conventions.pod.
+anymore. See docs/pdds/pdd03_calling_conventions.pod.
Index: imcc/pcc.c ===================================================================
RCS file: /cvs/public/parrot/imcc/pcc.c,v retrieving revision 1.86
diff -u -r1.86 pcc.c --- imcc/pcc.c 15 Jan 2005 21:04:58 -0000 1.86
+++ imcc/pcc.c 26 Feb 2005 17:27:18 -0000
@@no-spam -580,8 +580,8 @@no-spam if (!tmp)
return 0;
if (tmp->opnum == -1 && (tmp->type & ITPCCSUB) &&
- (tmp->type & ITLABEL) && !tmp->next) {
- ret_ins = tmp;
+ (tmp->type & ITLABEL)) {
+ ret_ins = tmp;
IMCC_debug(interp, DEBUG_OPT1, "check tail call %I \n", ins);
}
/*
@@no-spam -605,7 +605,7 @@no-spam }
else return 0;
- if (strcmp(tmp->op, "invoke"))
+ if (strcmp(tmp->op, "returncc"))
return 0;
IMCC_debug(interp, DEBUG_OPT1, "check tail call %I \n", tmp);
nrets = 0;

[plaintext check-tail-call.patch]
#!perl use strict;
use TestCompiler tests => 3;

##############################
# Parrot Calling Conventions: Tail call optimization.

$ENV{TEST_PROG_ARGS} = '-Oc';

output_is(<<'CODE', <<'OUT', "tail call optimization, final position");

.sub _main @no-spam
$P1 = new PerlInt $P1 = 20
$P2 = new PerlInt $P2 = 3
newsub $P99, .Sub, _floor ($P3, $P4) = _funcall($P99, $P1, $P2)
print "_floor returned "
print argcP print " values, "
print $P3
print " and "
print $P4
print ".\n"
newsub $P98, .Sub, _fib_step ($P3, $P4, $P5) = _funcall($P98, $P1, $P2)
print "_fib_step returned "
print argcP print " values, "
print $P3
print ", "
print $P4
print ", and "
print $P5
print ".\n"
.end
.sub _funcall non_prototyped .param pmc function .local pmc argv argv = foldup 1

print "[doing _funcall]\n"
$I33 = defined function if $I33 goto doit bad_func:
printerr "_funcall: Bad function.\n"
die doit:
.pcc_begin prototyped .flatten_arg argv .pcc_call function .pcc_end .pcc_begin_return .pcc_end_return .end
## Return quotient and remainder as two integers.
.sub _floor .param pmc arg1
.param pmc arg2

$P1 = new PerlInt $P1 = arg1 / arg2
## truncate.
$I1 = $P1
$P1 = $I1
$P2 = new PerlInt $P2 = arg1 % arg2
.pcc_begin_return .return $P1
.return $P2
.pcc_end_return .end
## Return the sum and the two arguments as three integers.
.sub _fib_step .param pmc arg1
.param pmc arg2

$P1 = new PerlInt $P1 = arg1 + arg2
.pcc_begin_return .return $P1
.return arg1
.return arg2
.pcc_end_return .end CODE [doing _funcall]
_floor returned 2 values, 6 and 2.
[doing _funcall]
_fib_step returned 3 values, 23, 20, and 3.
OUT
output_is(<<'CODE', <<'OUT', "tail call optimization, intermediate position");

.sub _main @no-spam
$P1 = new PerlInt $P1 = 20
$P2 = new PerlInt $P2 = 3
newsub $P99, .Sub, _floor ($P3, $P4) = _funcall($P99, $P1, $P2)
print "_floor returned "
print argcP print " values, "
print $P3
print " and "
print $P4
print ".\n"
newsub $P98, .Sub, _fib_step ($P3, $P4, $P5) = _funcall($P98, $P1, $P2)
print "_fib_step returned "
print argcP print " values, "
print $P3
print ", "
print $P4
print ", and "
print $P5
print ".\n"
.end
.sub _funcall non_prototyped .param pmc function .local pmc argv argv = foldup 1

print "[doing _funcall]\n"
$I33 = defined function unless $I33 goto bad_func doit:
.pcc_begin prototyped .flatten_arg argv .pcc_call function .pcc_end .pcc_begin_return .pcc_end_return bad_func:
printerr "_funcall: Bad function.\n"
die .end
## Return quotient and remainder as two integers.
.sub _floor .param pmc arg1
.param pmc arg2

$P1 = new PerlInt $P1 = arg1 / arg2
## truncate.
$I1 = $P1
$P1 = $I1
$P2 = new PerlInt $P2 = arg1 % arg2
.pcc_begin_return .return $P1
.return $P2
.pcc_end_return .end
## Return the sum and the two arguments as three integers.
.sub _fib_step .param pmc arg1
.param pmc arg2

$P1 = new PerlInt $P1 = arg1 + arg2
.pcc_begin_return .return $P1
.return arg1
.return arg2
.pcc_end_return .end CODE [doing _funcall]
_floor returned 2 values, 6 and 2.
[doing _funcall]
_fib_step returned 3 values, 23, 20, and 3.
OUT
output_is(<<'CODE', <<'OUT', "tail call optimization, implicit final return");

.sub _main @no-spam
$P1 = new PerlInt $P1 = 20
$P2 = new PerlInt $P2 = 3
newsub $P99, .Sub, _floor ($P3, $P4) = _funcall($P99, $P1, $P2)
print "_floor returned "
print argcP print " values, "
print $P3
print " and "
print $P4
print ".\n"
newsub $P98, .Sub, _fib_step ($P3, $P4, $P5) = _funcall($P98, $P1, $P2)
print "_fib_step returned "
print argcP print " values, "
print $P3
print ", "
print $P4
print ", and "
print $P5
print ".\n"
.end
.sub _funcall non_prototyped .param pmc function .local pmc argv argv = foldup 1

print "[doing _funcall]\n"
$I33 = defined function if $I33 goto doit bad_func:
printerr "_funcall: Bad function.\n"
die doit:
.pcc_begin prototyped .flatten_arg argv .pcc_call function .pcc_end .end
## Return quotient and remainder as two integers.
.sub _floor .param pmc arg1
.param pmc arg2

$P1 = new PerlInt $P1 = arg1 / arg2
## truncate.
$I1 = $P1
$P1 = $I1
$P2 = new PerlInt $P2 = arg1 % arg2
.pcc_begin_return .return $P1
.return $P2
.pcc_end_return .end
## Return the sum and the two arguments as three integers.
.sub _fib_step .param pmc arg1
.param pmc arg2

$P1 = new PerlInt $P1 = arg1 + arg2
.pcc_begin_return .return $P1
.return arg1
.return arg2
.pcc_end_return .end CODE [doing _funcall]
_floor returned 2 values, 6 and 2.
[doing _funcall]
_fib_step returned 3 values, 23, 20, and 3.
OUT
[plaintext tail.t]


Subject: Re: PATCH: Make check_tail_call work in two more situations.
Date: Sun, 27 Feb 2005 17:14:20 +0100

From: lt@no-spam (Leopold Toetsch)
Bob Rogers <rogers-perl6@no-spam> wrote:

> I'd say: .tail_call always does a tail call. The check_tail_call() can > be dropped then.

> leo
> That takes IMCC out of the loop when it comes to tail-call optimization.
> But IMCC seems like the natural place for doing these low-level > optimizations . . .

Ok, lets leave that as is. Anyway, if a C<.tail_call> is already there it should be one.

> In any case, the patch in the first attachment below does three > things:

Thanks, applied.
leo