Monday, July 16, 2007

Closing over ZSuper

One of the features of Ruby which I sometimes like and sometimes hate is ZSuper. (So called because it is a different node from regular super in the AST.) ZSuper is the keyword super without arguments or parentheses, which calls the super method with the same arguments the current invocation got. Of course, that's not all. If you change the arguments, the changes propagate to the super implementation. That holds not only when you mutate the object, but also when you reassign the parameter to a different object, which I found unintuitive the first time I ran into it.
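To make the object-versus-reference point concrete, here is a minimal sketch. The class names (Parent, Mutator, Rebinder, Explicit) and the describe method are made up for illustration. The sketch assumes what I understand bare super to do on the Ruby versions I know: forward whatever the parameter variables currently refer to.

```ruby
# Hypothetical classes, for illustration only.
class Parent
  def describe(name)
    "Parent got #{name.inspect}"
  end
end

class Mutator < Parent
  # Mutates the object itself; bare super forwards the same (now changed) String.
  def describe(name)
    name << "!"
    super
  end
end

class Rebinder < Parent
  # Rebinds the parameter to a new object; bare super forwards the new object.
  def describe(name)
    name = "changed"
    super
  end
end

class Explicit < Parent
  # Explicit arguments: only what is written in the parentheses is forwarded.
  def describe(name)
    original = name
    name = "changed"
    super(original)
  end
end

puts Mutator.new.describe("initial".dup)  # Parent got "initial!"
puts Rebinder.new.describe("initial")     # Parent got "changed"
puts Explicit.new.describe("initial")     # Parent got "initial"
```

The Rebinder case is the one that surprised me: the caller's string is untouched, yet the super method never sees it.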

That's all well and good. The interesting thing happens when you close over the super call and return it as a Proc. I haven't seen anyone do this, which I guess is why there seems to be a bug in the implementation. Look at this code and tell me what it prints:
class Base
  def foo(*args)
    p [:Base, :foo, *args]
  end
end

class Sub < Base
  def foo(first, *args)
    super
    first = "changed"
    super
    proc { |*args| super }
  end
end

Sub.new.foo("initial", "try", :four).call("args","to","block")
Notice that Base#foo gets called three times in this code. In Sub#foo we reassign the first argument to the new string "changed". As I said above, the second super call therefore gets "changed" as its first argument. But what happens after that? We create a block that uses ZSuper and pass it to proc, which reifies the block into an instance of Proc, and we return that. Directly after getting the Proc back, we call it with some arguments. The way I expect this to work (and incidentally, that's the way JRuby works) is that the output should be something like this:
[:Base, :foo, "initial", "try", :four]
[:Base, :foo, "changed", "try", :four]
[:Base, :foo, "changed", "try", :four]
We see that the first argument changed from "initial" to "changed", but otherwise the result is the same; the closure is a real closure over everything in the frame and scope. As you've probably guessed, the same isn't true for MRI. Without further ado, this is the output from MRI 1.8.6:
[:Base, :foo, "initial", "try", :four]
[:Base, :foo, "changed", "try", :four]
[:Base, :foo, "changed", ["args", "to", "block"], false]
The first time I saw this, the letters WTF passed through my mind. In fact, they still do sometimes. What is happening here? It seems that passing arguments to the block somehow clobbers the place where MRI saves the closed-over method arguments. I have no idea where the false value comes from. Now that I think about it, though (and this is just a guess), I believe it stands for the fact that the arguments should be splatted into one variable (the one called args in the block). If it had been true, they would refer to different variables. I think there is some trickery like that involved in the splatting logic in MRI.
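One way to take the implicit argument capture out of the picture is to spell the arguments out. The following is a minimal sketch, not a verified fix for 1.8.6: ExplicitSub and block_args are names I made up, Base#foo returns its array instead of printing it so the result can be inspected, and the expected value in the comment is what a current MRI gives. I have not checked it against 1.8.6.

```ruby
class Base
  def foo(*args)
    [:Base, :foo, *args]
  end
end

# Hypothetical variant of Sub, for illustration only.
class ExplicitSub < Base
  def foo(first, *args)
    first = "changed"
    # Explicit super arguments: the block closes over first and args
    # like any other local variables. The block parameter has its own
    # name (block_args), so it cannot collide with the method's args.
    proc { |*block_args| super(first, *args) }
  end
end

closure = ExplicitSub.new.foo("initial", "try", :four)
p closure.call("args", "to", "block")
# On a current MRI: [:Base, :foo, "changed", "try", :four]
```

The arguments given to call are simply ignored here, which is the closure behavior I expected from the bare super version as well.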

Anyway. Is this a bug or a feature? I can't see any obvious use for it, and it runs counter to being understandable and unsurprising. Can anyone give me a good example of where this behavior is useful?

5 comments:

Jorge L. Cangas said...

So it is a bug. I tried your code and got:
"[BUG] Segmentation fault"
Now you see it is a bug ;).

I'm running ruby 1.8.6 (2007-03-13) [i386-mswin32] on WXP

Anonymous said...

I would call it a bug since this code won't stand on its own:

lambda {super}.call

In many bindings/contexts this will result in something like:

super called outside of method (NoMethodError).

Like you say, it is far more intuitive to attach super's behavior to a method similar to the way yield works.

sjs said...

I'd say it's a bug that the splatted args are not splatted, and the presence of the trailing value is also odd.

But considering that Ruby reuses variables of the same name in the outer scope inside a block I think that using the passed in *args is expected. I was surprised to see the behaviour you expected, so this shows how each person's POLS is different.

sjs@tuono% uname -a
Linux tuono 2.6.20-gentoo-r8 #2 SMP Wed Jul 4 12:26:08 PDT 2007 x86_64 AMD Opteron(tm) 275 AuthenticAMD GNU/Linux
sjs@tuono% ruby -v
ruby 1.8.5 (2006-12-04 patchlevel 2) [x86_64-linux]
...
irb(main):016:0* Sub.new.foo("initial", "try", :four).call("args","to","block")
[:Base, :foo, "initial", "try", :four]
[:Base, :foo, "changed", "try", :four]
[:Base, :foo, "changed", ["args", "to", "block"], 16]

Anonymous said...

Hm.

powerbook% uname -a
Darwin powerbook.local 8.10.0 Darwin Kernel Version 8.10.0: Wed May 23 16:50:59 PDT 2007; root:xnu-792.21.3~1/RELEASE_PPC Power Macintosh powerpc

powerbook% ruby -v
ruby 1.8.5 (2006-12-25 patchlevel 12) [powerpc-darwin8.8.0]

For what it's worth, I changed one little thing in your code: proc { |*args| super } became proc { |*blahs| super }, and it produced this:

powerbook% ruby stuff.rb
[:Base, :foo, "initial", "try", :four]
[:Base, :foo, "changed", "try", :four]
[:Base, :foo, "changed", ["try", :four], false]

That false is still there, but it's closer to what you expected to see. I think the difference has to do with when the closure is created -- in the first case it seems like the super called is scoped on the args inside the proc (and passed to the proc call) and in the second it seems closed to the args in Sub's version of foo.

Anonymous said...

Doesn't this have to do with the way Ruby scopes work? ie:

foo = 10
bar = lambda {|foo| foo}
bar[20]
p foo

which will print 20, because reusing a name in the argument list of a block reuses the variable in the block's outer scope. So your block's argument is an alias for the method's argument list, and calling the block modifies the closed-over argument list.

I believe it's due to be fixed in 1.9, but right now it's working as designed.