I've been spending some time trying to implement a compiler for the defined?-feature of Ruby. If you haven't seen it, be happy. It's quite annoying, and incredibly complicated to implement, since you basically need to create a small interpreter especially just for nodes existing within defined?. So why is defined? so important? Well, for one it's actually needed to implement the construct ||= correctly. And that is used everywhere, which means that not compiling it will severely impact our ability to compile code. Also, it just so happens that OpAsgnOrNode (as it's called), and EnsureNode, are the two nodes left to implement to be able to compile Test::Unit assert-methods, since the internal _wrap_assertion uses both ensure and ||=.
So, now you know why. Next, a quick intro to the compilation strategy of JRuby. Basically we try to compile each script and each method into one Java method. We try to use the stack as much as possible, since we in that way can link statements together correctly. And that's about it.
The problem enters when you need to handle exceptions in the emitted Java bytecode. This isn't a problem in the interpreter, since we explicitly return a value for each node, and the interpreter doesn't use the Java stack as much as the compiler does. We also want to be able to use finally blocks at places, especially to ensure that ensure can be compiled down, but also to make the implementation of defined? safe.
So what's the problem? Can't we just emit the catch-table and so on correctly? Well, yes, we can do that. But it doesn't work. Because of a very annoying feature of the JVM. Namely, when a catch-block is entered, the stack gets blown away. Completely. So if the Ruby code is in the middle of a long chained statement, everything will disappear. And what's worse, this will actually fail to load with a Verifier exception, saying "inconsistent stack height", since there will now be one code path with things on the stack, and one code path with no values on the stack, and the way JRuby works, these will end up at the same point later on. And the JVM doesn't allow that either.
This makes it incredibly hard to handle these constructs in bytecode, and frankly, right now I have no idea how to do it. My first approach was to actually create a new method for each try-catch or try-finally, and just have the code in there instead. The fine thing about that is that the surrounding stack will not be blown away since it's part of the invoking method, and not in the current activation frame. And that approach actually works fairly well. Until you want to refer to values from outside from the try or catch block. Then it breaks down.
So, right now I don't know what to do. We have no way of knowing at any specific place how low the stack is, so it's not possible to copy it somewhere, and then restore it in the catch block. That would be totally inefficient too. In fact, I have no idea how other implementations handle this. There's gotta be a trick to it.