First, conversion from Mixins to Classes. The original Python implementation used multiple inheritance: it created several base classes (Reader, Scanner, Parser, etc) and then created several versions of a Loader class which inherited from different combinations of these base classes. My first implementation mirrored this approach, but used Modules instead of base classes and mixed different versions of them into the different Loader classes. This approach was quite limiting, since mixing code into other Modules doesn't really work as you'd expect, and it's no substitute for subclassing. For example, I had a BaseResolver module and a SafeResolver module which mixed in BaseResolver and added code of its own, but this was quite cumbersome.
The solution was simply to convert all Modules to classes and make all calls between the tiers explicit. For example, instead of having the Parser module just assume that a Scanner has been mixed in and call check_token on itself, the Parser class now takes a Scanner instance at initialization and calls check_token on that instance instead.
This works very well, and probably makes the code easier to understand. Another benefit is that the interfaces between the layers are more apparent. For inclusion in JRuby, this will make it easier to replace certain parts with Java implementations.
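The explicit-composition approach can be sketched roughly like this; the class and method bodies here are illustrative stand-ins, not the actual RbYAML implementations:

```ruby
# A toy Scanner holding a list of token kinds.
class Scanner
  def initialize(tokens)
    @tokens = tokens
  end

  # True if the next token is one of the given kinds.
  def check_token(*kinds)
    kinds.include?(@tokens.first)
  end

  def get_token
    @tokens.shift
  end
end

class Parser
  # Instead of mixing a Scanner module into the Parser, the Parser
  # receives a Scanner instance and calls it explicitly.
  def initialize(scanner)
    @scanner = scanner
  end

  def next_is_scalar?
    @scanner.check_token(:scalar)
  end
end

parser = Parser.new(Scanner.new([:scalar, :key]))
parser.next_is_scalar?  # => true
```

Because the Parser only talks to the Scanner through its public methods, the Scanner could later be swapped for a Java implementation behind the same interface.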
The next item on the agenda was a rewrite of the Parser. The original Python implementation used Python generators (which are almost like coroutines, but not quite). My first port of this code just parsed the whole stream, saved all the events and then passed them on after parsing. This was good enough for smaller YAML documents, but when trying to parse the RubyGems gemspec, the memory and time requirements became prohibitive. In the course of making the generator algorithm explicit I rewrote the Parser from scratch, making it hybrid table-driven instead of recursive-descent as the original was. I actually believe the new Parser is both easier to understand and faster. As an example, this is the code for block_sequence:
@parse_stack += [:block_sequence_end, :block_sequence_entry, :block_sequence_start]
where @parse_stack contains the next productions to call after block_sequence has finished. The main generator method just keeps calling the next production until it arrives at a terminal, and then returns its value:
meth = @parse_stack.pop
val = send(meth)
Another benefit is that this code is dead simple to port to other languages, once again probably easier than the Python version.
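Putting the two fragments above together, a complete toy version of the table-driven loop might look like this; the production names and event values are made up for illustration:

```ruby
class TinyParser
  def initialize
    # Seed the stack with the start production.
    @parse_stack = [:stream]
  end

  # A non-terminal: pushes its sub-productions and returns nil.
  # They are pushed so that :block_sequence_start pops first.
  def stream
    @parse_stack += [:block_sequence_end, :block_sequence_entry,
                     :block_sequence_start]
    nil
  end

  # Terminals: each returns an event value.
  def block_sequence_start; :seq_start_event; end
  def block_sequence_entry; :entry_event;     end
  def block_sequence_end;   :seq_end_event;   end

  # The main generator method: keep popping and calling productions
  # until one returns a value (a terminal), then return it.
  def next_event
    until @parse_stack.empty?
      meth = @parse_stack.pop
      val = send(meth)
      return val if val
    end
    nil
  end
end

p = TinyParser.new
p.next_event  # => :seq_start_event
p.next_event  # => :entry_event
p.next_event  # => :seq_end_event
```

The recursion of a recursive-descent parser is replaced by an explicit stack of symbols, which is exactly what makes it easy to hand out one event at a time instead of materializing them all up front.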
The third improvement was performance. I have no trustworthy numbers for the improvement, but it's on the order of 5-8 times faster than the first version. I achieved this through some easy fixes and some harder ones. I removed the Reader class and inlined its methods into the Scanner. In each place where I tested whether a character was part of a String, I checked whether a Regexp was faster. I also added some hard-coded, unrolled loops in the most intense parts of the code, which were peek(), forward(), prefix() and update(). Every microsecond of improvement in these methods counted, since they are called so many times. I didn't do all this work blind, though. The Ruby profiler is really good: just take a script, run it with ruby -rprofile script.rb, and you get incredibly useful output. I tested most of my changes this way, and the end result is about as fast as the JRuby RACC-based YAML parser, which was my goal.
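A String-vs-Regexp experiment of the kind described above can be timed with Ruby's standard Benchmark module; the character set here is just an illustrative example, not the actual one from the Scanner:

```ruby
require 'benchmark'

# Two equivalent membership tests for line-break characters.
BREAK_CHARS = "\r\n\0"
BREAK_RE    = /[\r\n\0]/

ch = "x"
n  = 100_000

# Compare String#include? against Regexp matching on a hot path.
Benchmark.bm(8) do |bm|
  bm.report("String") { n.times { BREAK_CHARS.include?(ch) } }
  bm.report("Regexp") { n.times { BREAK_RE =~ ch } }
end
```

Which variant wins varies by Ruby version and the size of the character set, which is why measuring each case individually, as described above, beats guessing.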
Since version 0.1 I've spent some time getting JRuby to work flawlessly with RubyGems, and this work has uncovered some small bugs in RbYAML (and in SYCK, for that matter), so a new minor release will probably come soon. Until then, the CVS is up to date.