Tuesday, May 16, 2006

Translating Python to Ruby (the YAML case study)

I am not an experienced Python programmer. When I decided to port PyYAML3000 to Ruby, I wasn't sure what kind of problems I would run into. What I found out while trying to get more or less the same semantics in Ruby as in Python was actually quite interesting, and I think some of it may be useful to others. At the very least, writing it down will get my own thoughts on the translation straight.

The first phase of RbYAML was supposed to be a straightforward port of PyYAML from Python to Ruby. That ideal was of course impossible, since some things really don't work the same way in the two languages, but mostly the semantics translated cleanly. The biggest architectural changes between the implementations were the Unicode support, which I had to remove for now, and the inheritance structure. PyYAML uses Python's multiple inheritance to combine basic pieces of functionality. Since Ruby doesn't have multiple inheritance, I implemented this using mixins instead. This is probably the first thing I will change in phase two of RbYAML, since mixins (and modules) just don't work the same way as inheritance. Right now I'm thinking that I'll probably use the Composite pattern, where Loader and Dumper export the available interface and all other operations have to be done explicitly with the separate components.
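As a rough illustration of the two designs, here is a minimal Ruby sketch. The names (ScannerMixin, ParserMixin, MixinLoader, Scanner, Parser, ComposedLoader, load_document) are hypothetical stand-ins and not the actual RbYAML classes; the stages do trivial work so that only the structure matters.

```ruby
# Hypothetical stand-ins for two processing stages; not the real RbYAML code.
module ScannerMixin
  def scan(text)
    text.split
  end
end

module ParserMixin
  def parse(tokens)
    tokens.map(&:upcase)
  end
end

# Mixin approach: the methods of every stage end up in one shared
# namespace on the loader object.
class MixinLoader
  include ScannerMixin
  include ParserMixin

  def load_document(text)
    parse(scan(text))
  end
end

# Composition approach: each stage is its own object, and the loader
# exposes only the interface it chooses to.
class Scanner
  def scan(text)
    text.split
  end
end

class Parser
  def parse(tokens)
    tokens.map(&:upcase)
  end
end

class ComposedLoader
  def initialize(scanner = Scanner.new, parser = Parser.new)
    @scanner = scanner
    @parser = parser
  end

  def load_document(text)
    @parser.parse(@scanner.scan(text))
  end
end
```

With the mixin version, a caller can reach scan and parse directly on the loader; with the composed version, only load_document is visible and the stages have to be used explicitly as separate objects.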

The next thing that caused some trouble was Python's notion of truth and falsehood. In Ruby, only nil and false are considered false; everything else is true. In Python, a whole range of values is considered false, including False, None, 0, the empty string, the empty list and the empty dict. This means that a Python condition which relies on 0 or an empty collection being false has to be rewritten as an explicit test in Ruby.
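The difference can be seen in a small Ruby sketch. The helper py_truthy? is hypothetical (it is not part of RbYAML) and only approximates Python's truth test for the values listed above.

```ruby
# Hypothetical helper: approximates Python's truth test for a few common values.
def py_truthy?(value)
  case value
  when nil, false then false
  when Numeric then !value.zero?
  when String, Array, Hash then !value.empty?
  else true
  end
end

# Plain Ruby truthiness, forced to a boolean so the two can be compared.
def rb_truthy?(value)
  value ? true : false
end

# Print both interpretations side by side for a handful of values.
[nil, false, 0, "", [], {}, 1, "yaml"].each do |v|
  puts "#{v.inspect}: ruby=#{rb_truthy?(v)} python-like=#{py_truthy?(v)}"
end
```

The values 0, "", [] and {} are where the two columns disagree, so those are the conditions that need attention when porting.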

Another big stumbling block for implementing the same architecture in Ruby as in Python was the heavy use of coroutines in a few parts of PyYAML. For now I have implemented this with a really inefficient version that collects all nodes before passing them on. I could of course have implemented a coroutine engine with call/cc, but since JRuby doesn't support continuations yet, this would have defeated the purpose of the project. My plan for the next phase is to reorganize the code so that I can use Ruby's standard idioms for iterators, which should improve both performance and memory usage drastically.
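A minimal sketch of the contrast between collecting everything up front and iterating with a block, assuming a current Ruby. TokenSource and its methods are hypothetical names for illustration, not RbYAML's API, and each "token" here is just a stripped line.

```ruby
# Hypothetical token source; the names are illustrative only.
class TokenSource
  include Enumerable

  def initialize(text)
    @text = text
  end

  # Eager version: builds the whole list before returning anything.
  def all_tokens
    result = []
    @text.each_line { |line| result << line.strip }
    result
  end

  # Block-based iterator: hands each token to the caller as it is produced.
  # Without a block it returns an Enumerator for the same sequence.
  def each
    return to_enum(:each) unless block_given?
    @text.each_line { |line| yield line.strip }
  end
end

source = TokenSource.new("a: 1\nb: 2\nc: 3\n")

# Internal iteration with a block.
source.each { |token| puts token }

# External iteration: pull tokens one at a time.
tokens = source.each
first = tokens.next
second = tokens.next
```

Because each yields tokens one by one, a consumer that only needs the first few (for example source.first(2)) can stop early without the whole input being processed.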

Some of the regular expressions weren't directly compatible between the two languages, but this was pretty easy to fix.
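The post doesn't say which expressions needed changing. As one example of the kind of difference that can come up (an assumption, not necessarily what happened in RbYAML): in Ruby, ^ and $ always match at line boundaries, whereas in Python's re module ^ matches only at the start of the string unless re.MULTILINE is set. A small sketch:

```ruby
text = "key: value\nother: thing"

# In Ruby, ^ matches at the start of every line, so this finds "other"
# on the second line.
line_anchored = text[/^other/]

# \A anchors to the start of the whole string, which is closer to what
# Python's ^ does without re.MULTILINE. Here it finds nothing.
string_anchored = text[/\Aother/]

puts line_anchored.inspect
puts string_anchored.inspect
```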

The final source of trouble was that Python requires () to invoke a method; if a method is referenced without parentheses, you get the method object itself rather than the result of calling it. This was mostly a problem because of my unfamiliarity with Python.
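For comparison, this is roughly how the same distinction looks on the Ruby side. The Emitter class and next_state method are hypothetical names used only for illustration.

```ruby
# Hypothetical class; not part of RbYAML.
class Emitter
  def next_state
    :document_start
  end
end

emitter = Emitter.new

# In Ruby, a bare method name calls the method; parentheses are optional.
result = emitter.next_state

# To get the method itself (what a bare `emitter.next_state` gives you in
# Python), ask for a Method object explicitly and invoke it later with call.
handler = emitter.method(:next_state)
later = handler.call
```

So a Python line that stores a method for later use has to become an explicit method(:name) lookup in Ruby, or be restructured to pass a symbol or a block instead.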

So, in conclusion, porting Python to Ruby is quite easy, except for a few small areas. I would do it again if there were another Python application that Ruby needed.

This is the first part of a series of posts about RbYAML. The next entry will talk more about Rubyfying the code.

4 comments:

Anonymous said...

Very pretty design! Keep up the good work. Thanks.

Anonymous said...

Very best site. Keep working. Will return in the near future.

Anonymous said...

Interesting website with a lot of resources and detailed explanations.

Anonymous said...

You should port dabo to jruby.