Monday, June 12, 2006

A YAML dumper in Java

Today I've begun work on porting the RbYAML dumper to Java. The work will be done in three separate layers: the emitter, the serializer and the representer. I've just decided on the structure for the emitter, and the basic premise is really simple, since emitting is just low-level IO.

The emitter will be a stack-based finite state machine, and it actually closely resembles the ParserImpl part of the loader. There are 17 states, and they are represented as anonymous implementations of a simple interface. If this were a Java 5 project, I would have implemented it with an enum, like this:

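public enum EmitterState {
    StreamStart {
        public void expect(final EmitterEnvironment env) {
            env.expectStreamStart();
        }
    };

    // Declared abstract so that every constant must supply its own expect().
    public abstract void expect(EmitterEnvironment env);
}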

with an anonymous enum subclass for each state. A very fine solution, since I could just fetch the next state and execute it directly, instead of doing something like this:

STATES[STREAM_START].expect();

Alas, that's not the way it will be.
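To make the pre-Java 5 approach concrete, here is a minimal sketch of a stack-based state machine whose states are anonymous implementations of a small interface, looked up through a STATES array. It is not the real emitter: it has three states instead of 17, the state names other than STREAM_START, the `pending` stack, the `run()` method and the bracketed output are all made up for illustration, and it uses generics only to keep the listing short.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// The "simple interface" each state implements (shape assumed for this sketch).
interface EmitterState {
    void expect(EmitterEnvironment env);
}

class EmitterEnvironment {
    // Hypothetical state indices; the real emitter has 17 of these.
    static final int STREAM_START = 0;
    static final int DOCUMENT_START = 1;
    static final int STREAM_END = 2;

    // One anonymous implementation per state, in index order.
    static final EmitterState[] STATES = {
        new EmitterState() {
            public void expect(EmitterEnvironment env) { env.expectStreamStart(); }
        },
        new EmitterState() {
            public void expect(EmitterEnvironment env) { env.expectDocumentStart(); }
        },
        new EmitterState() {
            public void expect(EmitterEnvironment env) { env.expectStreamEnd(); }
        }
    };

    // Stack of states to return to once the current construct is finished.
    private final Deque<Integer> pending = new ArrayDeque<Integer>();
    private final StringBuilder out = new StringBuilder();
    private int current = STREAM_START;
    private boolean done = false;

    void expectStreamStart() {
        out.append("[stream-start]");
        pending.push(STREAM_END);   // remember where to go after the document
        current = DOCUMENT_START;
    }

    void expectDocumentStart() {
        out.append("[document]");
        current = pending.pop();    // resume the state saved on the stack
    }

    void expectStreamEnd() {
        out.append("[stream-end]");
        done = true;
    }

    // Drive the machine: look up the current state and execute it.
    String run() {
        while (!done) {
            STATES[current].expect(this);
        }
        return out.toString();
    }
}
```

Calling `new EmitterEnvironment().run()` walks the three states in order and returns `[stream-start][document][stream-end]`. The array lookup in `run()` is the indirection the enum version would avoid.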

One of the more interesting parts of this implementation will be finding a nice way to represent the options hash from Syck and RbYAML 0.2. With Ruby's keyword-argument idiom, the options hash is very practical to use, instead of having to supply 17 different arguments - or, even worse, fact(17) different constructors to allow for default arguments. The correct way in Ruby is to have a defaults Hash, merge it with the arguments provided, and look up the configuration options in the result. Obviously, I could do this with maps in Java, but it's very cumbersome and requires more than one line to provide an option; something I would like to avoid.
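For comparison, this is roughly what the defaults-plus-merge idiom looks like when done with plain maps in Java. It is only a sketch: the class name MapOptions, the option keys and the default values are hypothetical and are not taken from Syck or RbYAML.

```java
import java.util.HashMap;
import java.util.Map;

class MapOptions {
    // Hypothetical defaults; the real option names and values may differ.
    static final Map<String, Object> DEFAULTS = new HashMap<String, Object>();
    static {
        DEFAULTS.put("indent", Integer.valueOf(2));
        DEFAULTS.put("useVersion", Boolean.FALSE);
    }

    // Copy the defaults, then let the caller's entries override them.
    static Map<String, Object> merge(Map<String, Object> overrides) {
        Map<String, Object> merged = new HashMap<String, Object>(DEFAULTS);
        merged.putAll(overrides);
        return merged;
    }
}
```

A caller has to create a map, put each option into it on its own line, and cast every value on the way out again, for example `((Integer) merged.get("indent")).intValue()`. That is exactly the clumsiness described above.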

The solution to this dilemma came from a part of Joshua Bloch's Effective Java technical session at JavaOne. (I've written a little about it in my entry on JavaOne, day 2.) Anyway, the solution is to provide a YAMLOptions class with a build factory method. I'm not totally finished with the syntax yet, but I'm thinking that a toYaml call could look something like this:

YAML.toYaml(theObj, YAMLOptions.build().useVersion(true).useDouble(true).indent(4));


There are a few benefits to this syntax. The first is obvious: it takes a whole lot of options to span this over more than one line, and if you want that many, maybe you should save the options object as a constant somewhere? Another great benefit is type safety. This isn't really an issue in Ruby, since there are no casts, but in Java I like the guarantee that the indent option is an integer. I can save the defaults inside YAMLOptions, and if I define an interface for YAMLOptions, someone can implement it and provide more options if needed. For example, right now I'm sure that JRubyYAMLOptions will be implemented; probably as a subclass of YAMLOptions, with some new options that only the JRuby-specific representers will use. Oh, how I would love to be able to specify that static method build() with an interface...
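A minimal sketch of what such an options class could look like follows. The syntax isn't settled, so treat everything here as an assumption: the default values, the getter names and the protected constructor are invented for illustration, and only build(), useVersion, useDouble and indent come from the call shown earlier.

```java
class YAMLOptions {
    // Defaults live inside the class (values here are hypothetical).
    private boolean useVersion = false;
    private boolean useDouble = false;
    private int indent = 2;

    // Protected so that a subclass such as JRubyYAMLOptions could extend it.
    protected YAMLOptions() {
    }

    // Factory method: start from the defaults.
    static YAMLOptions build() {
        return new YAMLOptions();
    }

    // Each setter returns this, so calls can be chained on one line.
    YAMLOptions useVersion(boolean value) {
        this.useVersion = value;
        return this;
    }

    YAMLOptions useDouble(boolean value) {
        this.useDouble = value;
        return this;
    }

    // The compiler guarantees that indent is an int; no casts are needed.
    YAMLOptions indent(int value) {
        this.indent = value;
        return this;
    }

    // Hypothetical getters for the emitter to read the settings.
    boolean isUseVersion() { return useVersion; }
    boolean isUseDouble() { return useDouble; }
    int getIndent() { return indent; }
}
```

With this in place, `YAMLOptions.build().useVersion(true).indent(4)` yields an object whose indent is 4 and whose useDouble keeps its default. One wrinkle for subclasses: the inherited setters return YAMLOptions, so a chain that mixes in subclass-only options would have to call those first or cast.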
