Friday, October 27, 2006

Why Open Source?

Sometimes, when discussing my interest in and work with JRuby and other Open Source projects, people get all upset. It's really hard to justify the benefits of Open Source to a non-programmer, but sometimes it can be just as hard to explain them to corporate programmers. "What? You give away code for free, for anyone to use?" Not only that, I also do it mostly in my spare time.

I'm not planning for this blog entry to be some kind of manifesto. I'll just detail the (very logical) reasons why I do what I do.

At work
I work at Karolinska Institutet (KI) in Sweden. KI is a university, and like all universities in Sweden, we are part of the government. That means that most of our funding is tax-based. We use a lot of Open Source in our work, providing services to the campus. We run mostly on Linux servers, and we use Open Source frameworks when building internal systems. Needless to say, we have saved uncountable millions with this strategy. The Swedish government also did a review a few years ago that recommended that all tax-funded organizations should use Open Source software if possible. From this perspective, it is very rational to try to give something back. We mostly do this by trying to release everything we develop internally that can easily be repackaged for distribution. We also allocate some time for all our developers to work on Open Source projects of their choice. Unsurprisingly, I mostly use this time to work on JRuby.

In my spare time
Rationalizing Open Source from a corporate perspective is quite easy. It's harder to answer why I also do it in my spare time. I spend on average about two to three hours a day on Open Source, with about 80-85% of that on JRuby. So why?

First of all, I recognize that Open Source is important in itself. I firmly believe that the current market model for software will soon disappear. It is obvious that the classical, shrink-wrapped model doesn't really work. I want that change to come faster, and contributing to Open Source projects helps make it happen.

Secondly, JRuby is important. Ruby is great, but it has its shortcomings. I believe that JRuby will be the necessary bridge between the two camps that Java and Ruby seem to be moving towards. JRuby will bridge the gap and give crucial capabilities to both platforms, and that can't happen soon enough.

Third, I believe it's important to hone my skills. Doing software development means always having to become better and better, both in your own areas and in new ones. This is hard to do in the confines of regular corporate culture. Open Source is practice for me. It makes me better and I learn crucial new tools and techniques. Reading other people's code is an excellent way to learn, and that's easy with Open Source.

Fourth, and most importantly: I'm a coder. This is my passion. Code is art and coding is what I really like doing. I have code in my head always, 24 hours a day. I dream about code. I design software more or less irrespective of what my frontal lobes are engaged in. More and more, I get the feeling that my subconscious uses programming languages to get around Sapir-Whorf, helping me think from other angles.

Charles wrote about this a few months ago, here. He describes mostly what I feel about coding.

Of course I have other interests. I'm very musical, creating and listening to music constantly. I'm very fond of Go. I read incredibly much (too much if you ask some). But coding is what I like doing best, and when you get down to it, that's why I do Open Source.

Speaking engagements

This Monday (October 30th) I'll go to Malmö to present on JRuby at JavaForum, as I posted a few weeks ago. After everything is over I'll have an hour or two to spend before my train leaves, so if anyone is up for a beer I'm game.

I will also present at JavaForum in Stockholm on November 21st. The presentation will be mostly the same, apart from any interesting developments in the meantime. On the same night Rob Harrop of Interface21 will present on Spring, AOP and ActiveJ. That promises to be very interesting, so I recommend that everyone in the region get there. =)

More information at http://javaforum.se.

Wednesday, October 25, 2006

OpenSSL status report

I just checked in a few updates to my openssl branch for JRuby. Boy is it tricky getting everything right. It seems like every DER format Java crypto emits differs from the OpenSSL DER output. And it's really incompatible. As an example I have been forced to reimplement the DER dumping for X509 certificates myself, and that's not the only place.

But the work is actually going forward, as fast as I can make it when I'm only doing this in my spare time and my regular work takes lots of time right now. I can't say for sure when it will be finished or usable, but I know for a fact that most of the MRI tests run now. What's missing is PKCS#7, X509 CRLs and X509 cert-stores, plus the regular SSL socket support. Not much, compared to what actually works.

But that leads me to two issues. We have recently agreed that OpenSSL support will require BouncyCastle and Java 5. There is really no other way to get this working. Java 1.4.2 is fine for basic Digest support and some of the RSA/DSA support, but Java is sorely lacking in the ASN.1 and X509 department. Nothing whatsoever. Which is why we need BouncyCastle, which is fairly complete. I have only been forced to reimplement one or two central classes. Quite good.

But SSL support is another story. As you may know, 1.4.2 has SSLSocket and SSLServerSocket. The problem is this: they aren't any good. For a start, they are blocking, and there isn't any support in 1.4.2 for NIO SSL sockets. Whoopsie. Which explains the requirement on Java 5. Tiger adds the SSLEngine class, which can be used to implement NIO SSL, with the caveat that it adds complexity. I have only taken a cursory look at this so far. Right now I want the other stuff working first, since there are so many dependencies on it.
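To give a feel for where SSLEngine starts, here is a minimal sketch that only creates a client-mode engine and sizes the buffers it works with. It is not code from the openssl branch; the class name SslEngineSketch and the method newClientEngine are made up for illustration, and the handshake loop, which is where the real complexity sits, is left out.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLSession;

class SslEngineSketch {
    /** Creates a client-mode SSLEngine from a context with default key and trust managers. */
    static SSLEngine newClientEngine() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, null, null); // nulls mean: use the platform defaults
            SSLEngine engine = context.createSSLEngine();
            engine.setUseClientMode(true);
            return engine;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("TLS is not available", e);
        }
    }

    public static void main(String[] args) {
        SSLEngine engine = newClientEngine();
        SSLSession session = engine.getSession();
        // The engine never touches a socket itself. The caller moves bytes between
        // buffers like these and a non-blocking channel using wrap() and unwrap().
        ByteBuffer netOut = ByteBuffer.allocate(session.getPacketBufferSize());
        ByteBuffer appIn = ByteBuffer.allocate(session.getApplicationBufferSize());
        System.out.println("handshake status: " + engine.getHandshakeStatus());
        System.out.println("buffer sizes: " + netOut.capacity() + " / " + appIn.capacity());
    }
}
```

Because the engine only transforms buffers, all of the channel handling, including reacting to each handshake status, falls on the caller, and that is the added complexity mentioned above.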

But it's really going forward. Now, if I only had this as my day job, this would be finished in a few days... Alas, that's not the way it is. Expect further updates in a week or two.

Saturday, October 21, 2006

Meta-meta-information?

It seems that the current trend in Internet services is all about meta information: ways to find the information you want. These services come in various guises: some collect pictures or movies, others collect and classify links or find out what other people are looking at. The quintessential application right now must be the search engine, and Google is at the forefront of this, as well as of many other current meta information processing applications. But what is the next level? The next evolutionary step in information processing?

Is meta-meta-information enough?
The obvious answer would be that we need meta information about the meta services; a way to search, handle and collect this information. Which meta services are the people I admire using? But is that really enough? It places a big burden on the individual trying to keep up with the information flow. Tagging and such helps the situation a bit, but it's still way too much for one person. I already feel slightly overwhelmed trying to keep up with everything that happens in my current domains of interest. It seems to me that meta services for finding meta information wouldn't be enough, since they don't solve the underlying problem of actually helping to manage the information.

Are there any alternatives?

The development in services from this point on seems to revolve around two solutions; the semantic web, RDF and all those shenanigans, and intelligent agents.

The semantic web is a collection of technologies - some that exist and some that don't, yet - that enables more collaboration and interaction over the Internet. Typical examples include ontologies for classifying information, two-ended links, user annotations and other ways to include self-description in the data itself.

Intelligent agents are probably the easiest AI application to rationalize, since they don't require strong AI, and it's obvious how helpful they would be. Intelligent agents are typically put in the same box as the semantic web, but I feel that intelligent agents (IA from now on) could probably exist and be effective without semantic web metadata. What makes IA more or less necessary very soon is that automating the handling of information is the solution to the meta-meta-problem. Most of my current meta information handling is done with the help of programs that I have customized in various ways to make information easier for me to handle, but manually customizing the handling of meta-meta-information suffers from the same problem as manually tagging spam: it is just too much information. IA could help, by first getting to know the habits and interests of its user, and then extrapolating what information would be useful and what could be thrown away.

Finalities
Is the semantic web or IA enough in its own right? Could the one solve the problem without the other? I think not, or at least not solve it permanently. I believe semantic web tech will reach the Internet before IA, but I don't believe the information flow problem will be solved without agents. At most, the semantic web will buy us one or two years to catch our breath. IA is interesting technology, and we really need it now, especially if you are a knowledge worker.

JRuby 0.9.1 released

We in the JRuby team are proud to announce the release of JRuby 0.9.1.

Download: http://dist.codehaus.org/jruby/.

This release contains numerous new features, fixes and improvements:
  • Overall performance is 50-60% faster than JRuby 0.9.0
  • Improved Rails support
  • New syntax for including Java classes into Ruby
  • New interpreter design
  • Refactoring of method dispatch, code evaluation, and block dispatch code
  • Parser performance enhancement
  • Rewriting of Enumerable, StringScanner and StringIO in Java
  • New experimental syntax for implementing interfaces
  • 86 Jira issues resolved since 0.9.0
These past months have been great for JRuby, and I know that it will get even better from now on. My personal goal for 0.9.2 is to have complete Java YAML support in, and a working OpenSSL library. Obviously, all bugs should be fixed too... =)

Thursday, October 19, 2006

The JRuby Tutorial #4: Writing Java extensions for JRuby

There are many reasons to write a Java extension for JRuby. Maybe your favorite Ruby library hasn't been ported to JRuby yet, or you want to directly interface with some Java code without going through JRuby's Java interface. Maybe you need the speed from doing calculations in Java, or you just want to add missing functionality. Whatever the reason, writing extensions for JRuby can be tricky if you don't know how the internals of JRuby work. The purpose of this tutorial is to show how to build a simple extension that exercises many parts of the Ruby language, and how to implement it in Java.

The example will be a module called Sequence with one class inside it called Sequence. Whenever I create something as a Java extension, I usually write functional Ruby code for doing it first, to get the structure of the code straight in my head. So, without further ado, here is the Sequence module:
    module Sequence
      def self.fibonacci(to=20)
        Sequence.new(1,1,1..to)
      end

      def self.lucas(to=20)
        Sequence.new(1,3,1..to)
      end

      class Sequence
        include Enumerable
        attr_reader :n1,:n2,:range
        def initialize(n1,n2,range)
          @n1, @n2, @range = n1,n2,range
          regenerate
        end
        # Writers n1=, n2= and range=: store the new value and recompute
        %w(n1 n2 range).each do |n|
          define_method("#{n}=") do |v|
            instance_variable_set("@#{n}",v)
            regenerate
          end
        end
        def regenerate
          @value = []
          v1, v2 = @n1, @n2
          @value << v1 if @range === 1
          @value << v2 if @range === 2
          3.upto(@range.last) do |i|
            v1, v2 = v2, v1+v2
            @value << v2 if @range === i
          end
          nil
        end
        def [](ix)
          @range = ix..(@range.last) if ix < @range.first
          @range = (@range.first)..(ix+1) if ix > @range.last
          regenerate
          @value[ix-@range.first]
        end
        def each(&b)
          @value.each(&b)
        end
        def to_a
          @value
        end
        def to_s
          @value.to_s
        end
        def inspect
          "#<Sequence::Sequence n1=#@n1 n2=#@n2 range=#@range value=#{@value.inspect}>"
        end
      end
    end
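Before porting, it can help to pin down what regenerate actually computes. The following is a plain Java sketch of the same loop with no JRuby classes involved; the class name SequenceSketch and the method generate are hypothetical, and the range is passed as two 1-based positions instead of a Ruby Range.

```java
import java.util.ArrayList;
import java.util.List;

class SequenceSketch {
    /**
     * Returns the terms of the sequence seeded with n1 and n2 whose 1-based
     * position lies in [first, last], mirroring regenerate in the Ruby code.
     */
    static List<Long> generate(long n1, long n2, int first, int last) {
        List<Long> values = new ArrayList<>();
        long v1 = n1, v2 = n2;
        if (first <= 1 && 1 <= last) values.add(v1);
        if (first <= 2 && 2 <= last) values.add(v2);
        for (int i = 3; i <= last; i++) {
            long next = v1 + v2;
            v1 = v2;
            v2 = next;
            if (i >= first) values.add(v2);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(generate(1, 1, 1, 10)); // [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
        System.out.println(generate(1, 3, 1, 10)); // [1, 3, 4, 7, 11, 18, 29, 47, 76, 123]
    }
}
```

Having known values like these at hand gives something concrete to compare the extension's output against later.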

Interfacing with the JRuby runtime

There are a few different ways to write extensions for JRuby. The difference isn't big from a functional viewpoint, but there is a definite gap in usability. I call the two major ways to implement an extension the MetaClass way and the MRI way. The MetaClass way subclasses the Java class that represents a Ruby class, called RubyClass, and implements some meta information methods and classes. The MRI way, in contrast, just creates the Ruby class in code, and adds methods to it in some static initializer. This tutorial will use the MRI way for two reasons: first, it's easier and doesn't require so many files and classes, and second, when porting MRI C extensions, the MetaClass way doesn't map very well to how MRI does things.

Project setup

To make the extension building as simple as possible, it helps to follow a few conventions. First of all, I'm going to call the extension "fib". I want my potential users to be able to require 'fib' and get all the good Sequence-functionality. To achieve this there are two things to keep in mind. First, the jar-file should be called fib.jar and put somewhere in JRuby's load path. Secondly, there should be a class called FibService that implements the BasicLibraryService interface. For our purposes, FibService.java will contain all functionality, but in a realistic situation it makes sense to extract the functionality and let the library loader just set up the environment. The skeleton for my FibService.java will look like this:
    import java.io.IOException;

    import org.jruby.IRuby;

    import org.jruby.runtime.load.BasicLibraryService;

    public class FibService implements BasicLibraryService {
        public boolean basicLoad(IRuby runtime) throws IOException {
            return true;
        }
    }

At this point the only imports needed are for IRuby, which is the main interface for the JRuby runtime, and the BasicLibraryService which provides the basicLoad method. The return value specifies if the service was loaded correctly or not.

Basic structure

I will start by adding the basic structure for our code; the Sequence module and class:
    import java.io.IOException;

    import org.jruby.IRuby;
    import org.jruby.RubyClass;
    import org.jruby.RubyModule;

    import org.jruby.runtime.builtin.IRubyObject;

    import org.jruby.runtime.load.BasicLibraryService;

    public class FibService implements BasicLibraryService {
        public boolean basicLoad(IRuby runtime) throws IOException {
            RubyModule mSequence = runtime.defineModule("Sequence");
            RubyClass cSequence = mSequence.defineClassUnder("Sequence",runtime.getObject());
            cSequence.includeModule(runtime.getModule("Enumerable"));
            cSequence.attr_reader(new IRubyObject[]{runtime.newSymbol("n1"),
                                                    runtime.newSymbol("n2"),
                                                    runtime.newSymbol("range")});
            return true;
        }
    }
What this code does is establish the Sequence module at the top level, and then define the Sequence class inside this module. We need to specify a super class for it, and this is what the runtime.getObject()-call is about. Basically it's a shortcut for writing runtime.getClass("Object"). After we have defined the class, we make it include Enumerable, and then create attribute readers for the three instance variables. Despite the name, newSymbol doesn't necessarily create a new symbol; it returns an existing one if there is one.

Singleton methods

We're going to create the singleton factory methods before actually creating the implementation for the class. The new class looks like this:
    import java.io.IOException;

    import org.jruby.IRuby;
    import org.jruby.RubyClass;
    import org.jruby.RubyFixnum;
    import org.jruby.RubyModule;
    import org.jruby.RubyNumeric;

    import org.jruby.runtime.CallbackFactory;
    import org.jruby.runtime.builtin.IRubyObject;
    import org.jruby.runtime.load.BasicLibraryService;

    public class FibService implements BasicLibraryService {
        public boolean basicLoad(IRuby runtime) throws IOException {
            RubyModule mSequence = runtime.defineModule("Sequence");
            RubyClass cSequence = mSequence.defineClassUnder("Sequence",runtime.getObject());
            cSequence.includeModule(runtime.getModule("Enumerable"));
            cSequence.attr_reader(new IRubyObject[]{runtime.newSymbol("n1"),
                                                    runtime.newSymbol("n2"),
                                                    runtime.newSymbol("range")});

            CallbackFactory fibService_cb = runtime.callbackFactory(FibService.class);
            mSequence.defineSingletonMethod("fibonacci",fibService_cb.getOptSingletonMethod("fibonacci"));
            mSequence.defineSingletonMethod("lucas",fibService_cb.getOptSingletonMethod("lucas"));

            return true;
        }

        private static IRubyObject seq(int a1, int a2, RubyModule module, IRubyObject[] args) {
            IRuby runtime = module.getRuntime();
            int to = 20;
            if(module.checkArgumentCount(args,0,1) == 1) {
                to = RubyNumeric.fix2int(args[0]);
            }
            IRubyObject[] seqArgs = new IRubyObject[3];
            seqArgs[0] = runtime.newFixnum(a1);
            seqArgs[1] = runtime.newFixnum(a2);
            seqArgs[2] = runtime.getClass("Range").callMethod("new",
                new IRubyObject[]{RubyFixnum.one(runtime),runtime.newFixnum(to)});
            return module.getClass("Sequence").callMethod("new",seqArgs);
        }

        public static IRubyObject fibonacci(IRubyObject recv, IRubyObject[] args) {
            return seq(1,1,(RubyModule)recv,args);
        }

        public static IRubyObject lucas(IRubyObject recv, IRubyObject[] args) {
            return seq(1,3,(RubyModule)recv,args);
        }
    }
This code contains a number of new things. First of all, our singleton methods need implementations. Since we don't need any data associated with these methods, static Java methods suffice for the implementation. A CallbackFactory is used to get a reflection handle on the methods. I use the method call getOptSingletonMethod on the CallbackFactory; this is because the one parameter to the two methods is optional, so the callback factory will look for a static method with the signature IRubyObject name(IRubyObject, IRubyObject[]). We'll later see how we can specify explicit types for method arguments. The recv argument is a specialty for static methods. Usually when working with Ruby instances from Java code, you will have a handle to the runtime implicit in the self, but this isn't possible for static methods. The recv parameter is the instance of RubyModule/RubyClass that the method is called on. In our case this is a handy way of getting hold of the Sequence-module.
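For readers who haven't used reflection much, this is roughly what "a reflection handle on a static method with a fixed signature" means in plain Java. It is only a sketch with java.lang.reflect and says nothing about how CallbackFactory is implemented internally; the class ReflectedCallbackSketch, its methods and the use of Object instead of IRubyObject are all stand-ins.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class ReflectedCallbackSketch {
    /** A static method with the "receiver plus argument array" shape. */
    public static Object greet(Object recv, Object[] args) {
        String who = args.length == 1 ? String.valueOf(args[0]) : "world";
        return recv + " says hello to " + who;
    }

    /** Finds a public static method called name that takes (Object, Object[]). */
    static Method lookupOptional(Class<?> owner, String name) {
        try {
            Method m = owner.getMethod(name, Object.class, Object[].class);
            if (!Modifier.isStatic(m.getModifiers())) {
                throw new IllegalArgumentException(name + " is not static");
            }
            return m;
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such callback: " + name, e);
        }
    }

    /** Invokes the reflected method; the receiver is passed explicitly as data. */
    static Object call(Method m, Object recv, Object... args) {
        try {
            return m.invoke(null, recv, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this in place, call(lookupOptional(ReflectedCallbackSketch.class, "greet"), "Sequence") runs greet with an empty argument array, which is the same optional-argument situation fibonacci and lucas are in.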

All IRubyObjects have checkArgumentCount, which is a simple utility method for methods with optional arguments. Basically, it takes an array plus the minimum and maximum argument count, and throws a Ruby exception if the count isn't within those bounds. It also returns the actual argument count (which is the same as args.length right now). Note, if porting C Ruby code, that the two numeric parameters to checkArgumentCount are NOT the same as those of rb_scan_args, where for example "12" means one required and two optional parameters. The equivalent with checkArgumentCount would be checkArgumentCount(args,1,3).
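The difference between the two conventions is easy to get wrong, so here is a small plain Java sketch of it. Neither method is part of JRuby: checkCount only imitates the described behaviour (with IllegalArgumentException standing in for the Ruby exception), and fromScanArgs is a hypothetical helper that converts rb_scan_args-style counts into a minimum and maximum.

```java
class ArgumentCountSketch {
    /**
     * Returns args.length if it lies in [min, max]; otherwise throws.
     * IllegalArgumentException stands in for Ruby's ArgumentError here.
     */
    static int checkCount(Object[] args, int min, int max) {
        if (args.length < min || args.length > max) {
            throw new IllegalArgumentException(
                "wrong number of arguments (" + args.length + " for " + min + ".." + max + ")");
        }
        return args.length;
    }

    /** Converts rb_scan_args-style counts (required, optional) to {min, max}. */
    static int[] fromScanArgs(int required, int optional) {
        return new int[] { required, required + optional };
    }
}
```

So the "12" from the example becomes fromScanArgs(1, 2), which yields a minimum of 1 and a maximum of 3.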

RubyNumeric has a few utility methods, of which fix2int is one of the more useful. It basically allows us to translate a Ruby integer into the corresponding Java type.

The most common types have shortcut creation methods in IRuby, and newFixnum is one of these. To create a new Range we have to get a reference to the class and call new on it, though.

The Sequence class

Here comes the meat of it all. This is the final version of the Java source:
    import java.io.IOException;

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Iterator;

    import org.jruby.IRuby;
    import org.jruby.RubyArray;
    import org.jruby.RubyClass;
    import org.jruby.RubyFixnum;
    import org.jruby.RubyModule;
    import org.jruby.RubyNumeric;
    import org.jruby.RubyObject;
    import org.jruby.RubyRange;

    import org.jruby.runtime.CallbackFactory;
    import org.jruby.runtime.builtin.IRubyObject;
    import org.jruby.runtime.load.BasicLibraryService;

    public class FibService implements BasicLibraryService {
        public boolean basicLoad(IRuby runtime) throws IOException {
            RubyModule mSequence = runtime.defineModule("Sequence");
            RubyClass cSequence = mSequence.defineClassUnder("Sequence",runtime.getObject());
            cSequence.includeModule(runtime.getModule("Enumerable"));
            cSequence.attr_reader(new IRubyObject[]{runtime.newSymbol("n1"),
                                                    runtime.newSymbol("n2"),
                                                    runtime.newSymbol("range")});

            CallbackFactory fibService_cb = runtime.callbackFactory(FibService.class);
            mSequence.defineSingletonMethod("fibonacci",fibService_cb.getOptSingletonMethod("fibonacci"));
            mSequence.defineSingletonMethod("lucas",fibService_cb.getOptSingletonMethod("lucas"));

            CallbackFactory seq_cb = runtime.callbackFactory(Sequence.class);
            cSequence.defineSingletonMethod("new",seq_cb.getOptSingletonMethod("newInstance"));
            cSequence.defineMethod("initialize",seq_cb.getMethod("initialize",RubyFixnum.class,RubyFixnum.class,RubyRange.class));
            cSequence.defineMethod("n1=",seq_cb.getMethod("set_n1",RubyFixnum.class));
            cSequence.defineMethod("n2=",seq_cb.getMethod("set_n2",RubyFixnum.class));
            cSequence.defineMethod("range=",seq_cb.getMethod("set_range",RubyRange.class));
            cSequence.defineMethod("[]",seq_cb.getMethod("arr_ix",RubyFixnum.class));
            cSequence.defineMethod("each",seq_cb.getMethod("each"));
            cSequence.defineMethod("to_a",seq_cb.getMethod("to_a"));
            cSequence.defineMethod("to_s",seq_cb.getMethod("to_s"));
            cSequence.defineMethod("inspect",seq_cb.getMethod("inspect"));

            return true;
        }

        private static IRubyObject seq(int a1, int a2, RubyModule module, IRubyObject[] args) {
            IRuby runtime = module.getRuntime();
            int to = 20;
            if(module.checkArgumentCount(args,0,1) == 1) {
                to = RubyNumeric.fix2int(args[0]);
            }
            IRubyObject[] seqArgs = new IRubyObject[3];
            seqArgs[0] = runtime.newFixnum(a1);
            seqArgs[1] = runtime.newFixnum(a2);
            seqArgs[2] = runtime.getClass("Range").callMethod("new",
                new IRubyObject[]{RubyFixnum.one(runtime),runtime.newFixnum(to)});
            return module.getClass("Sequence").callMethod("new",seqArgs);
        }

        public static IRubyObject fibonacci(IRubyObject recv, IRubyObject[] args) {
            return seq(1,1,(RubyModule)recv,args);
        }

        public static IRubyObject lucas(IRubyObject recv, IRubyObject[] args) {
            return seq(1,3,(RubyModule)recv,args);
        }

        public static class Sequence extends RubyObject {
            public static IRubyObject newInstance(IRubyObject recv, IRubyObject[] args) {
                Sequence result = new Sequence(recv.getRuntime(), (RubyClass)recv);
                result.callInit(args);
                return result;
            }

            public Sequence(IRuby runtime, RubyClass type) {
                super(runtime,type);
            }

            public IRubyObject initialize(RubyFixnum n1, RubyFixnum n2, RubyRange range) {
                setInstanceVariable("@n1",n1);
                setInstanceVariable("@n2",n2);
                setInstanceVariable("@range",range);
                regenerate();
                return this;
            }

            public IRubyObject set_n1(RubyFixnum n1) {
                setInstanceVariable("@n1",n1);
                regenerate();
                return n1;
            }

            public IRubyObject set_n2(RubyFixnum n2) {
                setInstanceVariable("@n2",n2);
                regenerate();
                return n2;
            }

            public IRubyObject set_range(RubyRange range) {
                setInstanceVariable("@range",range);
                regenerate();
                return range;
            }

            private void regenerate() {
                List v = new ArrayList();
                int v1 = RubyNumeric.fix2int(getInstanceVariable("@n1"));
                int v2 = RubyNumeric.fix2int(getInstanceVariable("@n2"));
                IRubyObject r = getInstanceVariable("@range");
                if(r.callMethod("===",getRuntime().newFixnum(1)).isTrue()) {
                    v.add(getRuntime().newFixnum(v1));
                }
                if(r.callMethod("===",getRuntime().newFixnum(2)).isTrue()) {
                    v.add(getRuntime().newFixnum(v2));
                }
                int l = RubyNumeric.fix2int(r.callMethod("last"));
                for(int i=3;i<=l;i++) {
                    int tmp = v1;
                    v1 = v2;
                    v2 = tmp + v1;
                    if(r.callMethod("===",getRuntime().newFixnum(i)).isTrue()) {
                        v.add(getRuntime().newFixnum(v2));
                    }
                }
                setInstanceVariable("@value",getRuntime().newArray(v));
            }

            public IRubyObject arr_ix(RubyFixnum ix) {
                int index = RubyNumeric.fix2int(ix);
                if(index < RubyNumeric.fix2int(getInstanceVariable("@range").callMethod("first"))) {
                    setInstanceVariable("@range",getRuntime().getClass("Range").callMethod("new",
                        new IRubyObject[]{ix,getInstanceVariable("@range").callMethod("last")}));
                }
                if(index > RubyNumeric.fix2int(getInstanceVariable("@range").callMethod("last"))) {
                    setInstanceVariable("@range",getRuntime().getClass("Range").callMethod("new",
                        new IRubyObject[]{getInstanceVariable("@range").callMethod("first"), getRuntime().newFixnum(index+1)}));
                }
                regenerate();
                return getInstanceVariable("@value").callMethod("[]",
                    getRuntime().newFixnum(index -
                        RubyNumeric.fix2int(getInstanceVariable("@range").callMethod("first"))));
            }

            public IRubyObject each() {
                Iterator iter = ((RubyArray)getInstanceVariable("@value")).getList().iterator();
                while(iter.hasNext()) {
                    getRuntime().getCurrentContext().yield((IRubyObject)iter.next());
                }
                return getRuntime().getNil();
            }

            public IRubyObject to_a() {
                return getInstanceVariable("@value");
            }

            public IRubyObject to_s() {
                return getInstanceVariable("@value").callMethod("to_s");
            }

            public IRubyObject inspect() {
                StringBuffer sb = new StringBuffer("#<Sequence::Sequence n1=");
                sb.append(getInstanceVariable("@n1").toString());
                sb.append(" n2=");
                sb.append(getInstanceVariable("@n2").toString());
                sb.append(" range=");
                sb.append(getInstanceVariable("@range").toString());
                sb.append(" value=");
                sb.append(getInstanceVariable("@value").callMethod("inspect").toString());
                sb.append(">");
                return getRuntime().newString(sb.toString());
            }
        }
    }

Compiling this and placing it in fib.jar on your load path will allow JRuby to use the code as if it was Ruby. Try it out.

Now, let's take the code in pieces. First of all, the initialization code defines the methods available and gives them a reflected implementation through CallbackFactory. We create a static inner class to hold the actual implementation of the class. This isn't strictly necessary in this case, since we haven't associated any external state with the object, but it makes for cleaner separation and easier-to-understand code. Note that we need to have our own new-implementation. This is one of the drawbacks with the MRI technique. When using MetaClasses you can define an allocateObject-method that automatically gets used by the runtime. Most of CallbackFactory's different getMethod-variants are used. This shows how to declare a fixed number of arguments with specific classes.

The initialize method just sets the instance variables and then calls the method regenerate. Note that this isn't a Ruby method anymore. I didn't feel it was necessary to expose it, and using Java call semantics makes this slightly more efficient. Apart from that, there is nothing really strange in this code. I use the fact that you can create a new Ruby array from a list to make the regeneration of @value easier. But in most cases this is purely Ruby translated to JRuby code. The only point where something strange is happening is in fact in the each-method. Handling blocks with JRuby in Java isn't always practical, so I tend to find it easier to refactor the Ruby code into something that calls yield explicitly, by itself.

Conclusion

Implementing a Java extension for JRuby can be tricky, but the hard part is mostly knowing what services are available where. With the JRuby source code available it's easy to get a peek into the internals and find out more about those things that are problematic. Taking a look at how the core classes are implemented often gives some hints on how to continue, too. For example, RubyZlib, RubyYAML, RubyOpenSSL, RubyStringIO and RubyEnumerable are all mostly written in this style, and there are various examples of the different styles available.

If you need the speed or if it's more practical to implement the functionality in Java, I would say that writing an extension is fairly easy once you get started. The important thing to remember is to be sure what the interface should be, and implement everything else outside of JRuby, demarcating the interface from the implementation.

Friday, October 13, 2006

ResourceBundle pain.

I just found out something very disturbing and annoying. Java Property-files (and, by extension, ResourceBundles) will only load files in ISO-8859-1 format. Of course, you can use Unicode escape codes in the files, but that is not very convenient. Java is, overall, really good at internationalization, but this is one sore spot. And it doesn't seem to go away in Java 6 either. Isn't it highly ironic that ResourceBundles can't use regular UTF-8 files?
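The effect is easy to reproduce with java.util.Properties alone. The sketch below loads the same key twice, once written with a Unicode escape and once as raw UTF-8 bytes; the class name PropertiesEncodingSketch and the key city are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class PropertiesEncodingSketch {
    /** Loads raw file bytes the way Properties.load(InputStream) does and returns one value. */
    static String load(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // Escaped form: pure ASCII in the file, decoded to an o-umlaut on load.
        byte[] escaped = "city=Malm\\u00f6\n".getBytes(StandardCharsets.ISO_8859_1);
        // Raw UTF-8 form: the two bytes of the o-umlaut are read as two Latin-1 characters.
        byte[] utf8 = "city=Malm\u00f6\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(load(escaped, "city").length()); // 5
        System.out.println(load(utf8, "city").length());    // 6
    }
}
```

The escaped file round-trips correctly, while the UTF-8 file silently produces a six-character string with two wrong characters at the end, which is exactly the kind of bug that is annoying to track down.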

Tuesday, October 10, 2006

OpenSSL in JRuby

This weekend I started work on OpenSSL in JRuby. It's a pretty big undertaking, and it's actually worse than I suspected from the beginning, so I'm going to tell you a bit about my endeavours and how far I've gotten right now. First of all, if someone is interested in the work, I have set up a branch for this, since it will take time and be very big. The branch is called svn://svn.codehaus.org/jruby/branches/openssl.

The approach
I have investigated several variants of implementing OpenSSL in JRuby. One that seemed easy was to find a JNI-library that wraps OpenSSL in Java. But it seems there are no full scale versions implemented anywhere. I also checked around other script-on-JVM-languages to see how they had solved it, but I didn't find anything close to OpenSSL functionality. Which left me with the approach I've decided to try out: implement Java code that is as OpenSSL-compatible as possible, using JCE and JSSE. I'm convinced this is doable, but right now it feels quite hard.

The progress
So, how far have I gotten these last days? Pretty far, but not far enough. I'm basing the work on MRI's test suite for OpenSSL. Of those tests, I have test_digest.rb and test_cipher.rb passing completely. This doesn't sound like much, but especially test_cipher was a pain to get running.

The plan from here on is to get the utils.rb-file to load, which means implementing OpenSSL::PKey::RSA and OpenSSL::PKey::DSA, getting the basics of OpenSSL::X509 in place and also finding a way to fix OpenSSL::ASN1. Oh well, I've got loads of time for this. Or not. =)

The problem
The real problem when implementing this is the fact that Ruby's OpenSSL support is... Well, how shall I put it? Thin, you might say. It's basically a wrapper around the C-library, which means that the disconnect when implementing this functionality with JCE is quite large. Just translating OpenSSL cipher names to the JCE equivalent is a challenge. But the big problem with the ciphers was initializing the key and IV (initialization vector). I have tried all the PBE solutions available, including the versions in BouncyCastle ending with "-OPENSSL". No luck.

The problem is that Ruby uses the function called EVP_BytesToKey, which, according to the documentation, implements PKCS#5 1.5 with a few tricks up its sleeve. Not very nice. In the end I had to implement my own version of this to generate keys. And since I had to look like mad for this information, I will give you the implementation of this function in Java here. Just use the return value to initialize your own SecretKey-implementation and instantiate an IvParameterSpec and you should be set to go. (Note: I release this into the public domain. And note that this is just quick, ported code to show the concept.)
    // Returns { key, iv }, derived from data (the password) and an optional salt.
    public byte[][] EVP_BytesToKey(int key_len, int iv_len, MessageDigest md, byte[] salt, byte[] data, int count) {
        byte[][] both = new byte[2][];
        byte[] key = new byte[key_len];
        int key_ix = 0;
        byte[] iv = new byte[iv_len];
        int iv_ix = 0;
        both[0] = key;
        both[1] = iv;
        byte[] md_buf = null;
        int nkey = key_len;
        int niv = iv_len;
        int i = 0;
        if(data == null) {
            return both;
        }
        int addmd = 0;
        for(;;) {
            // Each round hashes: previous digest (if any) + data + first 8 bytes of salt.
            md.reset();
            if(addmd++ > 0) {
                md.update(md_buf);
            }
            md.update(data);
            if(null != salt) {
                md.update(salt,0,8);
            }
            md_buf = md.digest();
            // Re-hash the digest count-1 more times.
            for(i=1;i<count;i++) {
                md.reset();
                md.update(md_buf);
                md_buf = md.digest();
            }
            i=0;
            // Fill the key first...
            if(nkey > 0) {
                for(;;) {
                    if(nkey == 0) break;
                    if(i == md_buf.length) break;
                    key[key_ix++] = md_buf[i];
                    nkey--;
                    i++;
                }
            }
            // ...then put whatever is left of this digest into the IV.
            if(niv > 0 && i != md_buf.length) {
                for(;;) {
                    if(niv == 0) break;
                    if(i == md_buf.length) break;
                    iv[iv_ix++] = md_buf[i];
                    niv--;
                    i++;
                }
            }
            if(nkey == 0 && niv == 0) {
                break;
            }
        }
        // Wipe the last digest so no key material lingers in the buffer.
        for(i=0;i<md_buf.length;i++) {
            md_buf[i] = 0;
        }
        return both;
    }
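
For cross-checking the output of the Java port, the same derivation can be sketched in plain Ruby. This is only a sketch under my reading of the Java code above: evp_bytes_to_key is a hypothetical helper name, and it assumes that concatenating the per-round digests and slicing off the key and then the IV is equivalent to the byte-by-byte copying loops.

```ruby
require 'digest'

# Hypothetical helper mirroring the Java port above. Each round hashes
# previous_digest + data + first 8 bytes of salt, then re-hashes the
# result count-1 times. The rounds are concatenated until there is
# enough material for the key followed by the IV.
def evp_bytes_to_key(key_len, iv_len, digest_class, salt, data, count = 1)
  material = ''.b # all digest output produced so far
  block = ''.b    # digest of the previous round (empty in round one)
  while material.bytesize < key_len + iv_len
    input = block + data.b
    input += salt.b[0, 8] if salt
    block = digest_class.digest(input)
    (count - 1).times { block = digest_class.digest(block) }
    material << block
  end
  [material[0, key_len], material[key_len, iv_len]]
end

key, iv = evp_bytes_to_key(16, 8, Digest::MD5, '12345678', 'secret')
puts key.unpack('H*').first
puts iv.unpack('H*').first
```

With MD5, a 16-byte key and count set to 1, the key is simply the MD5 digest of the password followed by the salt, which makes a handy first sanity check when comparing against the Java version.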

måndag, oktober 09, 2006

JRuby presentation in Malmö

It is now final that I will be presenting JRuby at JavaForum in Malmö on 30 October. There is more information available at www.javaforum.se.

lördag, oktober 07, 2006

Effectiveness of automated refactoring.

The last few weeks have seen some discussion regarding refactoring tools for dynamic languages. The basic questions are whether such tools are possible and, if so, how effective they would be. More information about the debate in question can be found here, here and here. I'm not really going into the fray here; I just wanted to provide my hypothesis on something tangential to the issue.

The interesting point is something Cedric said in his blog:
And without this, who wants an IDE that performs a correct refactoring "most of the time"?
The underlying assumption here is that there can actually exist a refactoring tool that works all the time. So, my question is this: "Can a refactoring tool be 100% effective?" (Here effectiveness is defined as completely fulfilling the refactoring's preconditions, postconditions and invariants without introducing dangers or errors in the code.)

My hypothesis (and there will be no rigorous proof of my position) is based on one axiom:
Automated refactoring is an instance of Turing's halting problem.
I have no direct proof for this position, but it seems intuitive to me that for a refactoring to always be completely correct, you would need to know things about the program that can't always be predicted from the code alone. In those cases test runs would be necessary, and that is where the halting problem enters. Now, for a strongly, statically typed language you would have to go to some effort to actually produce a program that couldn't be safely refactored, but the possibility is still there.

One commenter on one of the blog entries above said that a precondition for 100% refactoring of Java would be that you didn't use reflection and other meta-tricks. But the problem is, to avoid the halting problem you would have to remove enough features of Java to make it into a Turing-incomplete language. And by then it wouldn't be usable for general purpose programming.
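
To make the point concrete for a dynamic language, here is a minimal Ruby sketch. All the names in it (Account, RenamedAccount, read_field) are hypothetical and made up for illustration. The method to call is assembled from data at run time, so a "rename method" refactoring that only reads the source has no reliable way of knowing that this call site needs to change.

```ruby
class Account
  def balance
    100
  end
end

# What Account looks like after a tool renames balance to current_balance.
class RenamedAccount
  def current_balance
    100
  end
end

# The method name is built from run-time data. In a real program the
# parts could come from user input, a config file or the network.
def read_field(obj, name_parts)
  obj.public_send(name_parts.join)
end

puts read_field(Account.new, %w[bal ance])

begin
  read_field(RenamedAccount.new, %w[bal ance])
rescue NoMethodError => e
  puts "broken by the rename: #{e.class}"
end
```

The first call works; the second fails only when it is executed, which is exactly the kind of breakage a purely static tool cannot rule out in general.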

I see no way out of this dilemma, unless my axiom is wrong. But if it is correct, there can never ever be a 100% effective refactoring tool.

fredag, oktober 06, 2006

JRuby import

OK, for those of you who think that importing a class into the current namespace by assigning it to a constant is clumsy, here is a small implementation (based on include_class) that lets you use import the way you do in Java:
require 'java'

class Object
  def import(name)
    unless String === name
      name = name.java_class.inspect
    end
    class_name = name.match(/((.*)\.)?([^\.]*)/)[3]
    clazz = self.kind_of?(Module) ? self : self.class
    unless clazz.const_defined?(class_name)
      if (respond_to?(:class_eval, true))
        class_eval("#{class_name} = #{name}")
      else
        eval("#{class_name} = #{name}")
      end
    end
  end
end

import java.util.TreeMap

x = TreeMap.new
x['foo'] = 'bar'
puts x
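
The only slightly cryptic part of the import method is the regular expression that picks the simple class name out of the fully qualified one. The snippet below isolates it so it can be tried without JRuby; SIMPLE_NAME and simple_class_name are names I made up for this illustration.

```ruby
# Same pattern as in the import method above: an optional
# "package." prefix, followed by the part after the last dot.
SIMPLE_NAME = /((.*)\.)?([^\.]*)/

def simple_class_name(qualified_name)
  qualified_name.match(SIMPLE_NAME)[3]
end

puts simple_class_name('java.util.TreeMap') # prints TreeMap
puts simple_class_name('TreeMap')           # prints TreeMap
```

Because the prefix group is greedy, it swallows everything up to the last dot, so the third group always ends up holding just the class name.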

torsdag, oktober 05, 2006

JRuby progress

I thought that I should post a little notice about what's happening with JRuby right now. Most of this is available for you if you subscribe to FishEye for JRuby (http://fisheye.codehaus.org/changelog/~rss/jruby/rss.xml).

There are a few very nice innovations going on. The most important is probably that Java integration has been improved quite substantially. Basically, if the classes you need are in the java, javax, org or com packages, you can just refer to them the same way you refer to them in Java (except for classes that don't have a capital initial letter, but you don't do that, do you?). Typical JRuby code could look like this:
require 'java'

TreeMap = java.util.TreeMap
x = TreeMap.new(java.util.Comparator.impl { |m,o1,o2|
  o1 <=> o2
})
Now, there are two new features in here, and one idiom that should be used in JRuby code. The first feature is that you can refer to classes simply by name. The idiom is to import classes into the current namespace by assigning them to constants. The second interesting feature is the 'impl' method, which is available on all interfaces. This method creates an anonymous implementation of the interface, which calls the block whenever any of the interface's methods is called. The method name is passed in the parameter 'm' in the example. This allows a very clean syntax for implementing one-method interfaces, as in the example above. Before these were added, you had to use include_class for each class, then create a class for the interface, and then use this class. The code would easily have doubled in size for this example.
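
If the idea behind 'impl' feels a bit magic, the following plain Ruby sketch shows the general shape of it. This is not how JRuby implements it; BlockBackedProxy is a hypothetical class that just forwards every unknown method call to a block, with the method name as the first argument.

```ruby
# Hypothetical stand-in for the object that impl hands back.
class BlockBackedProxy
  def initialize(&handler)
    @handler = handler
  end

  # Any method not defined on the proxy ends up here and is passed
  # to the block as (method_name, *arguments).
  def method_missing(name, *args)
    @handler.call(name, *args)
  end
end

comparator = BlockBackedProxy.new { |m, o1, o2| o1 <=> o2 }
puts comparator.compare(1, 2) # prints -1

# With more than one method, the block can dispatch on the name.
listener = BlockBackedProxy.new { |m, *args| "#{m} called with #{args.inspect}" }
puts listener.opened(:door)
```

In the real thing the proxy also has to implement the Java interface so that Java code can call it, which is the part JRuby takes care of.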

A few days ago Charles and Tom made me a committer in the JRuby project, which obviously feels very good, though I'm slightly nervous. But I've been able to fix a number of very small bugs since then. The more interesting of these, in no particular order:
  • Correct implementation of default-inspect. (This isn't that interesting for implementations, but it makes debugging much easier)
  • Hash#each didn't yield properly in some edge-cases. This fix was submitted by Miguel, and applied by me.
  • I also added a fix to improve the Java method matching when the methods have a primitive among their arguments. This means that from now on, if you have two Java methods, foo(float a) and foo(int a), and call "foo(32)" from Ruby, the float version won't get called anymore. Pretty nice.
  • A String#crypt implementation, which is based on some stuff I did years ago, which means there are no copyright issues.
  • The flash-issue with Rails (a message placed in the flash doesn't get removed). This was a Marshalling issue, which was quite easy to fix. Big win.
  • Stack overflow when calling non-implemented methods when subclassing an interface. It calls method_missing instead, now.
There are a few things being discussed on the list right now. We will soon commit a readline.rb that checks if the JNI-based GNU readline-bridge is present, and if so uses it. This means that if you want, you can have basic readline in JIRB.
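
The "use it if it is there" check is a common Ruby pattern. The readline.rb we are discussing isn't committed yet, so take this only as a sketch of the pattern; try_require is a hypothetical helper name.

```ruby
# Hypothetical helper: try to load a library and report whether it
# worked, instead of letting the LoadError propagate.
def try_require(name)
  require name
  true
rescue LoadError
  false
end

if try_require('readline')
  puts 'readline support available'
else
  puts 'falling back to plain line input'
end
```

The caller can then pick the readline-backed input method or a plain one, without the rest of the program caring which one it got.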

We are talking about the best way to implement OpenSSL support too. This is quite a big thing, though, and as Tom put it "It will be pretty difficult but the person who does it will be worshipped". I'm not sure about the right way to go with implementation either. It seems the available OpenSSL JNI-implementations aren't good enough, so the best route seems to be JSSE. If you have any opinions or suggestions, please get in touch.

In other news, it seems I'm going to be talking about JRuby at JavaForum in Malmö. The date is not final, but it seems likely to be 23 October. If you're in Malmö, please stop by. I'll post more information as soon as everything is clear.

tisdag, oktober 03, 2006

Announcing ActiveRecord-JDBC 0.2.2

Version 0.2.2 of ActiveRecord-JDBC has now been released. It contains numerous smaller bug fixes, but more importantly support for MimerSQL. The internals have been slightly refactored to make it easier to change database-specific instructions further down the road.

The release can be found at http://rubyforge.org/frs/?group_id=2014 or installed through RubyGems.

måndag, oktober 02, 2006

The JRuby Tutorial #3: Playing with Mongrel

This part of the tutorial is based on some not-quite-released software, but since it is so cool, I bet you will try it anyway. Basically, what I'm going to show you is how to get Mongrel 0.4 working with JRuby, and then how you can serve your JRuby on Rails application with said version of Mongrel.

What you'll need
First of all, check out the latest trunk version of JRuby. There are some smoking new fixes in there that are needed for this hack. Next, you will also need to check out the 0.4 branch of Mongrel. This can be done with the following command:
svn co svn://rubyforge.org/var/svn/mongrel/branches/mongrel-0.4
You need to manually copy two parts of mongrel into your JRuby home. If $MONGREL_SRC is the name of the directory where you checked out mongrel, these commands will suffice:
cp -r $MONGREL_SRC/lib/mongrel* $JRUBY_HOME/lib/ruby/site_ruby/1.8
cp $MONGREL_SRC/projects/gem_plugin/lib/gem_plugin.rb $JRUBY_HOME/lib/ruby/site_ruby/1.8
echo '#!/usr/bin/env jruby' > $JRUBY_HOME/bin/mongrel_rails
cat $MONGREL_SRC/bin/mongrel_rails >> $JRUBY_HOME/bin/mongrel_rails
chmod +x $JRUBY_HOME/bin/mongrel_rails
You will need to download the JRuby-specific http11-extension library. This can be downloaded here, and should also be put in the $JRUBY_HOME/lib/ruby/site_ruby/1.8-directory.

You're now set to go.

Simple web hosting
I will now show how to set up a small web server that can serve both files and servlets. There really isn't much to it. First of all, we need to include some libraries:
require 'mongrel'
require 'zlib'
require 'java'
include_class 'java.lang.System'
Next step is to create a simple HttpHandler (which is like a Servlet, for you Java-buffs):
class SimpleHandler < Mongrel::HttpHandler
  def process(request, response)
    response.start do |head,out|
      head["Content-Type"] = "text/html"
      results = <<-"EDN"
<html>
<body>
Your request:
<br/>
<pre>#{request.params.inspect}</pre>
<a href=\"/files\">View the files.</a><br/>
At:
#{System.currentTimeMillis}
</body>
</html>
EDN

      if request.params["HTTP_ACCEPT_ENCODING"] == "gzip,deflate"
        head["Content-Encoding"] = "deflate"
        # send it back deflated
        out << Zlib::Deflate.deflate(results)
      else
        # no gzip supported, send it back normal
        out << results
      end
    end
  end
end
Now, this handler basically just generates a bunch of HTML and sends it back. The HTML contains the request parameters. Just to show how easy it is to combine Java output with Ruby output, I have added a call to System.currentTimeMillis. This could of course be anything. The last part is to register the handler so that it actually becomes active, and then start the server:
@simple = SimpleHandler.new
@http_server = Mongrel::HttpServer.new('0.0.0.0',3333)
@http_server.register("/", @simple)
if ARGV[0]
  @files = Mongrel::DirHandler.new(ARGV[0])
  @http_server.register("/files", @files)
end

puts "running at 0.0.0.0:3333"

@http_server.run
If you start this script with:
jruby testMongrel.rb htdocs
you can visit localhost:3333 and expect to see some nice output.
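
One thing to be aware of: the handler above only compresses when the Accept-Encoding header is exactly "gzip,deflate", so a browser sending "gzip, deflate" (with a space) gets the uncompressed page. A slightly more forgiving check could look like the sketch below. accepts_deflate? is a hypothetical helper of mine and not part of Mongrel, and it deliberately ignores q-values, so it is a simplification.

```ruby
require 'zlib'

# Hypothetical helper: true if "deflate" appears among the
# comma-separated tokens of an Accept-Encoding header value.
# Parameters such as ";q=0.5" are stripped and otherwise ignored.
def accepts_deflate?(header)
  return false if header.nil?
  header.split(',')
        .map { |token| token.split(';').first.to_s.strip.downcase }
        .include?('deflate')
end

body = '<html><body>hello</body></html>'
if accepts_deflate?('gzip, deflate')
  compressed = Zlib::Deflate.deflate(body)
  # Round trip, just to show that the client can get the page back.
  puts Zlib::Inflate.inflate(compressed) == body
end
```

In the handler, the condition would then be accepts_deflate?(request.params["HTTP_ACCEPT_ENCODING"]) instead of the string comparison.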

Making it work with Rails
A prerequisite for this part is that you have a functional JRuby on Rails application using ActiveRecord-JDBC. If that is the case, you just need to go to your application directory and execute this command:
$JRUBY_HOME/bin/mongrel_rails --prefix "" start
and everything should just work.

So, that's it. JRuby on Rails, with Mongrel. Enjoy.