Ola Bini: Programming Language Synchronicity: July 2007

Sunday, July 29, 2007

New York

I landed in New York this morning, and I'll stay until Thursday. If anyone feels like grabbing a beer and discussing programming languages (or something else), don't hesitate to mail me or comment on this blog.

Saturday, July 28, 2007

Languages of the future

Martin Fowler writes about one language, and neatly encapsulates what I think about the subject in the way I would have written it, if I were actually a good writer. Go read.

Friday, July 27, 2007

FOSCON slides

I have been asked for the slides of my FOSCON presentation. They can be downloaded in PDF format from here: http://ologix.com/JRubyWhirlwindTour.pdf.

The IronRuby scoop

As you almost certainly know by now, IronRuby was released at the beginning of this week (read more here, here and here). Now, this of course raises a few questions. But first things first: note that Microsoft has decided to host the project on RubyForge, and will do so sometime next month. Not only that, it's released under a genuinely open source license called MsPL: Microsoft Permissive License. If you look at the license, you will see that it's almost exactly a BSD license, except that the patent restrictions have been clarified. Obviously, I'm not a lawyer, but I can say that this license seems very nice and will allow good usage of IronRuby in many cases.

The code base is strictly limited right now. The things that actually work are most of the core language structure, the Array and String classes, and .NET interop. Of course, having this working so well is an extremely large achievement. Also, the compiler seems to generate very good code and, judging from microbenchmarks, it has the potential for very good performance. It's important to remember that many of the more important corner cases aren't supported yet, and it is these that usually drag the performance of Ruby implementations down. I'm confident that Lam and company can handle this, though.

Overall, I would say that this is looking very good. I would recommend that .NET people take a look at it, and also try to contribute. Very soon, you will be able to contribute code back to everything except the core compiler and DLR.

Finally, the question on everyone's mind: when will IronRuby be finished? I don't know, but I think it can happen within 6-12 months. The concerns raised two months ago have been resolved internally by Microsoft, which makes IronRuby a very real and important project.

Should JRuby support 1.4.2?

Right now we're trying to decide if JRuby should upgrade from Java 1.4.2 to Java 5. There are some compelling reasons for this, but I'm not 100% sure it's a good idea. Any comments from my readers?

In practical terms, this will mean that JRuby 1.0 will continue to be supported on 1.4.2, but new development will only work on Java 5 or higher. There is talk about using retrotranslator for handling 1.4.2 compatibility in later versions.

So. Please, comments and opinions!

Wednesday, July 25, 2007

ThoughtWorks at OSCON

I am at OSCON with Roy Singham, the founder of ThoughtWorks. We're here as sponsors. If you are here too - Roy likes nothing more than an argument about anything related to a) the merits of various languages, b) Agile versus waterfall, c) the best cell-phone. Seriously, if there is something interesting that should be big for custom-app dev in the enterprise, tell Roy or the other ten ThoughtWorkers here about it and we'll help make it happen. Come by our booth at the left side of the expo hall if you want to find us.

Really Radical Ruby

I had a very good time at FOSCON III with the Portland Ruby Brigade yesterday. There were lots of entertaining talks too. I would say that my "lightning talk" wasn't really a lightning talk at all. Unless you count the speed of my talking... I managed to race through all my 22 slides - all of them chock-full of information - in the time allotted to me. Hopefully people learned something from it.

Chad Wathington demonstrated Mingle, which was also very nice.

John Lam showed us a taste of IronRuby, and also talked some about the implementation particulars that made certain things faster in IronRuby than on MRI. Interesting stuff, but I'm looking forward to the full talk tomorrow.

Alan McKean from GemStone showcased GemStone/JRuby - still a work in progress though. For those of you who don't know what GemStone is, think extremely powerful object persistence. And they're building a new version for Ruby, on top of JRuby. Very cool.

I realized that there are a few points about JRuby that haven't been emphasized enough, though, so here are the executive summary bullet points:

JRuby is totally Java compliant and runs on any Java.
JRuby is 1.0
JRuby supports Rails
ThoughtWorks offers commercial support for JRuby
JRuby performance is on par with the C implementation, on average.

Tuesday, July 24, 2007

OSCON: first tutorial day

Yesterday was the first tutorial day at OSCON. Due to some planning mistakes, I didn't get the correct conference pass, so I missed the first tutorial. After that was sorted out I proceeded to the second tutorial of the day: Advanced Techniques for Parsing, by Mark Dominus. Of course I knew that the code would be Perl, but that didn't disturb me so much, since I expected to see some advanced parsing techniques. This is where disappointment hit me. Maybe the Perl code used was advanced, but it was not in any way advanced parsing. The first 2 hours were spent implementing a recursive descent parser with 7 productions. After that, I decided that I wouldn't be learning anything from this presentation, and headed back to the hotel, which was good, since I got sick that afternoon and spent the rest of the evening slightly delirious in my bed.

But now I'm up and going again, sitting here waiting for the tutorial "Real World Grails" to begin. I'm looking forward to seeing how Grails is actually used, since the presentations I've seen on it usually just show scaffolding and simpler things.

I have also decided on the subject for tonight's FOSCON, but the slides are not finished yet. And the topic I chose is kind of a cop-out: JRuby Cavalcade is the title of the talk, and I will basically just run through loads of interesting and funny JRuby things until I run out of time or get booed off stage. Hope to see you there!

Saturday, July 21, 2007

Will JavaScript be the next big language?

I guess most of you have read about the latest happenings in the whole NBL story. Steve Yegge has created Rhino on Rails, a JavaScript version of Rails. I have meant to write about this for a few weeks, but I felt I wanted to think it through before writing about it.

Of course, it's really cool that Rhino on Rails exists, and may someday be open sourced. On the other hand, the current version has no counterpart to ActiveRecord - instead it works against internal Google data systems. This may make very good sense from Google's point of view, but will make it hard to gain traction outside of Google when/if the open sourcing happens. At that point, an ActiveRecord version written using JDBC and Rails should happen. I'm personally looking forward to it, because such an undertaking would be able to take more advantage of JDBC than, for example, ActiveRecord-JDBC can do. ActiveRecord done "right" so to speak, but in JavaScript.

But is JavaScript the language of the future? What does it have that separates it from other current languages - most notably Ruby? Well, at first glance the big difference is in the number of implementations. JavaScript has a multitude of implementations, while Ruby has three (Ruby, YARV? and JRuby), with many more in the works. JavaScript, though, has an incredible number of implementations, most of them bad.

It's got a standard. It's got full closures and a few more neat features. And it has prototype object orientation; a very different object model compared to the standard of inheritance based object systems like Java, Ruby and Smalltalk.

On closer inspection, the prototype based object system is not really such a big deal. In fact, you already have it in Ruby. Just use Object.new and Kernel.clone. Not very practical in Ruby, but you can do it without any problem. Remember that the difference between dup and clone is that clone will actually copy the metaclass of the object, while dup just makes the duplicate use the same real class.
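The point about Object.new and clone can be sketched in a few lines of plain Ruby. This is only a minimal illustration; the names proto, child and plain are made up for the example. It builds an object without any class definition, derives a "child" from it with clone, and shows that dup drops the singleton methods:

```ruby
# Prototype-style objects in plain Ruby: no class definitions, only
# Object.new, singleton methods, and clone.
proto = Object.new

def proto.name
  "proto"
end

def proto.greet
  "hello from #{name}"
end

# clone copies the singleton class, so the child gets greet and name.
child = proto.clone

# Override one "slot" on the child only; proto is not affected.
def child.name
  "child"
end

# dup does not copy the singleton class, so the copy is a bare Object.
plain = proto.dup

puts proto.greet                # hello from proto
puts child.greet                # hello from child
puts plain.respond_to?(:greet)  # false
```

As the text says, this is not very practical in Ruby, but it does show that the prototype model is available without leaving the language.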

I see some advantages that JavaScript has over Ruby, that can actually count as real advantages. The spec is arguably one. Several good implementations is another. It's easier to make JavaScript have good performance. And maybe most important: most developers already know it to some degree or other.

But will it be the NBL in some form or another? No, I don't think so, for the simple reason that too many developers hate it. Not because of the language itself, but due to the connection with bad browser implementations. This is a debt that it will be very hard for JS to get away from, and I think it will tip the balance against JavaScript compared to any other language poised for the position of being a big language.

Further, I really don't believe in the idea that there will be a next big language. Call me strange, but just because we've had this cycle two or three times doesn't mean it will continue like that. I believe development is qualitatively different now compared to 15 years ago. The challenges we meet are different and demand different tools. One single language will not cut it. I think the future lies in layered languages with different properties.

JRuby at FOSCON 2007

So, I will attend and present at FOSCON 2007, which is arranged by the Portland Ruby Brigade (PDX.rb), and the theme is Really Radical Ruby. It's on Tuesday, more information here. It seems to be an interesting event, so please show up!

I have not yet decided what I'm going to talk about. JRuby will be involved in some way of course. Does anyone have any requests about what I should talk about?

Wednesday, July 18, 2007

OSCON and other conferences

Next week I'll be at OSCON in Portland. I won't be speaking, which means I'll probably be able to enjoy other people's presentations instead. Hopefully I'll get to meet lots of people too. Say hi if you see me!

I'll be presenting at RailsConfEU in September too, and that is gearing up to be a really interesting conference. RailsConf in Portland was awesome, and if the EU version can match a 10th of that energy, it's going to be wonderful.

Tuesday, July 17, 2007

Back to JRuby regular expressions

It seems that this issue comes up every third month. After all the work we have done, we realize that regular expressions need some real work again. Our current solution works quite well. We have imported JRegex into JRuby, and made a whole slew of modifications to it. It runs well and has no issues with regular expressions that recurse too deeply (Java's engine uses a recursive algorithm, making it overflow the stack for certain inputs. Certain very common inputs, in say ... Rails. *sigh*).

But JRegex is good. It's not perfect though. It's slightly slower than the Java engine, it doesn't support everything in the Java engine, and conversely, it supports some things that Java doesn't support. The major problem is that we don't have MRI compliant multibyte support, and the implementation of our engine is wildly different compared to MRI's engine, and Oniguruma.

At some point we will probably just bite the bullet and do a real port of Oniguruma. But until such time comes, I have extracted our current regular expression stuff, and put everything behind a common interface. What that means is that with the current trunk, you can actually choose which regular expression engine you want to use. You can even write your own and plug it in. The interface is really small right now. At the moment we only have JRegex and Java, and the Java engine doesn't pass all tests (I think, I haven't tried, since that wasn't the point of this exercise). Anyway, it means you can have Java regular expressions if you want them, right in your JRuby code. But only where you want them. So, you can control which engine is used globally by doing one of these two:

jruby -J-Djruby.regexp=java your_ruby_script.rb
jruby -J-Djruby.regexp=jregex your_ruby_script.rb

The last is currently the default, so it's not needed. In the future JRegex may no longer be the default, but this option should still be there. The nicer thing about this is that you can use Java Regexps inline, even if you want to use JRegex for most expressions:

begin
 p(/\p{javaLowerCase}*/ =~ "abc")
 p $&
rescue => e
 p e
end

p(/\p{javaLowerCase}*/j =~ "abc")
p $&

Now, the first example will actually raise a RegexpError, because javaLowerCase is not a valid character class in JRegex. But note the small "j" I've added to the second Regexp literal! That expression works and will match exactly as you expected.

Monday, July 16, 2007

RSpec and RBehave run on JRuby

I'm not sure if this is well known or not, so I've been meaning to write a quick notice about it. The short story is this: JRuby can run RSpec and RBehave. Why is this important? Well, you can write code that tests Java code using RSpec and RBehave, meaning that it will be possible to get much more readable tests, even for code living in Java land.

Even if your organization won't accept writing an application in Ruby, it would probably be easier to get the testing done in Ruby. And writing tests in an effective language means that you will either write more production code, or more tests. Either of those is a good outcome.

A quick example of this in action. To run this example, you need JRuby 1.0 or later, and the rspec gem:

require 'java'

describe java.util.ArrayList, " when first created" do
  before(:each) do
    @list = java.util.ArrayList.new
  end

  it "should be empty" do
    @list.should be_empty
  end

  it "should be able to add an element" do
    @list.add "content"
  end

  it "should raise exception when getting anything" do
    lambda{ @list.get 0 }.should raise_error(java.lang.IndexOutOfBoundsException)
  end
end

In this code the examples are not that realistic, but you can see that the RSpec code looks the same for Java code, as it does for Ruby code. Even the raise_error exception matcher works. You can run this:

jruby -S spec -fs arraylist_spec.rb

The RBehave test suite also runs, which means you can stop using JBehave now... =)

This is a perfect example of the intersection where JRuby's approach can be very powerful, utilizing the existing Ruby libraries to benefit your Java programming.

The results of JRuby compilation

If you are interested in what actually happens when JRuby compiles Ruby to Java bytecode, I have added some small utilities to help out with this. To compile a string:

require 'jruby'
JRuby::compile("puts 1,2,3")

If you are running with -J-Djruby.jit.enabled=false, you can also inspect the result of compiling a block:

require 'jruby'

JRuby::compile do
 puts "Hello World"
end

The results of both of these invocations will be an object of type JRuby::CompiledScript. It has four attributes: name, class_name, original_script and code. The original_script attribute is only available when compiling from a string. The code attribute contains a Java byte array, and as such is not so useful in itself. But you can use the inspect_bytecode method to get a string which describes the compiled class. So, to see how JRuby compiles a puts "Hello, World":

require 'jruby'

puts JRuby::compile(<<CODE).inspect_bytecode
  puts "Hello, World"
CODE

Once you know what happens, you can start contributing to the compiler! =)

JRuby Inside

Peter Cooper has opened a "sister" site to RubyInside, called JRubyInside. It seems very promising; the address is http://www.jrubyinside.com.

Closing over ZSuper

One of the features of Ruby which I sometimes like and sometimes hate is ZSuper. (So called because it differs from regular super in the AST.) ZSuper is the keyword super without arguments or parentheses, which will call the super method with the same arguments as the current invocation got. Of course, that's not all. For example, if you change the arguments, the changes will propagate to the super implementation. Not only if you change the object, but also if you change the reference, which I found unintuitive the first time I ran into it.
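A small sketch of the rebinding behaviour described above, using the hypothetical classes Parent and Child (they are not from the original example). The parent simply returns whatever arguments it receives, so the effect of reassigning a parameter between two bare super calls is visible in the return value:

```ruby
# Parent#foo returns the arguments it was called with.
class Parent
  def foo(*args)
    args
  end
end

class Child < Parent
  def foo(first, *rest)
    before = super      # bare super forwards the current first and rest
    first = "changed"   # rebind the local variable, no object is mutated
    after = super       # bare super now forwards the new value of first
    [before, after]
  end
end

p Child.new.foo("initial", :x)
# [["initial", :x], ["changed", :x]]
```

The bare super reads the parameter variables at the time of the call, which is why the second call sees "changed" even though the caller's string object was never modified.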

That's all well and good. The interesting thing happens when you close over the super call and return it as a Proc. I haven't seen anyone doing this, which I guess is why there seems to be a bug in the implementation. Look at this code and tell me what it prints:

class Base
  def foo(*args)
    p [:Base, :foo, *args]
  end
end

class Sub < Base
  def foo(first, *args)
    super
    first = "changed"
    super
    proc { |*args| super }
  end
end

Sub.new.foo("initial", "try", :four).call("args","to","block")

Notice that Base#foo will get called three times during this code. In Sub#foo we are changing the first argument to the new string "changed". As I told you before, the second super call will actually get "changed" as the first argument. But what will happen after that? We first create a block that uses ZSuper. We send the block to proc, reifying the block into an instance of Proc, and return that. Directly after the Proc is returned, we call it with some arguments. Now, the way I expect this to work (and incidentally, that's the way JRuby works) is that the output should be something like this:

[:Base, :foo, "initial", "try", :four]
[:Base, :foo, "changed", "try", :four]
[:Base, :foo, "changed", "try", :four]

We see that the first argument changed from "initial" to "changed", but otherwise the result is the same; the closure is a real closure over everything in the frame and scope. I guess you've realized that the same isn't true for Ruby. Without further ado, this is the output from MRI 1.8.6:

[:Base, :foo, "initial", "try", :four]
[:Base, :foo, "changed", "try", :four]
[:Base, :foo, "changed", ["args", "to", "block"], false]

The first time I saw this, the words WTF passed through my mind. In fact, that still happens sometimes. What is happening here? Well, obviously, it seems as if the passing of arguments to the block somehow clobbers the part where MRI saves away the closure over passed arguments. I have no idea whatsoever where the false value comes from. Hmm. Now that I think about it (this is just a guess), I believe it stands for the fact that the arguments should be splatted into one argument. (That's the one called args in the block.) If it had been true, they should refer to different variables. I think there is some trickery like that involved in the splatting logic in MRI.

Anyway. Is this a bug or a feature? I can't see any way it could be used in an obvious way, and it runs counter to being understandable and unsurprising. Can anyone give me a good example of where this is useful behavior?

Friday, July 13, 2007

A JRuby Rubinius machine

When I get bored with JRuby, I tend to go looking either at other languages or other language implementations. This happened a few days ago, and the result is what I will document here. Begin by creating a file called fib.rb:

def fib(n)
 if n < 2
   n
 else
   fib(n - 2) + fib(n - 1)
 end
end

p fib(15)

The next part requires that you have a recent version of Rubinius installed:

rbx compile fib.rb

This will generate fib.rbc. Next, take a recent JRuby version and run:

jruby -R fib.rbc

And presto, you should see 610 printed quite soon. This is JRuby executing Rubinius bytecode. I was quite happy about how easy it was to get this far with the functionality. Of course, JRuby doesn't support most bytecodes yet, only those needed to execute this small example, and similar things. We are also using JRuby's internals for this, which means that Rubinius MethodContext and such are not available.

Another interesting note is that running the iterative Fib algorithm like this with -J-server is actually 30% faster than MRI.

This approach is fun, and I have some other similar ideas I really want to look at. The best part about it though, is that I got the chance to look at the internals of Rubinius. I hope to have more time for it eventually. Another thing I really want to do some day is implement a Jubinius, which should be a full port of the Rubinius runtime, possibly excluding Subtend. I think it could be very nice to have the Smalltalk core of Rubinius working together with Java. Of course, I don't have any time for that, so we'll see what happens in a year or two. =) Maybe someone else does it.

Evil JRuby

After my last post I got several comments about evil.rb. Of course I had evil.rb in mind when doing some of it, but I also forgot to describe the two most evil methods of the JRuby module: runtime and reference. The runtime method will return the currently executing JRuby runtime wrapped in the Java Integration layer, meaning you can get access to almost anything you want with it. For example, if you want to take a look at the global CacheMap (used to cache method instances):

require 'jruby'
JRuby::runtime.cache_map

Whoops. And that's just the beginning. Are you interested in investigating the current call frame or activation frame (DynamicScope in JRuby)?

require 'jruby'
p JRuby::runtime.current_context.current_frame
a = 1
p JRuby::runtime.current_context.current_scope

Of course, you can call all accessible (and some inaccessible) methods on these objects, just as if you were working with them from Java. Use the APIs and take a look. You can change things without problem.

And that also brings us to one of the easiest examples of evil.rb, changing the frozen flag on a Ruby object. Well, with the reference method, that's easy:

require 'jruby'

str = "foo"
str.freeze

puts str.frozen?
JRuby::reference(str).setFrozen(false)
puts str.frozen?

JRuby::reference will return the same object sent in, wrapped in a Java Integration layer, meaning that you can inspect and modify it to your heart's content. In this way, you can get at the internals of JRuby in the same way you can using evil.rb for MRI. And I guess these features should mainly be used for looking at and learning about the internals of JRuby.

So, have fun and don't be evil (overtly).

Wednesday, July 11, 2007

Some JRuby tricks

I have spent a few hours adding some useful features these last days. Nothing extraordinary, but things that might come in handy at one point or another. The problem with these features is that they are totally JRuby specific. You could probably implement them for MRI, but no one has done it. So if you want to use them, beware. Further, they exploit a few tricks in the JRuby implementation, meaning they can't be implemented in pure Ruby.

So, that was the disclaimer; now onto the fun stuff!

Breaking encapsulation (even more)
As you know, in Ruby everything is accessible in some form or another, and you can do almost everything with the metaprogramming facilities. Well, except for one small detail which I found out while working on the AR-JDBC database drivers.

We have some code there which needs to be separate for each database, and it just so happens that core ActiveRecord has already implemented it in a very good way. So, what do we do? Mix them in and remove the methods we don't want? No, because ActiveRecord adapters are classes, not modules, and you can't mix in classes. There is no way to get hold of a method and add that to an unrelated other class or module. Except if you're on JRuby, of course:

require 'jruby/ext'

class A
  def foo
    puts "A#foo"
  end
  def bar
    puts "A#bar"
  end
end

class B;end

class C;end

b = B.new
b.steal_method A, :foo
b.foo
B.new.foo rescue nil #will raise NoMethodError

C.steal_methods A, :foo, :bar
C.new.foo
C.new.bar

Of course, using this should be avoided at all costs. But it's interesting that such a powerful thing can be implemented using about 15 lines of Java code.
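To see the plain-Ruby limitation that steal_method works around, here is a minimal sketch with hypothetical classes (Donor, DonorChild and Stranger, plus the helper borrow_foo, are all made up for this illustration). It uses only the standard instance_method and bind calls: a method taken from a class can be rebound to instances of that class or its subclasses, but binding it to an unrelated object raises TypeError.

```ruby
class Donor
  def foo
    "Donor#foo"
  end
end

class DonorChild < Donor; end
class Stranger; end

# Try to bind Donor#foo to the given object and call it.
# Returns the method's result, or a short description of the TypeError.
def borrow_foo(target)
  Donor.instance_method(:foo).bind(target).call
rescue TypeError => e
  "TypeError: #{e.message}"
end

puts borrow_foo(DonorChild.new)  # Donor#foo
puts borrow_foo(Stranger.new)    # a TypeError description; exact text varies by Ruby version
```

The second call fails because Stranger is not a kind of Donor, which is the same wall you hit when trying to reuse methods from an adapter class you cannot inherit from.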

Introspection
JRuby parses Ruby code into an Abstract Syntax Tree. For a while now, the JRuby module has allowed you to parse a string and get the AST representation by executing:

require 'jruby'

JRuby.parse "puts 'hello'", 'filename.rb', false

This returns the Java AST representation directly, using the Java Integration features. That is old. What is new is that I have added pretty inspecting, a nice YAML format and some navigation features which make it very easy to see exactly how the AST looks. Just do an inspect or to_yaml on an AST node and you will get the relevant information.

That is interesting. But what is even nicer is the ability to combine arbitrary pieces of the AST (as long as they make sense together) and run them:

require 'jruby'

ast_one = JRuby::ast_for("n = 1; n*(n+3)*(n+2)")
ast_two = JRuby::ast_for("n = 42; n*(n+1)*(n+2)")

p (ast_one.first.first + ast_two.first[1]).run
p (ast_two.first.first + ast_one.first[1]).run

As you can see, I take two fragments from different code, add them together and run them. You can also see that I'm using an alias for parse here, called ast_for. That makes much more sense when using the second parse feature, which we already know from ParseTree:

require 'jruby'

JRuby::ast_for do
  puts "Hello"
end

Well, I guess that's all I wanted to show right now. These last small things I've added because I believe they will be highly useful for debugging JRuby code.

I also have some more ideas that I want to implement. I'll keep you posted about it.

Saturday, July 07, 2007

ObjectSpace: to have or not to have

Among all the features of Ruby that JRuby supports, I would say that two things take the number one place as being really inconvenient. Threads are one; making the native threading of Java match the green threading semantics of Ruby is not fun, and it's not even possible for all edge cases. But that argument has been made several times by both me and Charles.

ObjectSpace now, that is another story. The problems with OS are many. But first, let's take a quick look at the most common usage of OS; iterating over classes:

ObjectSpace::each_object(Class) do |c|
  p c if c < Test::Unit::TestCase
end

This code is totally obvious; we iterate over all instances of Class in the system, and print an inspected version of them if the class is a subclass of Test::Unit::TestCase.

Before we take a closer look at this example, let's talk quickly about how MRI and JRuby implement this functionality. Having this functionality in MRI is dead easy, and there is no performance cost for having it when it's not used. The trick is that MRI just walks the heap when iterating over ObjectSpace. Since MRI can inspect the heap and stack without problems, nothing special needs to be done to support this behavior. (Note that this can never be safe when using a real threading system.)

So, the other side of the story: how does JRuby implement it? Well, JRuby can't inspect the heap of course. So we need to keep a WeakReference to each instance of RubyObject ever created in the system. This is gross. We pay a huge penalty for managing all this stuff. Many of the larger performance benefits we have found in the last year have revolved around having internal objects be smarter and not put themselves into ObjectSpace until necessary. One of my latest optimizations of regexp matching was simply to make MatchData lazy, so it only goes into OS when someone actually uses it. RDoc runs about 40% faster when ObjectSpace is turned off for JRuby.

So, is it worth it? In real life, when do you need the functionality of ObjectSpace? I've seen two places that use it in code I use every day. First, Rails uses it to find generators, and secondly, Test::Unit uses it to find instances of TestCase. But the fun thing is this: the above code is almost exactly what they do; they iterate over all classes in the system and check if they inherit from a specific base class. Isn't that a quite gross implementation? Shouldn't it be possible to do something better? Euhm, yes:

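module SubclassTracking
  def self.extended(klazz)
    # Add a subclasses accessor on the singleton class of the extended class
    (class <<klazz; self; end).send :attr_accessor,
                                    :subclasses
    # Record every new subclass through the inherited hook
    (class <<klazz; self; end).send :define_method,
                                :inherited do |clzz|
      klazz.subclasses << clzz
      # Explicit argument: bare super is not reliable inside define_method
      super(clzz)
    end
    klazz.subclasses = []
  end
end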

# Where Test::Unit::TestCase is defined
Test::Unit::TestCase.extend SubclassTracking

# Load all other classes

# To find all subclasses and test them:
Test::Unit::TestCase.subclasses

I would say that this code solves the problem more elegantly and usefully than ObjectSpace. There is no performance degradation due to it, and it will only affect subclasses of the class you are interested in. What's the best benefit of this? You can use the -O flag when running JRuby, and your tests and the rest of the code will run much faster and use less memory.
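For comparison, here is a self-contained sketch that puts the two approaches side by side. It does not touch Test::Unit; TrackedBase, FirstCase, SecondCase, Unrelated and scan_subclasses are hypothetical names invented for the example, and the tracking is written directly with an inherited hook rather than through the SubclassTracking module.

```ruby
# Hypothetical base class standing in for Test::Unit::TestCase.
class TrackedBase
  @subclasses = []

  class << self
    attr_reader :subclasses

    # Ruby calls inherited on the superclass whenever a subclass is created.
    def inherited(subclass)
      super
      TrackedBase.subclasses << subclass
    end
  end
end

class FirstCase < TrackedBase; end
class SecondCase < TrackedBase; end
class Unrelated; end

# The ObjectSpace way: walk every Class object in the process.
def scan_subclasses(base)
  found = []
  ObjectSpace.each_object(Class) { |c| found << c if c < base }
  found
end

p TrackedBase.subclasses                        # [FirstCase, SecondCase]
p scan_subclasses(TrackedBase).sort_by(&:name)  # [FirstCase, SecondCase]
```

Both give the same answer, but the hook only does work when a subclass of TrackedBase is defined, while the scan has to visit every class in the system and needs ObjectSpace to be available at all.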

As a sidenote: I'm putting together a patch based on this to both Test::Unit and Rails. ObjectSpace is unnecessary for real code and the vision of JRuby is that you will explicitly have to turn it on to use it, instead of the other way around.

Does anyone have any real world examples of things you need to do with ObjectSpace?
