Friday, November 24, 2006

The FINAL OpenSSL post?


I've checked in all functionality I will add to OpenSSL support in JRuby at this point. Of course, there will be more, but not concentrated in a spurt like this. Tomorrow I will modify the build process and then merge everything I've done into trunk.

Let's back up a little. What have I accomplished? This: all OpenSSL tests from MRI run (except PKCS#7). That includes the tests of the SSL server and SSL client. A simple https request also works. This is sweet. Everything that has tests in Ruby works. But... this is also the problem. Roughly half of Ruby's OpenSSL library is not tested at all. And since my current OpenSSL initiative is driven by tests, I haven't done anything that isn't tested.

So, some things won't work. There is no support for Diffie-Hellman keys right now, for example. It will be easy to add when the time comes, but there aren't any tests for it, so I haven't felt the need.

The only thing not there, as I said, is PKCS#7. That was just too involved. I'll take care of that some other time, when someone says they want it... Or someone else can do it? =)

So, what this boils down to is that JRuby trunk will have OpenSSL support sometime tomorrow. Hopefully it will be useful and I can get on to other JRuby things. I have a few hundred bugs I would like to fix, for example...

Oh yeah, that's true. Tomorrow will also be YAML day. I'll probably fix some bugs and cut a new release of JvYAML. It's that time, the bug count is bigger than it was, and JRuby needs some fixes. So that's the order of the day for tomorrow. First OpenSSL and then YAML. If you have any comments on this, please mail me or comment directly here.


Wednesday, November 22, 2006

JavaForum: last night

And now it's over. Last night was the last presentation for me in a while. Over 200 people were there and the place was packed. I must say I'm very pleased with how my part went. I felt really good, with lots of flow, and the demonstrations went perfectly.

Rob Harrop and Adrian Colyer talked about Spring 2.0 and AOP. It was fairly interesting, but since I've done some work with AOP before I felt that the level was a bit too introductory. No real meat.

JavaForum announced a day focused on Java in January, with four tracks and some great speakers. Looks very nice.

Sunday, November 19, 2006

JRuby in Stockholm, JavaForum on Tuesday

On Tuesday it's time for the big JavaForum in Stockholm. I'm very much looking forward to it and have spent some hours today sharpening the presentation. Of course it will be very interesting to see Rob Harrop too.

In other news, something quite exciting is probably just around the corner...

JRuby presence at JavaOne 2007

The call for papers went out two days ago, and we already have several good proposals in line that will be submitted to JavaOne. Hopefully we will get many interesting talks this year. As far as I know, there are at least 5 JRuby-specific proposals submitted, and there will hopefully be more.

I am sending in two different proposals this year, one about JRubyEE (it's called "Agile Enterprise Development with JRuby"). That will be interesting to do and I really believe there is much for Java enterprise development to use in JRuby.

The second proposal is for a BOF about several JVM languages called "Lambda on the JVM: JRuby, Jython and Lisp". I'll talk about how to utilize different alternative JVM languages to do interesting and useful things. Since this is my venue, I will specialize in languages that are derived from Lisp, implicitly or explicitly. (I include both Ruby and Python in this category.)

Hopefully at least one of the proposals will get through.

Friday, November 17, 2006

The obligatory meta-blog post.

So, this is post number 100 since I started blogging. And that was almost exactly one year ago. Isn't that interesting? I've been writing something here every 3.65 days. Often it's been uninteresting, very often indifferent, but sometimes I have managed to write something that people liked. So, I'm happy about this small anniversary.

So, I plan to do what almost every blogger does at least once: turn the writing process inwards and investigate why people blog, why the person in question blogs, and perhaps, if we are really META, why the current blog post is actually written. Hopefully I'll know why I've written this by the end of this post. Otherwise, it'll probably be something indifferent or boring.

Let's begin. "Why blog?", the general question. When I started out, I had no real goal. I knew that I wanted to get better at technical writing and that was largely it. I hoped I would get some new contacts and interesting discussions, but the basic reason was the writing ability. As such, I wouldn't have to write publicly; I could hone my writing skill in my basement and never publish anything to the (sometimes scary) scrutinizing eyes of the public. But that's the problem. If I didn't publish it, I wouldn't have as much incentive to become better. Now I absolutely have to get better at writing, otherwise I will feel ashamed about the filth that will for all time be remembered by Google.

That's reason number one. The second reason, which I've come to realize this last year but didn't know when I started out, is that I learn by writing about a subject. If it's a technical topic I basically have to research the topic quite thoroughly, so I won't look like a fool. And that is really good, that is incredibly good. I have learned very much by writing about things I find interesting. The first learning happens when I research and think about what I should write, and the second when people read it, comment, and correct me.

Reason number three can be thought of as hubris. I believe that some of the things I know can be useful for others to know, and I believe that some of the problems and bugs I uncover in open software should be documented somewhere so others won't have to go through the same bug finding escapades as I did on that subject.

And here's where we come to the point. The point is that if you're working with software (or with anything where information sharing is important, really), you should write about it. Because the writing benefits you and it benefits me. If you find a problem, write about it. It doesn't matter if someone actually reads it or not, for the most important aspects of writing are about yourself.

Steve Yegge wrote about this in his "Blog Or Get Off The Pot", and also in one of his older blogs ("Why you should blog" I believe it's called).

Coda: Why did I write this post? To make it clear in my head why I blog. To make it obvious and explicit why I do it. So that's what this post is about, and it's a recursive post since it talks about itself. I'm satisfied.

Some notes from the Stockholm Rails meet

Last night (Wednesday) about 35 developers with an unhealthy interest in Rails met up in Stockholm, at Valtech's offices, to share some experiences and talk shop.

It was very interesting and fun to meet people I can relate to. We ended up talking programming languages for a few hours after the main event ended.

Peter Marklund did a presentation on a CRM system in Rails, and Christian and Albert from Adocca talked about caching, and handling a Rails application that needs to scale into the millions. Nice stuff.

Last, I did an improvised presentation on LPW, a small Rails application that is fed its main data through Web Services instead of ActiveRecord. I believe it went fairly well, even though I had no slides and almost no preparation... =)

Monday, November 13, 2006

Speakings this and next week

Just a quick notice. I will be talking this Wednesday at a Rails group in Stockholm. The subject will (for once) not be JRuby, at least not primarily. Instead I will talk about an application called LPW, which is interesting since it doesn't use a database as the main data feed, but a bunch of Web Services that are implemented by a third party and deployed with EJBs in Axis. I had all kinds of challenges, but in the end the result became very good. I will talk a little about what you can expect when trying to interface with Java in this way, and about good tricks for handling an alternative to ActiveRecord as the model.

I've written a bit about LPW sometime this summer, if anyone is interested.

Next week is the JavaForum meet up in Stockholm, and I will speak after Rob Harrop. The subject is JRuby and the day is Tuesday. As far as I know, the meet up is full.

The presentation will be pretty standard JRuby fare for Java developers.

Further OpenSSL progress

Sometimes it feels like the progress I've made with the OpenSSL library is almost swallowed up by all the new functions I've found that need to be implemented. It's an uphill battle.

But, I'm happy to say that the latest digression is finished. I have successfully implemented the X509_STORE-family of functions plus all dependencies. In the end I had to create a wrapper for X509Certificate and my own PEM-reader and PEM-writer, since BouncyCastle's implementation doesn't handle the OpenSSL specific aux trust information that can be appended to these certificates.

But anyway, that means there is a Java implementation of X509_STORE, X509_STORE_CTX, X509_VERIFY_PARAM, X509_LOOKUP, X509_LOOKUP_METHOD, X509_TRUST, X509_PURPOSE and a whole slew of others too. About the only thing I didn't implement was X509_POLICY_TREE. That was just too much. I'll tackle that hurdle if it seems Ruby needs it.

So, what is the status then? Coming in at almost 10 000 lines of code, there are only three real parts left to do. Or, three parts left that have MRI test cases, I should say, since there are quite a few files not tested in the MRI implementation. Like DH keys. Wow. But I ignore that for now. The current goal is the OpenSSL::X509::Store-family, the OpenSSL::PKCS7-family, and the rest of the SSL-stuff. So, wish me luck.

An Emacs diversion: Font sizes

After a few weeks of very intense JRuby-OpenSSL hacking, I felt the need to do something different, so I've spent a few hours with my slightly rusty Emacs Lisp skills, trying to fix something that I really need. Namely, control over font-size and fonts in Emacs, on Linux. I want it inside Emacs and customizable by EL. To my surprise I couldn't find anything like that anywhere.

For me personally, it's necessary when presenting. I usually code with a small font in Emacs, so the code would be totally unreadable on a projector. And since I don't have a fancy MacBook Pro, I need to be able to zoom in and out inside Emacs.

Presto, it wasn't easy, but I've managed it. For some reason, font handling seems quite backward in Emacs. I had to extract the current font, and then split it and join the new array together again. Not neat and my way of doing it is not the best. But, for your pleasure, here is the code to do it, and also some code that establishes a font ring of the standard fonts in different sizes that can be walked through:
(defun inc-font-size ()
  "Increase the pixel size of the current frame font by one."
  (interactive) ; needed so the function can be bound to a key
  (let* ((current-font (cdr (assoc 'font (frame-parameters))))
         (splitted (split-string current-font "-"))
         (new-size (+ (string-to-number (nth 7 splitted)) 1))
         (new-font (concat (nth 0 splitted) "-"
                           (nth 1 splitted) "-"
                           (nth 2 splitted) "-"
                           (nth 3 splitted) "-"
                           (nth 4 splitted) "-"
                           (nth 5 splitted) "-"
                           (nth 6 splitted) "-"
                           (number-to-string new-size) "-*-"
                           (nth 9 splitted) "-"
                           (nth 10 splitted) "-"
                           (nth 11 splitted) "-*-"
                           (nth 13 splitted))))
    ;; Append any fields beyond the ones rebuilt above.
    (if (> (length splitted) 14)
        (dotimes (n (- (length splitted) 14))
          (setq new-font (concat new-font "-" (nth (+ n 14) splitted)))))
    (set-default-font new-font t)
    (set-frame-font new-font t)))

(defun dec-font-size ()
  "Decrease the pixel size of the current frame font by one."
  (interactive)
  (let* ((current-font (cdr (assoc 'font (frame-parameters))))
         (splitted (split-string current-font "-"))
         (new-size (- (string-to-number (nth 7 splitted)) 1))
         (new-font (concat (nth 0 splitted) "-"
                           (nth 1 splitted) "-"
                           (nth 2 splitted) "-"
                           (nth 3 splitted) "-"
                           (nth 4 splitted) "-"
                           (nth 5 splitted) "-"
                           (nth 6 splitted) "-"
                           (number-to-string new-size) "-*-"
                           (nth 9 splitted) "-"
                           (nth 10 splitted) "-"
                           (nth 11 splitted) "-*-"
                           (nth 13 splitted))))
    (if (> (length splitted) 14)
        (dotimes (n (- (length splitted) 14))
          (setq new-font (concat new-font "-" (nth (+ n 14) splitted)))))
    (set-default-font new-font t)
    (set-frame-font new-font t)))

(defvar *current-font-index* 0)

(defconst *font-ring* '(
  "-urw-nimbus mono l-regular-r-normal--15-*-88-88-p-*-iso8859-1"
  "-urw-nimbus mono l-regular-r-normal--17-*-88-88-p-*-iso8859-1"
  ))

(defun font-next ()
  "Switch to the next font in the ring, wrapping around at the end."
  (interactive)
  (let ((len (length *font-ring*))
        (next-index (+ *current-font-index* 1)))
    (if (= next-index len)
        (setq next-index 0))
    (setq *current-font-index* next-index)
    (message (concat "setting " (nth *current-font-index* *font-ring*)))
    (set-default-font (nth *current-font-index* *font-ring*) t)
    (set-frame-font (nth *current-font-index* *font-ring*) t)))

(defun font-prev ()
  "Switch to the previous font in the ring, wrapping around at the start."
  (interactive)
  (let ((len (length *font-ring*))
        (next-index (- *current-font-index* 1)))
    ;; Wrap only when stepping below the first entry (index 0 is valid).
    (if (< next-index 0)
        (setq next-index (- len 1)))
    (setq *current-font-index* next-index)
    (set-default-font (nth *current-font-index* *font-ring*) t)
    (set-frame-font (nth *current-font-index* *font-ring*) t)))

(defun font-current ()
  (cdr (assoc 'font (frame-parameters))))

(defun font-set (ix)
  (setq *current-font-index* ix)
  (set-default-font (nth *current-font-index* *font-ring*) t)
  (set-frame-font (nth *current-font-index* *font-ring*) t))

(provide 'fontize)

I also bound these functions to keys, like this:
(global-set-key [?\C-+] 'inc-font-size)
(global-set-key [?\C--] 'dec-font-size)
(global-set-key [?\M-+] 'font-next)
(global-set-key [?\M--] 'font-prev)

Hope this helps someone in the same situation.

Wednesday, November 08, 2006

Nooks and Crannies of Ruby

There are many small parts of Ruby, tips, tricks and strange things. I thought that I would write about some of the more interesting of these, since some of them are common idioms in the Ruby community. The basis for the information is as always from the Pick-axe, but how these things are used in real life comes from various places.

The splat operator

The asterisk is sometimes called the splat operator when not used for multiplication. It is used in two different places for opposite cases. When on the right hand side of an expression, it is used to convert an array into more than one right hand value. This makes splicing of lists very easy and nice to do.
a,b,c = *[1,3,2]
Second, it's used on the left hand side to collect more than one right hand value into an array:
*a = 1,3,2
This makes no difference if you're calling a method or assigning variables. What matters is as usual with programming languages; that there is a left hand side and a right hand side (lhs and rhs from now on):
def foo(a,*b)
  p b
end

foo 1,2,3,*[4,5,6]
This is all old news, and not very exciting. It's useful and the basis for some niceties, but nothing overwhelming. The thing that is really nice about the rhs version of the splat operator is what it does if the value it's applied to isn't an array. Basically, the interpreter first checks if there is a to_ary-method available. If not, it goes for the to_a method. Now, Kernel has a default to_a-method so all objects will respond to to_a. Calling this method directly is deprecated, but if it's called through splat or Kernel#Array it doesn't generate a warning. So:
a = *1
will result in the same thing as
a = 1
except for jumping through some unnecessary hoops underneath the covers. But say that you have an object that implements Enumerable and you want to do something with it. Maybe you want to transform a Hash into an array of 2-element arrays; you can do it like this:
*a = *{:a=>1,:b=>2}
Now, this still isn't that useful. Oh, it's slightly useful but there is a method in Hash that does this too. But say that we have a file object:
*a = *open('/etc/passwd')
Since File includes Enumerable, it also has a to_a method which creates the array by using each to iterate and collect all elements. In this case all the lines in the file.
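As a small sketch of the two directions of the splat described above, the snippet below spreads an array, collects values, and splats objects that only become arrays through to_a. The Countdown class is made up for illustration, and the example deliberately sticks to cases that should behave the same way on current Ruby versions.

```ruby
# Countdown is a hypothetical class used only to illustrate the idea.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # Enumerable builds to_a (and everything else) on top of each.
  def each
    @from.downto(1) { |n| yield n }
  end
end

# rhs splat: spread an array over several variables.
a, b, c = *[1, 3, 2]

# lhs splat: collect several values into one array.
*collected = 1, 3, 2

# Splat on a non-array goes through to_a, here supplied by Enumerable.
*counted = *Countdown.new(3)

# The same works for a Hash, giving an array of 2-element arrays.
hash = {:a => 1, :b => 2}
*pairs = *hash
```

Nothing here depends on a file existing, which makes it easier to try out than the /etc/passwd example.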
Camping uses the splat operator in many places, mostly with the common idiom of taking any arguments offered and passing them all on as separate arguments again:
def foo(*args)
  bar(*args)  # bar stands in for whatever method receives the arguments
end

Symbols and to_proc

I hesitate to use the word neat, but I can't really find anything that better describes the sweet, sweet combination of symbols and to_proc. I'm going to show you a small example of how it's used before I explain this very common practice:
[1e3,/(foo)/,"abc",:hoho].collect &:to_s
Now, this code will not run without a small addition to your code base. But first of all, let's just walk through the code. First we define a literal array that contains four elements of different type. One Float, one Regexp, a String and a Symbol. Then we call collect to make a new array out of this. But where we usually provide collect with a block, we instead see the ampersand that symbolizes that we want to turn a Proc-object into a block argument for a method. But what comes next is not a variable, but a symbol. So, what happens? Well, the ampersand checks if the value provided to it is a Proc, and if not it calls to_proc on the value in question, if such a method is defined. And how should this method look? Like this:
class Symbol
  def to_proc
    lambda { |o| o.send(self) }
  end
end
Now, this method is nothing much. But it employs some fun trickery. It first creates a Proc by calling Kernel#lambda with a literal block. This block takes one argument, and the block calls the method send on the argument with itself as argument. As self in this case would be a symbol, and specifically the symbol :to_s in the above example, the end result is that the Proc returned will call to_s on each object yielded to the block. So, with this explanation it's easier to understand what the first example does. In effect it is exactly the same as
[1e3,/(foo)/,"abc",:hoho].collect {|v| v.to_s}
but without that nasty duplication of the v-argument. It's not a big saving, but many small savings...
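The ampersand mechanism isn't tied to Symbol; any object that answers to_proc can be used the same way. Here is a minimal sketch of that, where Plucker is a made-up class for illustration. Note that current Ruby versions ship Symbol#to_proc themselves, so the last line runs there without any additions.

```ruby
# Plucker is a hypothetical class; any object with to_proc works after &.
class Plucker
  def initialize(key)
    @key = key
  end

  # Called by Ruby when the object is passed with & in block position.
  def to_proc
    lambda { |hash| hash[@key] }
  end
end

people = [{:name => "Ola"}, {:name => "Matz"}]
names = people.collect(&Plucker.new(:name))

# With Symbol#to_proc available, the symbol version reads the same way.
strings = [1e3, "abc", :hoho].collect(&:to_s)
```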

I recommend installing facets, which includes numerous small, nice solutions like this. They can also be required separately, so if you have facets installed, just require 'facet/symbol/to_proc' to get this specific functionality included.

Using operators as method names

Ruby allows many more operators to be redefined than most languages do. This makes some interesting tricks possible, but most importantly it can make your code radically more readable. An excellent example of this can be found in the net/ldap-library (available as ruby-net-ldap from RubyGems). Now, LDAP uses something called filters for searching, and the syntax for filters is basically prefix notation with ampersand, pipe and exclamation mark for and, or and not, respectively. Now, with the net/ldap-library you can define a combined filter like this:
include Net
f = (LDAP::Filter.eq(:cn,'*Ola*') & LDAP::Filter.eq(:mail,'*ologix*')) |
  LDAP::Filter.eq(:uid,'olagus')
This defines a filter that basically says: find all entries where cn is '*Ola*' and mail is '*ologix*' or uid is 'olagus'. This is very readable thanks to the infix operators, and will be easy to understand for everyone who knows LDAP.
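To show how little is needed to get this kind of syntax, here is a toy sketch of a filter class with its own & , | and ~ operators. ToyFilter is invented for this illustration; it is not how net/ldap implements its filters, it only builds the prefix-notation string.

```ruby
# ToyFilter is a hypothetical class, not part of net/ldap.
class ToyFilter
  attr_reader :expr

  def initialize(expr)
    @expr = expr
  end

  # Build a simple equality filter such as (cn=*Ola*).
  def self.eq(attribute, value)
    new("(#{attribute}=#{value})")
  end

  # Infix & becomes LDAP's prefix "and".
  def &(other)
    ToyFilter.new("(&#{expr}#{other.expr})")
  end

  # Infix | becomes LDAP's prefix "or".
  def |(other)
    ToyFilter.new("(|#{expr}#{other.expr})")
  end

  # Unary ~ becomes LDAP's prefix "not".
  def ~
    ToyFilter.new("(!#{expr})")
  end

  def to_s
    expr
  end
end

f = (ToyFilter.eq(:cn, '*Ola*') & ToyFilter.eq(:mail, '*ologix*')) |
    ToyFilter.eq(:uid, 'olagus')
negated = ~(ToyFilter.eq(:uid, 'olagus'))
```

Each operator is just a method on the left operand that returns a new object, which is why the expressions chain so naturally.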

The next example comes from Hpricot, where _why puts the slash to good use:
doc = Hpricot(open(""))
(doc/"span.entryPermalink").set("class", "newLinks")
Note how neatly doc/"span..." fits in; it looks like XQuery, or any other path query syntax. But it's just regular Ruby code and the slash is just a method call. I'm really sad that /. isn't allowed as a method in this way... =)

Now, according to the Pickaxe, all of these infix operators will be translated from arg1 op arg2 into arg1.op(arg2). But Ruby still needs to be able to parse everything. This means that most operators need to have one required argument. Trying this with a home-defined *-operator will not work:
x = a *
But, an experimental syntax for importing packages in JRuby actually used this effect:
import java.util.*
This is just a simple exploitation of the fact that * is a regular method name and used like this will be parsed by Ruby like that too, which means it doesn't need an argument. So, which operators are available for your leisure? According to the Pickaxe, these are [], []=, **, !, ~, + (unary), - (unary), *, /, %, +, -, >>, <<, &, ^, |, <=, <, >, >=, <=>, ==, ===, !=, =~, !~.
Note that the method names when implementing the unary + and - are +@ and -@:
class String
  def -@
    swapcase
  end
end
The most important thing to remember when reusing operators like this is to not overdo it. Use it where it makes sense and is natural but not elsewhere. Remember that Ruby code should follow the principle of least surprise. The above example of using unary minus to return a swapcased version of the string is probably not obvious enough to warrant its use, for example.

Using lifecycle methods to simplify daily life

Inversion of control is all the rage in the Java world right now, but using callbacks of all kinds has always been a great way to make code readable and compact. The Observer pattern is used in many places, and I suspect it's implemented without any knowledge of the pattern in most places.

Ruby contains a few callback methods and lifecycle hooks that make life that much easier for the Ruby library writer. Probably the most useful of these is Module#included. Basically, this is a method you define like this:
module Enumerable
  def self.included(mod)
    puts "and now Enumerable has been used by #{mod.inspect}..."
  end
end
It will be called every time a module is included somewhere else.
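A slightly more self-contained sketch of the same hook, using a module of our own instead of reopening Enumerable. The Tracked module and the two classes are hypothetical names chosen for the example.

```ruby
# Tracked, Invoice and Customer are made-up names for this illustration.
module Tracked
  # Instance variable on the module object itself, holding every includer.
  @includers = []

  class << self
    attr_reader :includers
  end

  # Ruby calls this hook each time the module is included somewhere.
  def self.included(mod)
    @includers << mod
  end
end

class Invoice
  include Tracked
end

class Customer
  include Tracked
end
```

After both class definitions have run, Tracked.includers holds the two classes in the order they included the module.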

There are other callbacks that can be useful. Module#method_added, Module#method_removed, Module#method_undefined and counterparts for Kernel with singleton prefixed. Class#inherited is interesting. Through this you can actually keep track of all direct subclasses of your class and with some metaprogramming trickery (basically writing a new inherited for each subclass that does the same thing) you can get hold of the complete tree of subclasses. If you want that for some reason. I would for example use this approach for Test::Unit, rather than iterating over ObjectSpace. But I guess that's a matter of taste.

Class variables versus Class instance variables

This is one thing that always trips people up. Including me. Class variables are special variables that are associated with a class. They are referenced with two at-signs and a name, like @@name. So far, it's simple. But classes are also instances of Class, which means that these instances can have regular one-at-sign instance variables. These are not the same thing. Not at all. Something like this:
class Foo
  @@borg = []
  @me = nil

  def initialize
    @me = self
  end

  def self.add_borg
    @@borg << @me
  end
end
will result in a @@borg-list filled with nils. This is because the @me in add_borg, like the first @me in the class body, refers to an instance variable of the Foo object itself (an instance of Class), not the @me instance variable associated with an instance of the Foo-class.

Condensed lesson: Classes have instance variables of their own. These are rarely useful; they usually contribute to hard-to-find errors. And don't confuse them with class variables, which are a totally different kind of beast.
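The difference is easier to see when all three kinds of variables are readable side by side. This is only a sketch; the Counter class and its accessor names are invented for the example.

```ruby
# Counter is a hypothetical class used to contrast the variable kinds.
class Counter
  @@shared = 0          # class variable, shared by the class and its instances
  @own = :class_level   # instance variable of the Counter object (a Class)

  def initialize
    @own = :instance_level  # instance variable of this particular Counter
    @@shared += 1
  end

  # Reads the instance's @own.
  def own
    @own
  end

  # Reads the class object's @own, a different variable with the same name.
  def self.own
    @own
  end

  def self.shared
    @@shared
  end
end

counter = Counter.new
```

Creating the instance changes @@shared but leaves the class-level @own untouched, since initialize only ever sees the instance's own @own.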

Shortcuts: __FILE__ and ARGF

Ruby contains a myriad of shortcuts, many influenced by Perl and others invented to make it easier to write condensed programs. The regexp result globals are always good to have, but there are others that can be very useful too. Two that I like most are __FILE__ and ARGF. __FILE__ is also part of a very, very common idiom that the Pickaxe details. Combined with the global $0 it makes it easy to distinguish between a file being required and a file being executed. Basically, $0 contains the name of the file that has been executed. In C this would be argv[0]. __FILE__ is the full filename of the file the code can be found in. If these are the same, the current file is the one asked to execute. This is useful in many places. I use it often in gemspecs:
if $0 == __FILE__
  # code that should only run when this file is executed directly
end
If I run the file above with gem build, this part will not execute, but if I execute the file directly, it will run.

Matz sometimes likes to show how to implement the UNIX utility cat in Ruby:
puts *ARGF
This combines tip number uno in this blog entry with the constant ARGF. ARGF is a nice special object that when you reference it will open all the files named in ARGV. If you have any options in your ARGV you'd better remove them before referencing ARGF, though. Basically what you get when referencing ARGF is a file handle to the files named on the command line. And since a File has Enumerable and thus to_a, splat will read all the lines in all the files and combine them into an array and then splay the array into the call to puts which will print each line. Here you are, cat!

There are other globals and constants available, but most aren't as useful as the previously named. For example you can put __END__ on a line of its own, and the interpreter will stop reading code there; the rest of the file will be available through the constant DATA. I haven't seen anyone use this. It's a remnant from when Ruby was a tool to replace Perl, and the other scripting tools in UNIX.

Everything is runtime

Basically, the whole difference in Ruby compared to compiled languages is that everything happens at runtime. Actually, this difference can be seen when looking at Lisp too. In Common Lisp there are three different times when code can be evaluated: at compile-time, load-time and eval-time. In Java class-structure is fixed. You can't change class structure based on compile parameters (oh boy, sometimes I miss C-style macros). But in Ruby, everything is runtime. Everything happens at that time (except for constants... this is a different story). This means that class definitions can be customized based on environment. A typical example is this:
class Foo
  include Tracing if $DEBUG
end
This class will include the Tracing methods when the -d flag is provided, and leave them out when it's not. Basically there isn't much syntax in Ruby that couldn't be implemented in the language itself. A class declaration can be duplicated with Class.new:
Foo = Class.new do
  #class declarations go here
end
And almost all parts of a method-definition with def can be provided with define_method. The glaring mismatch (blocks) will be corrected with 1.9. Except for that, it's just sugar. If statements could be implemented with duck typing/polymorphism:
class TrueClass
  def if(t,f)
    t.call
  end
end

class FalseClass
  def if(t,f)
    f.call
  end
end

x = true

x.if lambda{ puts "true" }, lambda{ puts "false"}
And that's the real Lisp heritage of Ruby. There really isn't any essential syntax. Everything can be implemented with the basics of receiver, message, arguments, and blocks. Just remember that. It's the basis for all useful metaprogramming. There is no compile-time. Everything can change. "There is no spoon".
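As a rough sketch of how far plain method calls go, the snippet below builds a class without the class keyword and its methods without def. Greeter and the say_ method names are made up for the example.

```ruby
# Greeter and its method names are made up for this illustration.
Greeter = Class.new do
  [:hello, :goodbye].each do |word|
    # define_method takes the method body as a block.
    define_method("say_#{word}") do |name|
      "#{word} #{name}"
    end
  end
end

greeter = Greeter.new
greeting = greeter.say_hello("Ola")
farewell = greeter.say_goodbye("Ola")
```

Since the method names come from an ordinary array, the list could just as well be computed from configuration or the environment when the file is loaded.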

Announcing ActiveRecord-Mimer 0.0.1

The initial version of ActiveRecord-Mimer has been released.

The project aims to provide complete ActiveRecord support for the Mimer SQL database engine. This initial release provides the basis for that. Most operations work, including migrations. The only exceptions are rename_column and rename_table, which aren't supported by the underlying database engine.

The project resides at RubyForge:

and can be installed with RubyGems by
gem install activerecord-mimer

The code is released under an MIT license.

Monday, November 06, 2006

Another OpenSSL woe.

My interesting OpenSSL implementation exercise continues. I am now very close. Very, very close. I'm actually so close that SSLSocket and SSLServer work, provided that you use Anonymous Diffie-Hellman (which is suicidal, but that's another story). All of this has been submitted to my openssl-branch. What's missing is the X509-store and PKCS#7. And the X509-store doesn't really look good. Not good at all. It's needed for full SSL support. But the bad thing is this: there aren't any Java libraries that duplicate the functionality. Nada. At least not that I can find. The functionality needed is to read and write X509_store-formatted files and directories, to be able to add certificates and CRLs, and to verify a certificate against these, based on various interesting OpenSSL rules.

I wouldn't say that I dislike OpenSSL. I wouldn't say that I hate it either. It's very impressive in many ways. But boy. It seems I have to port a substantial part of it to Java, and I'm not looking forward to it. I need to do both a port and add support for KeyStore and CertStore so the Java SSLEngine also can use the information. Will this be an interesting exercise? Oh yes.

So, without further ado, this is the plea of this blog post: If you know of any easier way to do this, please tell me. Now! (where "this" is the X509_STORE-family of functions.)

Friday, November 03, 2006

JRuby versus Camping: Round 2

As undoubtedly some of you have discovered while trying, JRuby doesn't run Camping anymore. The culprit is a small feature in Ruby that we don't support yet. For some reason this feature is the preferred idiom for option parsing and Camping 1.5 introduced it, which means that Camping 1.5 will fail miserably in JRuby right now. There are more or less two ways around it, though. The first one is easy; just use a Camping version that is older than 1.5. The second is to change the offending code, which looks like this:
opts.on("-p", "--port NUM", "Port for web server (defaults to #{conf.port})") { |conf.port| }
To fix this particular instance, just change it into this:
opts.on("-p", "--port NUM", "Port for web server (defaults to #{conf.port})") { |v| conf.port=v }
Not much of a difference, really. In all places where assignment to a non-obvious node is done in the manner above, just replace it with a regular assignment, and Camping will run.
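For reference, here is the replacement form in a small standalone script using the standard optparse library. This is only a sketch: the Conf struct, the default port value and the argument list are assumptions made for the example, not Camping's actual code.

```ruby
require 'optparse'

# Conf and its default value are hypothetical stand-ins for Camping's config.
Conf = Struct.new(:port)
conf = Conf.new(3301)

opts = OptionParser.new do |o|
  # The block receives the parsed value and assigns it explicitly.
  o.on("-p", "--port NUM", Integer,
       "Port for web server (defaults to #{conf.port})") { |v| conf.port = v }
end

# parse returns whatever arguments were not consumed as options.
remaining = opts.parse(["-p", "8080", "app.rb"])
```

The explicit |v| plus assignment does the same job as the block-parameter assignment, just with one more name in play.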

Now, we are planning on fixing this syntax, but there is much development going down right now, and this change requires some changes in the parser, which is always interesting. But it will be there.

Thursday, November 02, 2006

Introducing TIJuAVA - Java with Type Inference

Every time I've written Java code lately, I've been painfully aware of how much unnecessary code I write every time. And most of this is Java's fault. This blog post is a very small thought experiment. TIJuAVA does not exist as software. Yet. If I someday have the time I would love to implement it, but there are more pressing needs right now.

So, what are the rules? Basically, all valid Java programs are valid TIJuAVA programs. Some valid TIJuAVA programs are not valid Java programs. Simply put, the main difference is that you don't need to declare a type for any local variables or member variables. Type declarations are only necessary in method declarations. You can declare local variables and member variables if you want to, and in certain very unlikely circumstances you will need to.

Let's take a very simple example. This code is taken from the JRuby source code, but I have added one or two things to make it easier to showcase:
package org.jruby.util.collections;

import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;

public class IdentitySet {
    private items = new ArrayList();

    public void add(Object item) {
        items.add(item);
    }

    public void remove(Object item) {
        iter = items.iterator();
        while (iter.hasNext()) {
            storedItem = iter.next();
            if (item == storedItem) {
                iter.remove();
                return;
            }
        }
    }

    public boolean contains(Object item) {
        iter = items.iterator();
        while (iter.hasNext()) {
            storedItem = iter.next();
            if (item == storedItem) {
                return true;
            }
        }
        return false;
    }

    private Collection getItems() {
        return items;
    }

    private void something(java.util.AbstractSet inp) {
        val1 = inp;
        for(iter = val1.iterator();iter.hasNext();) {
            iter.next();
        }
    }
}
This code doesn't really show all that can be done with this approach, and if I were to show a real example, this blog would be unbearably filled with code. So, this is just a tidbit.

The TIJuAVA system would need to be implemented as a Java two-pass compiler. Basically, the first pass finds all variable names that need to have a type inferred, and then walks through the information it's got, based on method signatures and methods called on the variable. In almost all cases it will be possible to come to one conclusion on which type to use. The compiler would then generate regular Java bytecode, basically the same bytecode that would have been generated had you written the types by hand.

Of course, most people use IDEs to write code nowadays. Wizards and code generators and what not. So why something like this? Well, even though your IDE writes your code for you, it is still there, and you still have to understand it at some level. If not when writing, you would still need to read it. And boy do type declarations clutter things. Especially generics. And here is one interesting tidbit. Generic types would also be possible to infer in most cases.

Another thing that could be easily added is some kind of in-place literal syntax for lists and maps. This would be more like a macro feature, but the list syntax would mostly just be a call to Arrays.asList, which isn't too bad.

An objection that I anticipate is from people who think that the code will be less readable with the type pointers removed. This should be more of a problem when you have large methods, but everyone these days uses refactorings so they won't have methods with a LOC over 20. And if that's the case, the local variables should be easily understood from the operations that are used on them.

So. Someday, when I have time, this may be reality. If anyone is interested, that is.