Wednesday, August 08, 2007

Ruby is hard

I have been doing Ruby programming for several years now, and been closely involved with the JRuby implementation of it for about 20 months. I think I know the language pretty well, and I think I have for a long time. This has blinded me to something which I've just recently started to recognize. Ruby is really hard. It's hard in the same way LISP is hard. But there are also lots of gotchas and small things that can trip you up.

Ruby is a really nice language. It's very easy to learn and get started in, and you will be productive extremely quickly, but to actually master the language takes much time. I think this is underestimated in many circles. Ruby uses lots of language constructs which are extremely powerful, but they require you to shift your thinking - especially when coming from Java, which many current Ruby converts do.

As a small example of a gotcha, what does this code do:
class Foo
  attr_accessor :bar
  def initialize(value)
    bar = value
  end
end

puts Foo.new(42).bar

If your answer is that it prints 42, you're wrong. It will print nil. What's the necessary change to fix this? Introduce self:
class Foo
  attr_accessor :bar
  def initialize(value)
    self.bar = value
  end
end

puts Foo.new(42).bar

You could argue that this behavior is annoying. That it's bad. That it's wrong. But that's the way the language works, and I've seen this problem in many code bases - including my own - so it's worth keeping an eye open for it.

Why does it happen? The problem is this: when the Ruby parser finds a word that begins with an underscore or a lowercase letter, and that word is followed by an equals sign, the parser has no way to know whether this is a method call (the method foo=) or a variable assignment. The parser falls back to assuming it is a variable assignment and lets you specify explicitly when it should be a VCall. So what happens in the first example is that a new local variable called "bar" is introduced and assigned to.
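As a rough illustration of the paragraph above, the sketch below uses `local_variables` to show which names Ruby treats as locals in each case. The method names `assign_local` and `assign_via_setter` are made up for this example.

```ruby
class Foo
  attr_accessor :bar

  # Hypothetical helper: the bare assignment is parsed as a local variable.
  def assign_local(value)
    bar = value
    local_variables
  end

  # Hypothetical helper: the explicit receiver makes this a call to bar=.
  def assign_via_setter(value)
    self.bar = value
    local_variables
  end
end

foo = Foo.new
p foo.assign_local(42)       # prints [:value, :bar]
p foo.bar                    # prints nil
p foo.assign_via_setter(42)  # prints [:value]
p foo.bar                    # prints 42
```

In the first method a local named `bar` shows up in the list and the attribute stays nil; in the second no such local exists and the attribute is set.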

14 comments:

Piers Cawley said...

Any other behavior is still more unreasonable though.

Piers Cawley said...

Damn, hit return in the wrong box.

The reason other behaviour is unreasonable can be seen in the following (admittedly bordering on the psychotic) code. If 'foo = whatever' were dispatched to #foo= if the object responded to it, you could have:

  class Foo
    def calculate(value)
      foo = value * value
      return foo + 10
    end
  end

  o = Foo.new
  o.calculate(10) # => 110
  Foo.attr_accessor :foo
  o.calculate(10) # => complains that nil doesn't know about *

which wouldn't really be very nice.

Ola Bini said...

Nope. That would not be the case. I didn't mention it in the text, but BOTH attr_accessors will refer to either variable or vcall. So the second calculate would work fine, with the side effect of setting @foo each time it was used. Not perfect, of course.

On the other hand, if you're working with local variables that have the same names as methods, you are bound to be confused in most cases. I would say that's a thing to avoid in almost all cases, since it makes your code hard to read.

Kevin Williams said...

I would have used "@bar = value". I get "42" as the output when I do this. Am I doing something wrong? I've read that the attr_* methods implicitly create instance variables (@varname). Am I wrong?

Dave said...

Attribute accessors don't so much create instance variables as reference them. That is, instance variables are just there (with a default value of nil) and accessors let you get at them via method calls.
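A small sketch of this point, assuming a modern Ruby where `instance_variables` returns symbols: declaring the accessor does not create the variable, and reading it before assignment just gives nil.

```ruby
class Foo
  attr_accessor :bar
end

foo = Foo.new
p foo.instance_variables  # prints [] because nothing has been assigned yet
p foo.bar                 # prints nil, the value of an unset instance variable
foo.bar = 42
p foo.instance_variables  # prints [:@bar]
p foo.bar                 # prints 42
```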



Matthieu Riou said...

Doing @foo= instead of this.foo= breaks encapsulation. It's arguable whether encapsulation is still interesting when you're inside the class concerned; in most cases you'll be safe.

Now say that you're using Rails and the said class is an ActiveRecord. Doing @foo= won't work because ActiveRecord relies on the accessors to know that state has been changed (and persist the modification). Small but important gotchas...
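The general idea can be sketched in plain Ruby. This is not how ActiveRecord is implemented; `TrackedRecord`, `changed`, and the two helper methods are hypothetical names, and the only point is that a setter can do bookkeeping that a direct instance variable assignment skips.

```ruby
# Hypothetical class, not ActiveRecord: the setter records which attributes changed.
class TrackedRecord
  attr_reader :foo, :changed

  def initialize
    @changed = []
  end

  def foo=(value)
    @changed << :foo unless @changed.include?(:foo)
    @foo = value
  end

  def set_directly(value)
    @foo = value       # bypasses foo=, so no change is recorded
  end

  def set_via_accessor(value)
    self.foo = value   # goes through foo=, so the change is recorded
  end
end

record = TrackedRecord.new
record.set_directly(1)
p record.changed  # prints []
record.set_via_accessor(2)
p record.changed  # prints [:foo]
```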

Btw Ola, I like the small touch of Swedish :)

KissTheGoat said...

It should be possible for Ruby to warn you that the variable is shadowing an attribute, at 'syntax checking time' or with some other flag.

Pardon if this is double-posted.

toby said...

I have run into this gotcha too. I wanted to be able to do something like:
class Foo
  attr_accessor :bar
  def do_it(&block)
    instance_eval(&block)
  end
end

Foo.new.do_it {
  bar = 10
}

And this would never work, because of the same problem. The best method I have resorted to so far is:
class Foo
  attr_accessor :bar
  def do_it
    yield self
  end
end

Foo.new.do_it {|f|
  f.bar = 10
}
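For comparison, here is a minimal sketch of the `instance_eval` variant with an explicit `self` receiver inside the block. It assumes `attr_accessor` for the attribute, and `do_it` returns `self` here only so the result can be inspected.

```ruby
class Foo
  attr_accessor :bar

  def do_it(&block)
    instance_eval(&block)
    self  # returned only so the example can inspect the object afterwards
  end
end

shadowed = Foo.new.do_it { bar = 10 }       # creates a local variable in the block
explicit = Foo.new.do_it { self.bar = 10 }  # calls bar= on the Foo instance

p shadowed.bar  # prints nil
p explicit.bar  # prints 10
```

The block still needs `self.` for assignments, so this saves the block parameter but not the explicit receiver.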

xmlblog said...

> Doing @foo= instead of this.foo= breaks encapsulation.

I don't agree with this rigid view of encapsulation at all. By that logic the only place in a class allowed to access @foo would be the getter. At the very least, the constructor should not be chastised for setting the attribute directly. And as you stated, in most cases the rest of the code in the class can often do so as well unless you're being clever with your getters (ex: lazy initialization).
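A sketch of the lazy initialization case, with a hypothetical `Report` class: code that reads the instance variable directly can see nil if the getter has not run yet, while code that goes through the getter always gets the initialized value.

```ruby
# Hypothetical class used only to illustrate a lazily initialized getter.
class Report
  def rows
    @rows ||= load_rows  # computed on first access only
  end

  def count_via_getter
    rows.size
  end

  def count_via_ivar
    @rows ? @rows.size : 0  # @rows is nil until the getter has been called
  end

  private

  def load_rows
    [1, 2, 3]  # stand-in for an expensive lookup
  end
end

report = Report.new
p report.count_via_ivar    # prints 0
p report.count_via_getter  # prints 3
p report.count_via_ivar    # prints 3
```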

toby said...

I was wondering how C# handled this case, so I checked it out. It turns out you CAN reference properties inside your class:
Bar = 10
but there's no ambiguity here because to declare a variable you have to specify the type first:
int Bar = 10

Mark said...

That's not really a gotcha, at least not like (say) Perl has gotchas.

I rarely program in Ruby but I clearly remember reading about this case in the early chapters of the PickAxe book.

tn said...

Hi Ola - this inspired a new quickfix for NetBeans :) See
I couldn't figure out how to leave a trackback instead.

James said...

I was just wondering how this is much different from Java or any other programming language?

In Java you would declare int foo; outside of the initializer, and trying to just call foo inside of the initializer wouldn't work, would it? You would need this.foo inside of the method too? Am I missing something though?

Dabisa said...


I'm a Ruby newbie :)

However, I would use "@bar = value" instead of "bar = value" simply because @ means that this is an instance variable, while "bar" (without @) is a local variable of the function (it is destroyed after the function is executed). Using "self.bar" seems to be the same thing as using "@bar".

Using "@bar" does not break encapsulation in any way. Encapsulation protects instance variables from being used outside of class and not from the class itself.