Wednesday, September 20, 2006

Ruby Metaprogramming techniques

Updated: Scott Labounty wondered how the trace example could work and since a typical metaprogramming technique is writing before- and after-methods, I have added a small version of this.
Updated: Fixed two typos, found by Stephen Viles


I have been thinking a lot about metaprogramming lately, and I have come to the conclusion that I would like to see more examples and explanations of these techniques. For better or worse, metaprogramming has entered the Ruby community as the standard way of accomplishing various tasks and of compressing code. Since I couldn't find any good resources of this kind, I will get the ball rolling by writing about some common Ruby techniques. These tips are probably most useful for programmers who come to Ruby from another language or who haven't yet experienced the joy of Ruby metaprogramming.

1. Use the singleton-class

Many ways of manipulating single objects are based on manipulations on the singleton class and having this available will make metaprogramming easier. The classic way to get at the singleton class is to execute something like this:
 sclass = (class << self; self; end)
RCR231 proposes the method Kernel#singleton_class with this definition:
 module Kernel
   def singleton_class
     class << self; self; end
   end
 end
I will use this method in some of the next tips.
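As a minimal sketch of what the singleton class buys you, the following defines a method on one object only. It assumes singleton_class is available (it is built in from Ruby 1.9.2 onwards; on older versions, define it as above). The variable names greeter and other are made up for the illustration.

```ruby
# Assumes singleton_class exists (built in since Ruby 1.9.2).
greeter = Object.new
other   = Object.new

# Define a method on the singleton class of one object only.
greeter.singleton_class.class_eval do
  def hello
    "hello from just this object"
  end
end

greeter.hello               # => "hello from just this object"
other.respond_to?(:hello)   # => false
greeter.singleton_methods   # => [:hello]
```

The method lives on greeter's singleton class, so no other instance of Object is affected.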

2. Write DSLs using class-methods that rewrite subclasses

When you want to create a DSL for defining information about classes, the most common trouble is how to represent the information so that other parts of the framework can use it. Take this example where I define an ActiveRecord model object:
 class Product < ActiveRecord::Base
   set_table_name 'produce'
 end
In this case, the interesting call is set_table_name. How does that work? Well, there is a small amount of magic involved. One way to do it would be like this:
 module ActiveRecord
   class Base
     def self.set_table_name name
       define_attr_method :table_name, name
     end

     def self.define_attr_method(name, value)
       singleton_class.send :alias_method, "original_#{name}", name
       singleton_class.class_eval do
         define_method(name) do
           value
         end
       end
     end
   end
 end
What's interesting here is define_attr_method. In this case we need to get at the singleton class for the Product class, but we do not want to modify ActiveRecord::Base. By using singleton_class we can achieve this. We have to use send to alias the original method since alias_method is private (it became public in Ruby 2.5, but send works either way). Then we just define a new accessor which returns the value. If ActiveRecord wants the table name for a specific class, it can just call the accessor on the class. This way of dynamically creating methods and accessors on the singleton class is very common, and especially so in Rails.
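To try the idea without Rails, here is a self-contained sketch. MiniRecord is a hypothetical stand-in invented for this example, not ActiveRecord's real code, and the default table_name rule (class name plus "s") is likewise an assumption made only so there is something to override.

```ruby
# MiniRecord is a hypothetical stand-in for ActiveRecord, for illustration only.
module MiniRecord
  class Base
    # Assumed default: derive the table name from the class name.
    def self.table_name
      name.downcase + "s"
    end

    def self.set_table_name(value)
      define_attr_method :table_name, value
    end

    def self.define_attr_method(attr, value)
      # Keep the previous implementation reachable as original_<attr>.
      singleton_class.send :alias_method, "original_#{attr}", attr
      # Override the accessor on this class's singleton class only.
      singleton_class.send(:define_method, attr) { value }
    end
  end
end

class Product < MiniRecord::Base
  set_table_name 'produce'
end

class Order < MiniRecord::Base
end

Product.table_name            # => "produce"
Product.original_table_name   # => "products"
Order.table_name              # => "orders"
```

Because the override lands on Product's singleton class, Order and MiniRecord::Base keep the default behaviour.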

3. Create classes and modules dynamically

Ruby allows you to create and modify classes and modules dynamically. You can do almost anything you would like on any class or module that isn't frozen. This is very useful in certain places. The Struct class is probably the best example, where
 PersonVO = Struct.new(:name, :phone, :email)
 p1 = PersonVO.new("Ola Bini")
will create a new class, assign this to the name PersonVO and then go ahead and create an instance of this class. Creating a new class from scratch and defining a new method on it is as simple as this:
 c = Class.new
 c.class_eval do
   define_method :foo do
     puts "Hello World"
   end
 end

 c.new.foo # prints "Hello World"
Apart from Struct, examples of creating classes on the fly can be found in SOAP4R and Camping. Camping is especially interesting, since it has methods that create these classes, and you are supposed to inherit your controllers and views from them. Much of the interesting functionality in Camping is actually achieved in this way. From the unabridged version:
 def R(*urls); Class.new(R) { meta_def(:urls) { urls } }; end
This makes it possible for you to create controllers like this:
 class View < R '/view/(\d+)'
   def get post_id
   end
 end
You can also create modules in this way, and include them in classes dynamically.
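A small sketch of the module variant: Module.new takes a block just like Class.new, and the result can be included at runtime. The helper accessors_module and the class DynamicPerson are hypothetical names chosen for this example.

```ruby
# Build a module at runtime; the helper and attribute names are made up.
def accessors_module(*attrs)
  Module.new do
    attrs.each do |attr|
      define_method(attr) { instance_variable_get("@#{attr}") }
      define_method("#{attr}=") { |v| instance_variable_set("@#{attr}", v) }
    end
  end
end

DynamicPerson = Class.new
# send keeps this working on Ruby versions where include is private.
DynamicPerson.send :include, accessors_module(:name, :email)

person = DynamicPerson.new
person.name = "Ola Bini"
person.name    # => "Ola Bini"
person.email   # => nil
```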

4. Use method_missing to do interesting things

Apart from blocks, method_missing is probably the most powerful feature of Ruby. It's also one that is easy to abuse. A lot of code can be simplified dramatically by good use of method_missing, and some things aren't even possible without it. A good example (also from Camping) is an extension to Hash:
 class Hash
   def method_missing(m,*a)
     if m.to_s =~ /=$/
       self[$`] = a[0]
     elsif a.empty?
       # look up by string key, matching the string key stored by the setter branch
       self[m.to_s]
     else
       raise NoMethodError, "#{m}"
     end
   end
 end
This code makes it possible to use a hash like this:
 x = {'abc' => 123}
 x.abc # => 123
 x.foo = :baz
 x # => {'abc' => 123, 'foo' => :baz}
As you see, if someone calls a method that doesn't exist on the hash, the method name is looked up as a key in the hash instead. If the method name ends with an =, a value will be set with the key of the method name excluding the equal sign.

Another nice method_missing technique can be found in Markaby. The code I'm referring to makes it possible to emit any XHTML tag, with CSS classes added to it. This code:
 body do
   h1.header 'Blog'
   div.content do
     'Hellu'
   end
 end
will emit this XML:
 <body>
   <h1 class="header">Blog</h1>
   <div class="content">
     Hellu
   </div>
 </body>
Most of this functionality, especially the CSS class names, is created by having a method_missing that sets attributes on self and then returns self again.
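The "set an attribute, then return self" idea can be sketched in a few lines. Tag is a hypothetical helper written for this post, not Markaby's actual implementation; names starting with to_ are passed on to super so that Ruby's implicit conversions (to_ary, to_str) are not hijacked.

```ruby
# Tag is a hypothetical helper, not Markaby's real code.
class Tag
  def initialize(name)
    @name = name
    @classes = []
  end

  # Record any unknown method name as a CSS class and return self,
  # so calls can be chained: Tag.new(:div).content.wide
  def method_missing(css_class, *args)
    return super if !args.empty? || css_class.to_s.start_with?("to_")
    @classes << css_class.to_s
    self
  end

  def respond_to_missing?(name, include_private = false)
    !name.to_s.start_with?("to_") || super
  end

  def render(text)
    attrs = @classes.empty? ? "" : ' class="' + @classes.join(' ') + '"'
    "<#{@name}#{attrs}>#{text}</#{@name}>"
  end
end

Tag.new(:h1).header.render("Blog")   # => "<h1 class=\"header\">Blog</h1>"
```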

5. Dispatch on method-patterns

This is an easy way to achieve extensibility in ways you can't anticipate. For example, I recently created a small framework for validation. The central Validator class will find all methods in self that begin with check_ and call each of them, making it very easy to add new checks: just add a new method to the class, or to one instance.
 methods.grep(/^check_/) do |m|
   self.send m
 end

This is really easy, and incredibly powerful. Just look at Test::Unit which uses this method all over the place.

6. Replacing methods

Sometimes a method implementation just doesn't do what you want. Or maybe it only does half of it. The standard Object Oriented Way (tm) is to subclass and override, and then call super. This only works if you have control over the object instantiation for the class in question. This is often not the case, and then subclassing is worthless. To achieve the same functionality, alias the old method and add a new method definition that calls the old method. Make sure that the previous method's pre- and postconditions are preserved.
 class String
   alias_method :original_reverse, :reverse

   def reverse
     puts "reversing, please wait..."
     original_reverse
   end
 end

Also, a twist on this technique is to temporarily alias a method and then restore the original. For example, you could do something like this:
 def trace(*mths)
   add_tracing(*mths) # aliases the methods named, adding tracing
   yield
 ensure
   remove_tracing(*mths) # removes the tracing aliases, even if the block raises or returns early
 end
This example shows a typical way one could code the add_tracing and remove_tracing methods. It depends on singleton_class being available, as per tip #1:
 class Object
   def add_tracing(*mths)
     mths.each do |m|
       singleton_class.send :alias_method, "traced_#{m}", m
       singleton_class.send :define_method, m do |*args|
         $stderr.puts "before #{m}(#{args.inspect})"
         ret = self.send("traced_#{m}", *args)
         $stderr.puts "after #{m} - #{ret.inspect}"
         ret
       end
     end
   end

   def remove_tracing(*mths)
     mths.each do |m|
       # restore the original implementation, then drop the helper alias
       singleton_class.send :alias_method, m, "traced_#{m}"
       singleton_class.send :remove_method, "traced_#{m}"
     end
   end
 end

 "abc".add_tracing :reverse
If these methods were added to Module (with a slightly different implementation; see if you can get it working!), you could also add and remove tracing on classes instead of instances.
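An alternative to aliasing is to capture the old method in a closure, which avoids name collisions and leaves no alias behind. The sketch below uses instance_method and define_method; Shouter is a hypothetical class made up for the example.

```ruby
# Shouter is a hypothetical class used only for this sketch.
class Shouter
  def shout(word)
    word.upcase + "!"
  end
end

class Shouter
  # Capture the current implementation as an UnboundMethod in a local
  # variable; no alias name is left behind on the class.
  original = instance_method(:shout)

  define_method(:shout) do |word|
    result = original.bind(self).call(word)
    "[traced] " + result
  end
end

Shouter.new.shout("hello")   # => "[traced] HELLO!"
```

The trade-off is that the old implementation is only reachable through the closure, so it cannot easily be restored later.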

7. Use NilClass to implement the Introduce Null Object refactoring

In Fowler's Refactoring, the refactoring called Introduce Null Object is for situations where a variable could either contain an object or null, and if it's null a predefined value is used instead. A typical example would be this:
 name = x.nil? ? "default name" : x.name
Now, the refactoring is based on Java, which is why it recommends creating a subclass of the class in question, whose instance gets used where the value would have been null. For example, a NullPerson class will inherit from Person and override name to always return the "default name" string. But in Ruby we have open classes, which means you can do this:
 def nil.name; "default name"; end
 x # => nil
 name = x.name # => "default name"

8. Learn the different versions of eval

There are several evaluation primitives in Ruby, and it's important to know the differences between them and when to use which. The available contestants are eval, instance_eval, module_eval and class_eval. First, class_eval is an alias for module_eval. Second, there are some differences between eval and the others. Most importantly, eval only takes a string to evaluate, while the others can evaluate a block instead. That means that eval should be your absolute last resort. It has its uses, but mostly you can get away with just evaluating blocks with instance_eval and module_eval.

Eval will evaluate the string in the current environment or, if a binding is provided, in that environment. (See tip #11).

Instance_eval will evaluate the string or the block in the context of the receiver. Specifically, this means that self will be set to the receiver while evaluating.

Module_eval will evaluate the string or the block in the context of the module it is called on. This sees much use for defining new methods on modules or singleton classes. The main difference between instance_eval and module_eval lies in where the methods defined will be put. If you use String.instance_eval and do a def foo inside, this will be available as String.foo, but if you do the same thing with module_eval you'll get String.new.foo instead.

Module_eval is almost always what you want. Avoid eval like the plague. Follow these simple rules and you'll be OK.
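The difference in where def puts its methods is easiest to see side by side. Widget is a hypothetical class used only for this sketch, and the last part assumes the built-in singleton_class of Ruby 1.9.2 and later.

```ruby
# Widget is a hypothetical class used only for this sketch.
class Widget; end

# def inside instance_eval lands on the singleton class: a class method.
Widget.instance_eval do
  def made_by_instance_eval
    "class-level"
  end
end

# def inside module_eval lands on the class itself: an instance method.
Widget.module_eval do
  def made_by_module_eval
    "instance-level"
  end
end

Widget.made_by_instance_eval     # => "class-level"
Widget.new.made_by_module_eval   # => "instance-level"

# define_method always defines an instance method of its receiver, so a
# class method has to be defined on the singleton class instead.
Widget.singleton_class.send(:define_method, :made_by_define_method) { "also class-level" }
Widget.made_by_define_method     # => "also class-level"
```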

9. Introspect on instance variables

A trick that Rails uses to make instance variables from the controller available in the view is to introspect on an object's instance variables. This is a grave violation of encapsulation, of course, but can be really handy sometimes. It's easy to do with instance_variables, instance_variable_get and instance_variable_set. To copy all instance variables from one object to another, you could do it like this:
 from.instance_variables.each do |v|
   to.instance_variable_set v, from.instance_variable_get(v)
 end
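A runnable version of the copy loop, as a sketch: DemoController and DemoView are hypothetical stand-ins and have nothing to do with the real Rails classes.

```ruby
# DemoController and DemoView are hypothetical stand-ins, not Rails classes.
class DemoController
  def initialize
    @title = "Products"
    @items = ["apple", "pear"]
  end
end

class DemoView
  def render
    "#{@title}: #{@items.join(', ')}"
  end
end

from = DemoController.new
to   = DemoView.new

# Copy every instance variable across, name by name.
from.instance_variables.each do |v|
  to.instance_variable_set v, from.instance_variable_get(v)
end

to.render   # => "Products: apple, pear"
```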

10. Create Procs from blocks and send them around

Materializing a Proc, saving it in a variable and sending it around makes many APIs very easy to use. This is one of the ways Markaby manages those CSS class definitions. As the Pickaxe details, it's easy to turn a block into a Proc:
 def create_proc(&p); p; end
 create_proc do
   puts "hello"
 end # => #<Proc ...>
Calling it is just as easy:
 p.call(*args)
If you want to use the proc for defining methods, you should use lambda to create it, so return and break will behave the way you expect:
 p = lambda { puts "hoho"; return 1 }
 define_method(:a, &p)
Remember that method_missing will provide a block if one is given:
 def method_missing(name, *args, &block)
   return super unless block # without a block, fall back to the normal NoMethodError
   block.call(*args)
 end

 thismethoddoesntexist("abc","cde") do |*args|
   p args
 end # prints ["abc", "cde"]
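To show what "sending Procs around" can look like in an API, here is a small sketch of a callback registry. EventHub and its on/fire methods are hypothetical names invented for the example.

```ruby
# EventHub is a hypothetical class for this sketch.
class EventHub
  def initialize
    @handlers = Hash.new { |hash, key| hash[key] = [] }
  end

  # &handler turns the block into a Proc that can be stored for later.
  def on(event, &handler)
    @handlers[event] << handler
    self
  end

  # Call every stored Proc and collect the results.
  def fire(event, *args)
    @handlers[event].map { |handler| handler.call(*args) }
  end
end

hub = EventHub.new
hub.on(:greet) { |name| "Hello, #{name}" }
hub.on(:greet) { |name| "Goodbye, #{name}" }

hub.fire(:greet, "Ola")   # => ["Hello, Ola", "Goodbye, Ola"]
```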

11. Use binding to control your evaluations

If you do feel the need to really use eval, you should know that you can control what variables are available when doing this. Use the Kernel-method binding to get the Binding-object at the current point. An example:
 def get_b; binding; end
 foo = 13
 eval("puts foo",get_b) # => NameError: undefined local variable or method `foo' for main:Object
This technique is used in ERb and Rails, among others, to set which instance variables are available. As an example:
 class Holder
   def get_b; binding; end
 end

 h = Holder.new
 h.instance_variable_set "@foo", 25
 eval("@foo",h.get_b) # => 25
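The same idea with a real template, as a sketch: ERB#result accepts a binding, so a template sees the instance variables of whatever object supplied it. Page and the file name generated_code.rb are hypothetical names for this example. The second half shows the optional file and line arguments to eval, which make backtraces from evaluated code point somewhere useful.

```ruby
require 'erb'

# Page is a hypothetical class for this sketch.
class Page
  def initialize(title)
    @title = title
  end

  # Expose this object's binding so templates can see its instance variables.
  def get_b; binding; end
end

page = Page.new("Metaprogramming")
template = ERB.new("<h1><%= @title %></h1>")
html = template.result(page.get_b)   # => "<h1>Metaprogramming</h1>"

# eval also accepts a file name and a line number; they show up in
# backtraces raised from the evaluated code.
first_frame =
  begin
    eval("raise 'boom'", page.get_b, "generated_code.rb", 42)
  rescue => e
    e.backtrace.first   # starts with "generated_code.rb:42"
  end
```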


Hopefully, some of these tips and techniques have clarified metaprogramming for you. I don't claim to be an expert on either Ruby or Metaprogramming. These are just my humble thoughts on the matter.

27 comments:

Anonymous said...

Interesting material for a newcomer to metaprogramming. Thanks for posting it. I especially like defining methods on NilClass -- very cute.

A couple of minor points:

The link for "RCR231" currently goes to RCR213 "Extended Access to the DATA Pseudo-File", which confused me a little until I worked it out.

"In this case, the interesting call is has_many" -- should this be set_table_name?

Ola Bini said...

Thanks for your corrections, they have been added!

Dave Hoover said...

I have been thinking much about Metaprogramming lately. I have come to the conclusion that I would like to see more examples and explanations of these techniques.

Have a look at Jay Fields Thoughts. Great post!

Anonymous said...

Thanks Ola. I've found some great meta programming nuggets scattered in blogs, mailing lists etc. but seeing a whole bunch of them together is a treat. I hope you'll continue to add to these or elaborate existing ones in future.

Anonymous said...

I think some Ruby metaprogramming snippets can also be found here: http://bigbold.com/snippets/ !

Anonymous said...

Ola,

This is an interesting and great post. Could you elaborate more on the trace example? I can't see how the add_tracing and remove_tracing would actually be implemented even though I know what they *should* do.

Unknown said...

Do you sleep? :) Keep up the posts, your blog is one of the most visited Ruby/Rails blogs in my collection.

Anonymous said...

Hey Ola,

really great stuff.

Still trying to get my head around the metaprogramming stuff and learned a lot from your post.

Regarding 2. why do you use the singleton class? Why can't you just define table_name on Product? Because you don't have a reference to the Product class inside of define_attribute_method?

Cheers,
Mariano

Ola Bini said...

Mariano!

Well, a few reasons. First, I wanted to show the technique of DSL's. Having the ActiveRecord user write
def self.table_name
  "products"
end
instead of
set_table_name "products"
makes quite a difference for the readability.

The base class doesn't really know about the Product class, though, and in the generic case the singleton class is the only safe place for such a method.

Mariano said...

>>Having the ActiveRecord user write
[..] instead of
set_table_name "products"
makes quite a difference for the readability.

Yeah, sure got that. I was more thinking along the lines, why you alias the method in the singleton class of Product, rather than defining it on Product itself.

And why would you alias the existing method? It is of no use anymore, is it?

Btw. I am not trying to be a smartass, I am just trying to understand it.

Here is my line of thinking:

module ActiveRecord
  class Base

    def self.set_table_name(name)
      self.class_eval "def self.table_name; \"#{name}\"; end"
    end
  end
end
end

class Product < ActiveRecord::Base; set_table_name "produce"; end

puts Product.table_name # => "produce"

The interface for the Product developer doesn't change, s/he can still use set_table_name, but I believe now the class Product has been extended instead of the singleton.

And that is my original question.

Mariano said...

Btw. Any idea what method to use instead of the ugly class_eval here?

Ola Bini said...

Hi,

Yes, that method would work in that case, but notice that you have no syntax safety at all in what you're doing here. If you instead use singleton_class to get at the metaclass, and use define_method on the singleton_class, you would get rid of the class_eval. (You do know that you could send a block to class_eval too?)

So, the short answer, the singleton class is necessary to make what we're doing more explicit.

Mariano said...

Ola, thanks for taking the time to clear that up.

I am happy now ;-)

Anonymous said...

Ola,

Thanks, that's just what I was looking for. This is a really great resource on ruby metaprogramming and I hope you'll continue to discuss this in your blog and to add examples as you go along.

Anonymous said...

If you're going to replace methods, consider capturing the old method in a closure in order to keep it, rather than aliasing it and risking collision and leaving behind a mess.

-Brian

monkeeboi said...

Interesting post. I'm still trying to digest the whole singleton class bit. Will probably take a few more reads.

I'm a bit rusty on my ruby but I think you might need to do this or the traced methods might not get cleaned up if the inner block does a return or a break.

def trace(*mths)
  add_tracing(*mths)
  begin
    yield
  ensure
    remove_tracing(*mths)
  end
end

Anonymous said...

There's a small bug in the Hash / method_missing code example: self[m] should be self[m.to_s].

Anonymous said...

Hi,
I submitted this story to Digg.
If you like the post, please add your diggs.
Ruby Metaprogramming techniques
http://www.digg.com/programming/Top_11_Ruby_Metaprogramming_Techniques

Mov 0 To 1 said...

You should also use the extra two arguments to eval calls: file and line number. This will allow you to get useful debug information when something breaks inside your meta-program. see my article on Meta Programming and Stack Traces http://ghouston.blogspot.com/2006/06/ruby-meta-programming-and-stack-traces.html

chunter said...

Nice!

I actually use a couple of the techniques already, and I can't wait to try the rest.

Anonymous said...

Is there a JavaScript equivalent of each of these techniques?

Michele said...

The hash method doesn't work for string keys, only for symbol keys.

a = {:name => "arne anka", "drinks" => "beer"}

p a.name # -> "arne anka"
p a.drinks # -> nil

Changing self[m] to self[m.to_s] reverses the behaviour.

Michele said...

I meant that the behaviour is switched when changing self[m] to self[m.to_s], not reversed.

Anonymous said...

Thank you very much for this inspiration!

Unknown said...

Hi, great post. A small comment on point 4 about method_missing: with the code you used here, calling x.foo will return nil.

To return :baz (the value in the hash), the code should be:

class Hash
  def method_missing(m,*a)
    if m.to_s =~ /=$/
      self[$`] = a[0]
    elsif a.empty?
      self[m.to_s]
    else
      raise NoMethodError, "#{m}"
    end
  end
end

Unknown said...

"If you use String.instance_eval and do a def foo inside, this will be available as String.foo, but if you do the same thing with module_eval you'll get String.new.foo instead."

No matter how hard I try, I can't reproduce this behavior -- i.e. I can't find a way to define a String.foo using instance_eval (or class_eval, for that matter). What am I missing?

Unknown said...

Solved my own problem -- I can reproduce it if I use "def" (as you wrote), but I was using define_method. It seems the only way to use define_method to dynamically define methods is via the singleton class.