Saturday, September 23, 2006

Three ways to add Ruby Macros

As most of my readers have probably realized by now, I have a few obsessions. Lisp and Ruby happen to be two of the more prominent ones. And regarding Lisp, macros are what especially interest me. I have been thinking a lot lately about how you could go about adding some kind of macro facility to Ruby, and these three options are the result.

I should begin by saying that none of these options are entirely practical right now. All of them have some serious problems which I frankly haven't been able to come up with an answer for yet. But that doesn't stop me from blogging about my ideas, of course. Another thing to notice is that this is not about hygienic macros. This is the full-blown, full-power, blow-the-moon-away version of macros.

MacRuby - Direct defmacro in Ruby
The first approach rests on modifying the language itself. You can add a defmacro keyword which takes a name and a code block to execute. Each time the compiler/interpreter finds a macro definition, it remembers the name. When that name is found later in the code, each such place is marked. Then, before execution begins, every marked call to the macro is replaced by the output of calling the macro with the subnodes at that place as arguments. An example of a simple macro:
 defmacro log logger, level, *messages
   return :nop unless $DEBUG
   return :call, logger, level, *messages
 end

 log @l, :debug, "value is: #{very_expensive_operation()}"
What's interesting in this case is that the messages will not be evaluated if the $DEBUG flag is not set. This is because the value returned from the macro will be spliced into the AST only if that flag is set. Otherwise a no-op will be inserted instead. Obviously, for this kind of code to work, the interpreter would need to change substantially. There is also a big problem with it, since it's very hard to fit this model into the object-oriented system of Ruby. As I think about it now, it seems macros would be the only non-OOP feature in Ruby, if added in this way. Another big problem with this model is that it is really not that intuitive what the resulting code from the macro will be. As soon as something more advanced needs to be returned, it will be very hard to keep it straight in your head. One solution to this would be to do it the standard CL way. First write out by hand the code the macro should produce for several different cases. Then run that code through a parser to get the corresponding AST. Finally, generalize those ASTs into the macro. This process would be helped by tools, of course.
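
To make the expansion step concrete, here is a minimal sketch in plain Ruby. It assumes the AST is represented as nested arrays, and everything in it (the MiniMacro module, its debug switch standing in for $DEBUG, the node tags) is hypothetical and only for illustration. A real implementation would live inside the interpreter.

```ruby
# Hypothetical mini expander; none of these names exist in Ruby itself.
# The "AST" is just nested arrays such as [:call, receiver, method, *args].
module MiniMacro
  @macros = {}
  @debug = false # stand-in for the $DEBUG flag used in the post

  class << self
    attr_accessor :debug

    # Remember a macro: a name plus a block that receives the unevaluated subnodes.
    def defmacro(name, &body)
      @macros[name] = body
    end

    # Replace every macro call in the tree with the macro's output.
    # The output is expanded again, so macros may produce other macro calls.
    def expand(node)
      return node unless node.is_a?(Array)

      head, *args = node
      macro = @macros[head]
      if macro
        expand(macro.call(*args))
      else
        node.map { |child| expand(child) }
      end
    end
  end
end

MiniMacro.defmacro(:log) do |logger, level, *messages|
  MiniMacro.debug ? [:call, logger, level, *messages] : [:nop]
end

tree = [:log, [:ivar, :@l], [:lit, :debug], [:str, "value is: ..."]]
p MiniMacro.expand(tree) # prints [:nop] because debug is off
```

The message subtree is never evaluated here; it is only moved around as data, which is the property the log example relies on.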

Back-and-Lisp-Ruby - Write macros in Lisp, translate Ruby back and forth

Another way to achieve this power in Ruby would be to separate the macro language from the main language. In effect, the macros would be a classic pre-processor. To offer the same power level as Lisp and others, the best way would be to write the macros themselves in a Lisp dialect, then transform Ruby in a well-defined way to Lisp and back again. (See the next section for more about this idea.) In this situation the same macro as before could look like this:
 (defmacro log (logger level &rest messages)
   (if $DEBUG
       `(,level ,logger ,@messages)))
The main difference in this code is that the macro and the output from the macro are both Lisp. We have gotten rid of the ugly :call and :nop return values, and to me this seems quite readable. Of course, I'm not sure everyone else feels the same way. And we still have the same problem with object orientation. It's missing.

RoCL - Ruby over Common Lisp
The final idea is to build a Ruby runtime within Common Lisp and transform Ruby into Common Lisp before running it. The macros could be written either as Ruby code or as Lisp code. Everything will be transformed into the equivalent code in Lisp, maybe using CLOS as the object system, or building something based on Ruby's. Of course, the semantics of many things would change, and many libraries would need to be rewritten. But in the end, there would be incredible power available. Especially if we can make it go both ways, so that Common Lisp can use Ruby libraries.

An example transformation could look like this. From this Ruby:
 class String
   def revert(a, *args)
     if block_given?
       yield a
     else
       args + [a]
     end
   end
 end

 "abc".revert "one" do |x|
   puts x
 end
This is nonsense code, if you hadn't noticed. =) To this Lisp:
 (with-class "String" nil
   (def revert (a block &rest args)
     (if block
         (apply block a)
         (+ args [a]))))
 (revert "abc" "one" #'(lambda (x)
                         (puts self x)))
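
As a rough illustration of the "emit Lisp" half of such a transformation, the sketch below prints a nested-array tree as s-expression text. The to_sexp helper and the tree layout are assumptions made up for this example; it does not parse Ruby and it ignores details such as the #' function quote.

```ruby
# Hypothetical printer: renders a nested-array tree as s-expression text.
def to_sexp(node)
  case node
  when Array  then "(" + node.map { |child| to_sexp(child) }.join(" ") + ")"
  when String then node.inspect # keep string literals quoted
  else node.to_s                # symbols and numbers print bare
  end
end

# Hand-built tree for the call at the end of the example above.
call = [:revert, "abc", "one", [:lambda, [:x], [:puts, :self, :x]]]
puts to_sexp(call) # prints (revert "abc" "one" (lambda (x) (puts self x)))
```

The hard part, of course, is producing that tree from real Ruby source and preserving Ruby's semantics on the Lisp side.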

It is very hard to actually retrofit macros into Ruby after the fact. I'm still not sure it can be done while keeping enough of Ruby's semantics to make it meaningful. It seems that we need a new language. But if I had to choose among these approaches, the RoCL one seems the most interesting and also the most fun to implement. If I had a motto, it would have to be something along the lines of "best of all worlds". I want the best from Ruby, Java, Lisp, Erlang and everything I can find.

17 comments:

Tomasz Wegrzanowski said...

There is already a perfect way of using macros with Ruby - RLisp.

Take a look at HTTP server example, where macros are used with Webrick.

John said...

Another way would be to create a language that is semantically Ruby but is expressed in sexprs that can be read by a lisp reader.

Antonio Leitao has done this sort of thing for Java, with Linj ("Linj Is Not Java"). Linj has lisp syntax, but it is actually Java + macros. It compiles to syntactically and idiomatically correct Java code. There is also a project to make it possible to translate from Java back to Linj (and also from Linj to Common Lisp and vice versa). Along the same lines, there is a sexpr'd version of javascript called Parenscript.

What all of this boils down to is that virtually any language (at least, if it has garbage-collection) can be given a sexpr'd syntax without altering its semantics. A given semantics can have an algol-like version (call it a "smeagol") and a sexpr'd version (call it a "gollum"). Creating bidirectional translations between the smeagol and the gollum of a dynamically typed language should be easier to do than was the case for statically typed Java/Linj.

Consider, too, that translation between gollums of different languages is relatively easy, especially if their type systems are similar. That being so: if you had smeagol-gollum pairs for two similar languages (Ruby & Smalltalk, say), you might even be able (with much work) to chain the translations. Roughly, something along the lines of: Ruby <-> Linrb <-> Linst <-> Smalltalk.

This last possibility is rather farfetched. However, the idea of creating a sexpr'd version of Ruby complete with Lisp reader and syntactical macros is not so farfetched.

Linj is not open source, but there is a manual and tutorial on the Linj download page; the source for parenscript is readily available.

Laurence Tratt said...

I had a paper published last year which shows how to integrate a modern macro system into a dynamically typed language - in a sense it takes Template Haskell style macros into a Python-like language. It offers a fourth possibility for your list :)

Ola Bini said...

Tomasz, yes, I do remember RLisp, but that's not entirely what I'm looking for. The big problem is that there is no way to change the Ruby syntax with it, which was the main point of my post. All three alternatives have ways to make it possible to write a log macro that doesn't evaluate its parameters unless the debug flag is set.

Further, RLisp is nice, but very confusing for an old lisper. For example, the way RLisp uses let is just wrong. Using defsyntax for unhygienic macros is also a mistake in my book.

Ola Bini said...

John, the "Linj"-approach is actually more or less the second approach from my post; translating Ruby to a well-defined Lisp-syntax and then back again.

Ola Bini said...


A very interesting approach, but correct me if I'm wrong here; I used to think template-based macro systems were limited in power compared to "dangerous" CL-style macros?

Laurence Tratt said...

> correct me if I'm wrong here; I
> used to think template-based macro
> systems were limited in power
> compared to "dangerous" CL-style
> macros?

There's no difference in power; you can do everything in a Converge-style approach that you can do in CL-style. There are ways of building completely arbitrary abstract syntax trees if you need to. The property that I think a Converge-style approach has, is that simple things are relatively simple and safe; but there are plenty of ways to blow your foot off, if that's your thing.

Ola Bini said...

Laurence: Yeah, I like that feature in a Macro system. I don't feel like a man if I can't blow myself to pieces if I want to. =)

Josh said...

You can do open-ended macros by giving the interpreter access to the compiler input stream. There's a fringe language called Trans (www.transmuter.org) that does that.

Chris said...

Hey, what are macros anyway?

Johannes Rudolph said...

> Chris said...
> Hey, what are macros anyway?

I think that really is the question. What are macros in Ruby and why would you need them?

Macros in lisp are lisp functions which transform an S-expression (I'll talk about them later on) into another S-exp (which is often a valid lisp form that a lisp interpreter can understand and evaluate). The transformation can be done before runtime.

Why is that possible? It's because of the language the Lisp-language is written in (its 'meta-language'). And that is S-expressions. S-expressions are only literals and lists of literals or lists. This language is so basic, you could express everything in it. The possibility to write macros stems from the fact that lisp programmers are used to express and understand programs written in such an abstract way.
(Perhaps you could put it the other way round: 'A lisp programmer is someone who is used to transform written language into a program by structuring it by putting some brackets around the right regions' ;))

The point is: S-exp is abstract. There is no lisp in s-exp yet. But: Every data used in a lisp program is given in s-exps as well.

Let's look at Ruby. Ruby's meta-language consists of about four basic elements.
* class/module-definitions
* function definition
* all sorts of expressions
* exception handling etc
(* blocks)
Consider you would see only those five points. Guess which language we are talking about... Right. Ruby.
So Ruby's meta-language is not abstract. Not at all. It is the language to write ruby-programs. But: A ruby program is not necessarily the best tool to write valid ruby code (on the AST level).

I think, you can try as hard as you can, but you will never come close to the macro possibilities in lisp because in no other language the data has the same format as the language itself.

What is the purpose of ruby macros?
1.) generate code
2.) Give ruby syntax (eg expressed in AST) a different meaning in a specific context, ie transform a ruby block into another.

The first point is easy: use eval. There are two problems. First, eval with a string parameter is not syntax checked until it runs, but you can avoid this by not using string eval at all and using module_eval with blocks and procs instead. Second, there is no lispy backquote/comma syntax for inserting special pre-evaluated code. We should think about something like that. (And hack it into jruby perhaps ;))
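
A small sketch of that difference, using only standard Ruby (the Greeter class and its method names are made up for the example). String eval defers parsing to run time, the block form is parsed with the rest of the file, and define_method with a closure is about the closest plain Ruby gets to "unquoting" a pre-computed value into generated code.

```ruby
class Greeter; end

# String eval: the code is only parsed when this line runs.
Greeter.module_eval("def hello_from_string; 'hi from a string'; end")

# A typo inside the string therefore shows up at run time, not at load time.
begin
  Greeter.module_eval("def broken(")
rescue SyntaxError => e
  puts "caught at run time: #{e.class}"
end

# Block form: the body is ordinary Ruby, parsed together with this file.
Greeter.module_eval do
  def hello_from_block
    "hi from a block"
  end

  # define_method captures `level` by closure, a rough stand-in for
  # splicing a pre-evaluated value into the generated method.
  %w[debug info].each do |level|
    define_method("log_#{level}") { |msg| "[#{level}] #{msg}" }
  end
end

greeter = Greeter.new
puts greeter.hello_from_string
puts greeter.hello_from_block
puts greeter.log_debug("value is 42")
```

None of this gives access to the code as data, though, which is what the second point is about.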

Point two is easy as well: We need programmatic access to the code (AST) of a block or code literals. That should be possible as well. See LINQ for how Microsoft is planning to support that in C#. (http://msdn.microsoft.com/data/ref/linq/)
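
For a feel of what such access could look like, here is a sketch using Ripper, a parser library bundled with later versions of Ruby (it is not something this discussion had available, and it works on source text rather than on a live block). The node_types helper is hypothetical and only collects the node tags it finds.

```ruby
require "ripper"

# Parse a source string into nested arrays; the result is plain Ruby data.
tree = Ripper.sexp("1 + 2")

# Hypothetical helper: collect the symbol that tags each node in the tree.
def node_types(node, found = [])
  return found unless node.is_a?(Array)

  found << node.first if node.first.is_a?(Symbol)
  node.each { |child| node_types(child, found) }
  found
end

p tree.first       # the root node tag
p node_types(tree) # every tagged node, outermost first
```

Because the tree is just arrays, symbols and strings, the usual Array methods can inspect or rebuild it. Turning a modified tree back into running code is the part that remains unsolved here.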

Laurence Tratt said...

> The possibility to write macros stems from
> the fact that lisp programmers are used to
> express and understand programs written in
> such an abstract way.

This is an extremely common, but totally incorrect assumption. Macros are easy in LISP because of its minimal syntax. That doesn't mean that macros in other languages are impossible. This assumption has crippled thinking for several decades now, and has scuppered many past attempts to add macros to modern languages.

> I think, you can try as hard as you can,
> but you will never come close to the macro
> possibilities in lisp because in no other
> language the data has the same format as
> the language itself.

I would strongly suggest you look at Template Haskell and perhaps my Converge language (which takes many of its macro related ideas from TH). These show that it is indeed possible to have a full, powerful, practical macro system in a modern programming language with a rich syntax.

Johannes Rudolph said...

> This is an extremely common, but
> totally incorrect assumption.
> Macros are easy in LISP because of
> its minimal syntax. That doesn't
> mean that macros in other languages
> are impossible. This assumption has
> crippled thinking for several
> decades now, and has scuppered many
> past attempts to add macros to
> modern languages.
You are right. My remark was presumptuous, and it probably did not say what I wanted to express. Perhaps you could put it this way: for macro writing you need to operate on abstract syntax. If your code has the same form as normal data, that's the best case. You can use the same functions to operate on data and on code. That's lisp. (And nothing new.)
In every other language you need another form of "code literals".

Perhaps we can make jruby do what we want without hacking the java code base... Hm...

require 'java'

class Object
def jfield(field)
jobj=Java::ruby_to_java self

def lit(&block)


lit {
class Hello
def world

you get the syntax tree for the block. Wow, that's addicting...

taw said...

Ola Bini: You have a point about RLisp using defsyntax for CL-like macros. There was absolutely no reason for it, I was just reading R5RS while coding RLisp and it somehow affected my brainwaves. It was silly, I'll fix it.

But I disagree about let (it really surprises me that this is the most common criticism of RLisp, and nobody minds lists-as-arrays, which are far more radical). RLisp let was an experiment. I haven't written any big Lisp programs but I wrote a lot of OCaml code (and quite a bit of SML and Haskell), and they all have Scheme-like let, and it often sucks. Paul Graham doesn't like let either.

So I replaced it by something Ruby/Python-like.

I think the experiment was successful. The code looks a bit nicer, and RLisp let can easily be used to implement Scheme-like let (the local-let macro in RLisp does exactly that). On the other hand (and that's the strongest reason I have for keeping RLisp let), it is pretty much impossible to implement RLisp-style let using Scheme-style let. I asked people to come up with macros to do that, and the best they were able to come up with were some ugly and fragile code-walkers.

It's of course possible that better semantics of let do exist, or that it should be called something else (you can probably rename it with some macros), but I'm strongly convinced that it's better than just using Scheme-like let.

Stephen Viles said...

From Paul Graham's Arc Lessons page (link in taw's comment above):

"Macros and implicit local variables just don't seem to work well together. Meaning that any language that already has implicit local variables will run into trouble if they try to add macros."

taw said...

Stephen Viles: The design of implicit local variables that Paul Graham used in Arc was significantly different from the one in RLisp. Scope in Arc was introduced implicitly by (do ...) and could be suspended explicitly by (justdo ...). In RLisp you need an explicit (local ...) for that; an implicit (do ...) doesn't create a new scope.

According to Paul Graham the main problem was accidentally introducing scope in macros, at least this doesn't happen in RLisp. I don't know if it's enough to avoid the problem, but it's worth a try.

JeanHuguesRobert said...

As a side note, the DEBUG example is interesting in itself.

In C it is very easy to define a macro that expands to nothing in non-debug mode.

To achieve a similar effect in Ruby I explored 3 solutions.

I use tracing to debug.
debug "debugging info"

Q) How can I remove the overhead while in non debug mode?

A1) Use blocks to pass parameter.
debug { "debugging info" }
Cost: a method call with a static argument.

A2) Rewrite source code at load time. In Ruby one can intercept/redefine the methods like "require()" that load the source code. If debug statements are one-liners, a regexp replace is all it takes to remove them.
Cost: Increased load time.

A3) de&&bug( "debugging info") trick.
Cost: boolean conditional test.
See http://virteal.com/DebugDarling

I tend to use the latter because it works in many languages.
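
A minimal sketch of A1 and A3 side by side (A2 is left out). The $debug_on switch, the call counter and the method bodies are assumptions for illustration only; the DebugDarling page linked above may define `de` and `bug` differently.

```ruby
$debug_on = false # hypothetical switch, standing in for a real debug flag
$expensive_calls = 0

# Counts how often the "expensive" message is actually built.
def expensive_message
  $expensive_calls += 1
  "debugging info"
end

# A1: pass the message as a block; it is only evaluated when debugging is on.
def debug
  puts yield if $debug_on
end

# A3: `de` returns the flag and `bug` does the tracing, so `de && bug(...)`
# short-circuits before the argument is evaluated when debugging is off.
def de
  $debug_on
end

def bug(message)
  puts message
  true
end

debug { expensive_message }
de && bug(expensive_message)
puts $expensive_calls # prints 0: neither form built the message
```

Both forms still pay a small run-time cost on every call (a method call for A1, a boolean test for A3), which is exactly what a real macro would remove.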