Monday, September 29, 2008

Towelie Has Some Weird Code

I've written some weird code in Towelie. I'm not sure if it's good or not.

Towelie uses ParseTree to parse Ruby code, then analyzes the parse tree for repetition.

Say you've got some simple Ruby code.



ParseTree turns that into an array of arrays of nodes.
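To make that concrete, here is a hand-written approximation of the kind of structure involved. The Greeter class is a made-up example, and the node shapes are assumed from memory of ParseTree's output; the exact nesting varies by ParseTree version, so treat this as illustrative rather than as real output.

```ruby
# Hypothetical source being parsed:
#
#   class Greeter
#     def hello
#       puts "hi"
#     end
#   end
#
# Hand-written approximation of the nested array for that class.
# Each inner array starts with a symbol naming the node type.
parse_tree = [
  [:class, :Greeter, [:const, :Object],
    [:defn, :hello,
      [:scope,
        [:block,
          [:args],
          [:fcall, :puts, [:array, [:str, "hi"]]]]]]]
]

# The method definition is the fourth element of the :class node.
parse_tree[0][3].first # => :defn
```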



The following code from Towelie recursively searches such an array for arrays which start with the node :defn. Arrays which begin with that symbol represent method definitions.
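As a rough sketch of a search like this (a reconstruction for illustration, not Towelie's actual source), something along these lines would do it. The accessor names name and body, the accumulator default, and the sample tree are all assumptions.

```ruby
# Illustrative reconstruction only; not the code from Towelie.
def def_nodes(nodes, accumulator = [])
  case nodes
  when Array
    if nodes.first == :defn
      # Assumed accessors, added to this one array object only.
      def nodes.name; self[1]; end
      def nodes.body; self[2]; end
      accumulator << nodes
    end
    # Recurse into every child; non-Array children fall through the case.
    nodes.each { |node| def_nodes(node, accumulator) }
  end
  accumulator
end

# Hand-written sample tree (node shapes are approximate).
tree = [
  [:class, :Greeter, [:const, :Object],
    [:defn, :hello,   [:scope, [:block, [:args], [:fcall, :puts, [:array, [:str, "hi"]]]]]],
    [:defn, :goodbye, [:scope, [:block, [:args], [:fcall, :puts, [:array, [:str, "bye"]]]]]]]
]

def_nodes(tree).map { |node| node.name } # => [:hello, :goodbye]
```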



The code does a few unusual things.

First, it uses a case with only one when. At some point, the recursive calls to def_nodes will have it looking at a symbol, string, or numeric argument when what it needs is an array. Since the when Array fails to match, and there is no else, the case does nothing and the method simply returns accumulator. When you're writing a recursive method, of course, terminating that method means returning up the chain of recursion from which you began.
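The fall-through behavior is plain Ruby: a case with no matching when and no else evaluates to nil and runs nothing. A tiny standalone demonstration, using a hypothetical method named describe:

```ruby
# Hypothetical example method; only Arrays match the single when.
def describe(node)
  case node
  when Array
    "array of #{node.size}"
  end
end

describe([1, 2]) # => "array of 2"
describe(:defn)  # => nil, because nothing matched and there is no else
```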

I think the case is a nice way to document that the method only parses Arrays, but I'm not sure. It's a legacy from a different way I tried to solve the problem before I found this solution, but it reads well to my eyes. I've been writing a lot of recursive, Haskell-y stuff recently, and functional programming seems to gravitate to these pattern-matching case/when things.

Second, and probably most glaringly, it adds methods to objects instead of making those objects part of a class or using something like DefNode.new(). The reason: fewer words. I wanted to refactor the code with the least possible effort.

I got the idea from Dave Thomas' metaprogramming screencasts. The methods I add are simple accessors. They give me the ability to do this later on:



I've actually simplified the code here very slightly for readability. This method is comparing the bodies of two def nodes, which is to say, two method definitions, to see by how many elements they differ. The win here is that the accessors make it possible to document what information I'm getting from the method definitions. The original stuff was like

def_node_1[1] if def_node_1[2] == def_node_2[2]

which was just hideous.
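For contrast, here is a minimal sketch of how a comparison can read once accessors exist. It is not Towelie's actual code: the helper add_accessors, the accessor names name and body, and the way the difference is counted (elements found in only one of the two flattened bodies) are all assumptions made for illustration.

```ruby
# Hypothetical helper: attach singleton accessors to one def node array.
def add_accessors(def_node)
  def def_node.name; self[1]; end
  def def_node.body; self[2]; end
  def_node
end

# Assumed measure: count elements that appear in only one flattened body.
def difference_count(def_node_1, def_node_2)
  body_1 = def_node_1.body.flatten
  body_2 = def_node_2.body.flatten
  ((body_1 | body_2) - (body_1 & body_2)).size
end

hello = add_accessors(
  [:defn, :hello, [:scope, [:block, [:args], [:fcall, :puts, [:array, [:str, "hi"]]]]]]
)
goodbye = add_accessors(
  [:defn, :goodbye, [:scope, [:block, [:args], [:fcall, :puts, [:array, [:str, "bye"]]]]]]
)

difference_count(hello, goodbye) # => 2 ("hi" and "bye")
```

Compared with indexing by [1] and [2], the accessor calls say which part of the method definition is being read.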

Tangent: the diff code uses a method called stepwise. That's just a nested each.
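A minimal sketch of a helper like that, assuming it yields every ordered pair of elements from one collection (the parameter and block argument names are guesses):

```ruby
# Illustrative sketch: a nested each that yields each pair of elements.
def stepwise(collection)
  collection.each do |element_1|
    collection.each do |element_2|
      yield element_1, element_2
    end
  end
end

pairs = []
stepwise([1, 2]) { |a, b| pairs << [a, b] }
pairs # => [[1, 1], [1, 2], [2, 1], [2, 2]]
```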



It's probably very similar to Rails' in_groups_of, but I'm not using ActiveSupport for this at the moment. In fact I didn't even think of that when I was writing it. I got the idea from some Flash code I stole from bit101.

Anyway, the last weird thing about def_nodes is that I could probably make it better with inject. However, I tried in the simplest way and everything went kaboom.
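For reference, one way an inject-based version could look is sketched below. This is an assumption about what "better with inject" might mean, not a fix taken from Towelie, and the name def_nodes_with_inject is made up. One well-known way inject goes wrong is a block whose last expression is not the accumulator (for example, nil from a case that matched nothing), since whatever the block returns becomes the accumulator for the next element. This sketch sidesteps that by always returning an array.

```ruby
# Hypothetical inject-based variant; returns an array of :defn nodes.
def def_nodes_with_inject(nodes)
  # Non-arrays contribute nothing, so recursion bottoms out here.
  return [] unless nodes.is_a?(Array)

  found = nodes.first == :defn ? [nodes] : []
  # The block always returns an array, so the accumulator never becomes nil.
  nodes.inject(found) { |acc, node| acc + def_nodes_with_inject(node) }
end

# Hand-written sample tree (node shapes are approximate).
sample = [
  [:class, :Greeter, [:const, :Object],
    [:defn, :hello,   [:scope, [:block, [:args]]]],
    [:defn, :goodbye, [:scope, [:block, [:args]]]]]
]

def_nodes_with_inject(sample).map { |node| node[1] } # => [:hello, :goodbye]
```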