metautonomo.us

Confounding URL typists since 2007.

The MetaWhere 2.0 Rewrite

Posted by Ernie on February 18, 2011 at 7:02 am

If you have been following the Rails core mailing list lately, you might have seen mention of the MetaWhere rewrite I’ve been working on. If you haven’t, let me start by linking you to the rewrite branch on GitHub. This is just a quick post explaining why I’m rewriting it, and why you should care.

The Rationale

When I was first writing MetaWhere, I experimented with a lot of interesting features and design choices. Some of them (autojoin) were less reasonable and useful than others, so before we ever hit 1.0, I removed them. However, some of the design decisions that were made to support these features cascaded throughout the library. In the process of trying to be responsive to support tickets, I often found myself applying patches that fixed the issue in the most expedient, rather than elegant, manner. All of this is to say that while MetaWhere in its current state works great for many, and I use it on every project I work on, its source had accumulated significant technical debt, and it’s time to pay it off.

What do I get out of this?

You get a MetaWhere that is:

  • Modern
  • Well-tested
  • Modular
  • Powerful

Modern

MetaWhere 2.0 is being targeted from the start at ActiveRecord 3.1. Assuming that there isn’t too much hackery involved in doing so, it will be backported to work with ActiveRecord 3.0.x, as far back as their joint ARel dependencies will allow. Its gemspec is no longer full of Jeweler crud — it’s a lean and mean bundler gem gemspec. It doesn’t depend on Git submodules for vendored Rails/ARel during testing, instead using the Gemfile with Bundler to load up required dependencies before running RSpec. Speaking of which, MetaWhere 2.0 is being tested with RSpec and Machinist instead of Test::Unit/Shoulda and fixtures. Dropping the fixtures makes tests less brittle, and switching from Test::Unit to RSpec brings me in line with pretty much everyone else these days, making it easier for others to contribute tests. All of this will help it to be…

Well-tested

The existing MetaWhere test suite is… shall we say… suboptimal. 66 tests, 134 assertions, almost all of which reside in one monster file. Separate components, where they existed, were not even unit tested on their own. The whole thing amounted to a giant integration test with ActiveRecord. For the most part, it worked, but there were some glaring holes in test coverage. Contrast that with specs for the rewrite, currently consisting of 484 examples spread over 16 files. There’s some artificial inflation here, because I’m separately testing every single predicate method on every node type which accepts them, but there’s simply more testing going on, period. Certain features in the old MetaWhere were in fact duplicated using BDD, by adding a spec describing the functionality from the old Shoulda tests. Writing tests for MetaWhere 2.0 has been both enjoyable and easy, because MetaWhere 2.0 is…

Modular

This is one of those situations where I think a simple comparison of the source tree will tell a pretty big story. Note that I’m not implying here that breaking the code into multiple files and directories means instant modularity, but I want to show you the source tree before delving into what each part of MetaWhere 2.0 does. Look at the source tree image that accompanies this post, take in a brief overview of the structural differences in the code (hint: there was pretty much no structure in MetaWhere 1.0), and let’s go through the architecture of the new MetaWhere.

Core Extensions (core_ext)

These are mostly core extensions you know and love (or hate) from the old MetaWhere. Of note, in MetaWhere 2.0, core extensions aren’t loaded by default. This means no methods defined on Symbol anymore, unless you explicitly ask for them. Operator overloads on Symbols are gone entirely. “How will I create all my awesome queries?” you ask? I’m glad you asked. We’ll get to that in a moment.

Nodes (meta_where/nodes)

These are the building blocks of your query. For the most part, they serve as a way to store all of the bits of information needed to construct an eventual ARel node, without actually constructing it yet. This means that we can wait until the last possible moment (in ActiveRecord terms, this is in the Relation#build_arel method) to firm them up against a particular ARel relation (table). Lazily creating these ARel nodes is what allows MetaWhere to intelligently map your conditions to the proper joined association. Of particular interest here are the Stub and KeyPath nodes, which we’ll talk about shortly.
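To make the "store now, firm up later" idea concrete, here is a minimal sketch in plain Ruby. The NodeSketch module, its Predicate struct, and the resolve method are hypothetical names made up for illustration; they are not MetaWhere's node classes, and the sketch emits a SQL string where the real library would build an ARel node.

```ruby
# Hypothetical sketch: none of these names come from MetaWhere itself.
module NodeSketch
  # Holds the pieces of a condition without committing to a table.
  Predicate = Struct.new(:column, :operator, :value) do
    # "Firm up" the stored pieces against a concrete table alias, which
    # is only known once the joins have been worked out.
    # Values are quoted naively; this is an illustration, not safe SQL.
    def resolve(table_alias)
      %("#{table_alias}"."#{column}" #{operator} '#{value}')
    end
  end
end

node = NodeSketch::Predicate.new(:name, '=', 'Jacob')

# The same node can be resolved against whichever alias the join uses.
puts node.resolve('people')
puts node.resolve('children_people_2')
```

Because the node carries no table information of its own, the decision about which alias it belongs to can be deferred until every join is known.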

Visitors (meta_where/visitors)

Visitors are responsible for turning MetaWhere nodes into their corresponding ARel attributes or nodes. The PredicateVisitor handles your where and having values. The OrderVisitor and SelectVisitor should hopefully be self-explanatory. In ActiveRecord, these get called during Relation#build_arel to “compile” your MetaWhere nodes into their proper ARel equivalents, taking your joins into consideration. Visitors are present in MetaWhere 1.x, but their design is much better in 2.0, thanks to…
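The visitor idea can be sketched in a few lines of plain Ruby. Everything here is an assumption made for illustration: the VisitorSketch module, its Predicate and Or structs, and the accept method are invented names, and the visitor returns SQL strings rather than the ARel nodes a real visitor would produce.

```ruby
# Hypothetical sketch of a visitor; not MetaWhere's PredicateVisitor.
module VisitorSketch
  Predicate = Struct.new(:column, :operator, :value)
  Or        = Struct.new(:left, :right)

  class PredicateVisitor
    def initialize(table)
      @table = table
    end

    # Dispatch on the node's class name:
    # Predicate -> visit_predicate, Or -> visit_or.
    def accept(node)
      send("visit_#{node.class.name.split('::').last.downcase}", node)
    end

    private

    # Values are quoted naively; illustration only.
    def visit_predicate(node)
      %("#{@table}"."#{node.column}" #{node.operator} '#{node.value}')
    end

    # Compound nodes recurse into their children.
    def visit_or(node)
      "(#{accept(node.left)} OR #{accept(node.right)})"
    end
  end
end

tree = VisitorSketch::Or.new(
  VisitorSketch::Predicate.new(:name, 'LIKE', 'Joe%'),
  VisitorSketch::Predicate.new(:title, 'LIKE', 'Hello%')
)
puts VisitorSketch::PredicateVisitor.new('people').accept(tree)
```

Keeping the "how to compile this node" logic in the visitor, rather than in the nodes, is what lets the same node tree be compiled differently for where, order, and select values.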

Contexts (meta_where/contexts)

Well, technically, context, at the moment. MetaWhere 1.x tightly coupled its visitors to the ActiveRecord JoinDependency functionality. This worked, but it complicated things, and obviously increased its dependency on ActiveRecord. What was really needed was a way to keep track of whatever context the visitors were using to generate ARel attributes and nodes, and that doesn’t technically have to be a JoinDependency at all. In theory, pulling the context object out into its own object should allow the visitors to work with other ORMs that ARel can generate queries against, with an obvious (and hopefully understandable) bias toward relational databases. In practice, it provides for much cleaner separation of concerns, at the minimum.
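As a rough illustration of that separation, the sketch below gives a visitor a context object to ask about table aliases, so the visitor never touches the ORM itself. HashContext, table_for, and the Visitor class are hypothetical names; a real context would consult the ORM's join information rather than a hard-coded Hash.

```ruby
# Hypothetical sketch; these are not MetaWhere's context classes.
module ContextSketch
  # Knows how a path of association names maps to a table alias.
  class HashContext
    def initialize(aliases)
      @aliases = aliases
    end

    # Raises KeyError for a path that was never joined.
    def table_for(path)
      @aliases.fetch(path)
    end
  end

  # The visitor only talks to the context, never to the ORM directly.
  class Visitor
    def initialize(context)
      @context = context
    end

    def attribute(path, column)
      %("#{@context.table_for(path)}"."#{column}")
    end
  end
end

context = ContextSketch::HashContext.new({
  []                     => 'people',
  [:children]            => 'children_people',
  [:children, :children] => 'children_people_2'
})
visitor = ContextSketch::Visitor.new(context)

puts visitor.attribute([], :name)
puts visitor.attribute([:children, :children], :name)
```

Swapping in a different context object would, in principle, be all a visitor needs in order to work against another source of join information.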

Adapters (meta_where/adapters)

Adapters hook the creation and visitation of MetaWhere nodes into an ORM via extension of its existing query methods. In ActiveRecord’s case, this means we override methods like Relation#where to handle the MetaWhere DSL, which brings us to…
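One way to picture "extending an existing query method" is the sketch below, which uses Module#prepend on a stand-in class. This is an assumption for illustration only: the Relation class here is a toy, not ActiveRecord::Relation, the BlockWhere module is an invented name, and the post does not say which extension mechanism MetaWhere 2.0 actually uses.

```ruby
# Hypothetical sketch; Relation is a stand-in, not ActiveRecord::Relation.
module AdapterSketch
  class Relation
    attr_reader :conditions

    def initialize(conditions = [])
      @conditions = conditions
    end

    # The "original" query method: accepts a ready-made condition.
    def where(condition)
      Relation.new(conditions + [condition])
    end
  end

  # Extends #where so it also accepts a block, then hands the result
  # to the original method via super.
  module BlockWhere
    def where(condition = nil, &block)
      condition = block.call if block
      super(condition)
    end
  end

  Relation.prepend(BlockWhere)
end

relation = AdapterSketch::Relation.new
                                  .where('age > 21')
                                  .where { "name = 'Joe'" }
p relation.conditions
```

The original behaviour stays available, so callers that pass plain conditions are unaffected by the extension.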

DSL (meta_where/dsl.rb)

The preferred way of generating queries with MetaWhere 2.0 will be via passing MetaWhere DSL blocks to query methods. In the case of ActiveRecord, this is seamlessly handled by extending the default where, having, order, select, and joins methods of ActiveRecord::Relation. This is what enables us to turn core extensions off by default, and create Stubs and KeyPaths, which become the building blocks of our queries. They’re what make MetaWhere 2.0 so…
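A block-based DSL of this kind is commonly built on instance_eval plus method_missing, and the sketch below shows that general technique. It is an assumption about the approach, not MetaWhere's source: the DslSketch module, Builder, and this simplified KeyPath are invented for illustration, and the sketch skips the separate Stub node entirely.

```ruby
# Hypothetical sketch of a block DSL; not MetaWhere's implementation.
module DslSketch
  # An ordered chain of names, such as children.children.name.
  class KeyPath
    attr_reader :path

    def initialize(path)
      @path = path
    end

    # Any unknown, argument-free method call extends the chain.
    def method_missing(name, *args)
      return super unless args.empty?

      KeyPath.new(path + [name])
    end

    def respond_to_missing?(_name, _include_private = false)
      true
    end
  end

  # The object the block is evaluated against: every bare word inside
  # the block becomes the first segment of a KeyPath.
  class Builder
    def self.evaluate(&block)
      new.instance_eval(&block)
    end

    def method_missing(name, *args)
      return super unless args.empty?

      KeyPath.new([name])
    end

    def respond_to_missing?(_name, _include_private = false)
      true
    end
  end
end

key_path = DslSketch::Builder.evaluate { children.children.name }
p key_path.path
```

Because the bare words are resolved against the builder object rather than against Symbol or Object, nothing outside the block needs to be monkey-patched.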

Powerful

Check out these queries, straight from the test console (rake console in the gem’s source directory):

# Simple KeyPath usage (and an == operator for equality testing):
Person.joins(:children => :children).where{children.children.name == 'Jacob'}.to_sql
=> SELECT "people".* FROM "people"
   INNER JOIN "people" "children_people"
     ON "children_people"."parent_id" = "people"."id"
   INNER JOIN "people" "children_people_2"
     ON "children_people_2"."parent_id" = "children_people"."id"
   WHERE "children_people_2"."name" = 'Jacob'
 
# Polymorphic belongs_to associations with a keypath for drilldown
# and the specific ARel method for matching:
Note.joins{{notable(Article) => person}}.
     where{{notable(Article) => person.name.matches('Joe%')}}.to_sql
=> SELECT "notes".* FROM "notes"
   INNER JOIN "articles"
     ON "articles"."id" = "notes"."notable_id"
       AND "notes"."notable_type" = 'Article'
   INNER JOIN "people"
     ON "people"."id" = "articles"."person_id"
   WHERE "people"."name" LIKE 'Joe%'
 
# Combining conditions with OR
Person.joins{articles}.where{(name =~ 'Joe%') | (articles.title =~ 'Hello%')}.to_sql
=> SELECT "people".* FROM "people"
   INNER JOIN "articles"
     ON "articles"."person_id" = "people"."id"
   WHERE (("people"."name" LIKE 'Joe%' OR "articles"."title" LIKE 'Hello%'))

As you might have noticed above, the DSL allows us to use more traditional Ruby operators for conditions:

  • == for equality
  • != for inequality (Ruby 1.9 only; ^ is the fallback for Ruby 1.8)
  • =~ for matches (SQL LIKE)
  • !~ for does not match (SQL NOT LIKE; Ruby 1.9 only, no fallback provided for Ruby 1.8)
  • > for greater than
  • >= for greater than or equal to
  • < for less than
  • <= for less than or equal to
  • >> for inclusion (SQL IN); mnemonic: in value >> [1,2,3], the value is moving into the list
  • << for exclusion (SQL NOT IN); mnemonic: in value << [1,2,3], the value is moving out of the list
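Operators like these are ordinary Ruby methods, so a DSL object can define them to return a description of the comparison instead of performing it. The sketch below shows that technique for a few of the operators; the OperatorSketch module and its Column, Condition, and Or types are hypothetical names, and the operator symbols stored in each Condition (:eq, :matches, and so on) are chosen for illustration.

```ruby
# Hypothetical sketch of operator overloading; not MetaWhere's classes.
module OperatorSketch
  Or = Struct.new(:left, :right)

  Condition = Struct.new(:column, :operator, :value) do
    # Combine two conditions with OR, mirroring the | used in the DSL.
    def |(other)
      Or.new(self, other)
    end
  end

  class Column
    def initialize(name)
      @name = name
    end

    # Each operator returns a Condition describing the comparison
    # rather than evaluating it.
    def ==(other)
      Condition.new(@name, :eq, other)
    end

    def =~(other)
      Condition.new(@name, :matches, other)
    end

    def >>(other)
      Condition.new(@name, :in, other)
    end

    def <<(other)
      Condition.new(@name, :not_in, other)
    end
  end
end

name = OperatorSketch::Column.new(:name)
id   = OperatorSketch::Column.new(:id)

like     = name =~ 'Joe%'
included = id >> [1, 2, 3]
either   = like | included

p like.to_a
p included.to_a
p either.class
```

Note that the parentheses in the post's OR example matter for the same reason they would here: | binds more tightly than =~ and ==, so each side has to be grouped before the two are combined.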

We can also do some fun things with normal method calls, such as notable(Article) in the polymorphic belongs_to example above, or a SQL function call via max(salary).
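Calls with arguments can be told apart from bare names inside method_missing, which is one plausible way a DSL could support something like max(salary). The sketch below assumes that approach; FunctionSketch, Stub, Function, and Builder are hypothetical names, and nothing here is taken from MetaWhere's source.

```ruby
# Hypothetical sketch; not MetaWhere's Stub or function handling.
module FunctionSketch
  Stub     = Struct.new(:name)
  Function = Struct.new(:name, :args)

  class Builder
    def self.evaluate(&block)
      new.instance_eval(&block)
    end

    # No arguments: treat the word as a column or association name.
    # With arguments: treat it as a call, such as a SQL function.
    def method_missing(name, *args)
      args.empty? ? Stub.new(name) : Function.new(name, args)
    end

    def respond_to_missing?(_name, _include_private = false)
      true
    end
  end
end

node = FunctionSketch::Builder.evaluate { max(salary) }
p node.name
p node.args.map(&:name)
```

The inner salary is evaluated first and becomes a Stub, which is then passed as the argument that turns max into a Function node.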

All in all, I’m really excited about how things are shaping up. If you haven’t tried it out yet, please do!

  • git clone -b rewrite git://github.com/ernie/meta_where.git
  • cd meta_where && bundle install
  • rake console

Open up spec/support/schema.rb to take a look at the models you can play with. Play around, and let me know what you think!
