ClojureQL - 1.0.0 now in beta

2010-11-18 01:50:40

After 2.5 years ClojureQL got shredded and rebuilt in less than 3 weeks, today Im releasing 1.0 public beta! In this post I'll show you the basics of how ClojureQL lets you compose your queries using powerful abstractions in a short screencast.

 

Preface

Over 2.5 years ago I set out to develop THE database abstraction library and after 2.5 years I felt painted into a corner, so I deleted the whole thing and re-wrote it in less than 3 weeks (with a little help from my friends).

The ClojureQL that Im presenting today is the public 1.0 beta release and hopefully it will meet all of your production needs. I've prepared this small screencast to get you started and I'll discuss the details below - If the quality of the video is unsatisfactory please download the original file from Vimeo instead of using the Flash player:

 

 

ClojureQL (Beta) from Lau Jensen on Vimeo.

 

ClojureQL 1.0 - Bringing Relational Algebra, back to Relational Databases

ClojureQL 1.0 is has been a long time coming, but I feel confident that the interfaces and primitives of this implementation are whats needed to provide the ultimate SQL integration experience. ClojureQL is similar in intent to Rubys Arel. The current implementation is the refined experience of the previous 2.5 years boiled down into about 700 lines of pure Clojure.

If you want the finer details of why the first version of ClojureQL got scrapped, read this post. As I was designing this re-write I was approached by Justin Balthrop (ninjudd), of Cake fame, who said "What you've got is 90% of the way towards getting true composability, relational algebra (RA) primitives will get you all the way". I read up on RA and looked at some examples and knew that Justin was on to something, RA was definitely the key to making a truly composable library (well, that and the lack of macros), so off we went. The ClojureQL you've just seen demonstrated is totally RA based.

 

Getting started

To get ClojureQL either include these artifacts in your project:

Cake / Lein:

[clojureql "1.0.0-beta1-SNAPSHOT"]

Maven:   

<dependency>
  <groupId>clojureql</groupId>
  <artifactId>clojureql</artifactId>
  <version>1.0.0-beta1-SNAPSHOT</version>
</dependency>

Or simply clone the Github Repo.

git clone git@github.com:LauJensen/clojureql.git

If you run into trouble that you think you yourself might be the cause of, you can get help on #clojureql on irc.freenode.net where you'll find both myself and Justin. If you've uncovered an issue with ClojureQL itself, please report it on Github and include instructions on how to reproduce.

 

Documentation

I'll outline the various functions/interfaces that you need to know below, but generally you should refer to the Github page which will serve as documentation henceforth!

 

Shadows

In a few instances ClojureQL shadows functions from clojure.core and in two of these instances they behave different:

Besides these 2 we shadow take, drop and sort but these all work as you would expect. If you call them on a Table they return a new table, if you call them on a collection they pass your arguments on to the clojure.core version. No mathematical operators or and/or are shadowed. You can make your code clearer by using ClojureQL :as cql.

 

The interfaces

There are only a few functions that you need to get comfortable with in order to start composing powerful queries:

Table

Table lets you construct a reference to a table in a SQL database. As its arguments you can specificy the connection-object (keyword or hashmap, required), the table name (keyword or hash-map if you're aliasing).

Deref

When you dereference a table object (either (deref table) or @table) the query of that table is compiled and executed.

Select / Where

The select statement is not to be confused with SQL SELECT command. Select actually corresponds to WHERE from MySql. There are 2 ways in which you can build a select statement.

1) Using the 'where' macro:

clojureql.core> (where (and (> :id 5) (<= :wage 100)))
"((id > 5) AND (wage <= 100))"

Which accepts column references as keywords, standard mathematical operators and and/or. Or you can:

2) Write out the datastructure which 'where' compiles manually by suffixing every function with a star* (this is the only thing which 'where' does for you). To do that, you must first use/require clojureql.predicates:

clojureql.core> (and* (>* :id 5) (<=* :wage 100))
"((id > 5) AND (wage <= 100))"

Rename

Rename is helpful when you need to alias a column after table construction. Here are 2 variants of renaming, both valid:

clojureql.core> (-> (table nil :users)
                    (project [:id :name])
                    (rename {:id :idx})
                    to-sql)
"SELECT users.id,users.name FROM users AS users(idx,name)"
clojureql.core> (-> (table nil :users)
                    (project [[:id :as :idx] :name])
                    to-sql)
"SELECT users.id AS idx,users.name FROM users"

Note that a nested vector always denotes aliasing as does hashmaps. To rename a table, do the following:

clojureql.core> (-> (table nil {:users :u1})
                    to-sql)
"SELECT * FROM users u1"

Join

Join compiles to a regular join using either USING or ON depending on your predicate. As arguments it takes 2 tables and a predicate. If your predicate is a keyword it will compile to USING(keyword), if its a predicate built with the 'where' macro, it will be taken literally:

clojureql.core> (-> (join (-> (table nil :t1) (project [:c1]))
                          (-> (table nil :t2) (project [:c2]))
                          :id)
                    to-sql)
                    
"SELECT t1.c1,t2.c2 FROM t1 JOIN t2 USING(id)"

And here's the equivalent join with a custom predicate

clojureql.core> (-> (join (-> (table nil :t1) (project [:id :c1]))
                          (-> (table nil :t2) (project [:id :c2]))
                          (where (= :t1.id :t2.id)))
                    to-sql)
                    
"SELECT t1.id,t1.c1,t2.id,t2.c2 FROM t1 JOIN t2 ON (t1.id = t2.id)"

It's also a possibility to join a table unto itself using aliasing, like this next example where we find an employees manager:

clojureql.core> (-> (select (table {} {:employees :p})
                            (where (= :name "John")))
                    (join (table {} {:employees :b})
                          (where (= :p.manager :b.id)))
                    to-sql)
"SELECT p.*,b.* FROM employees p JOIN employees b ON (p.manager = b.id) WHERE (name = 'John')"

 

Outer-join

Outer-join does pretty much as advertised, it compiles to an outer-join either LEFT, RIGHT or FULL. It takes as its arguments 2 tables, the type of join as a keyword (:left|:right|:full) and a predicate (again, either keyword or clause):

clojureql.core> (-> (outer-join (table nil :t1)
                                (table nil :t2)
                                :left
                                (where (= :t1.id :t2.id)))
                    to-sql)
                    
"SELECT t1.*,t2.* FROM t1 LEFT OUTER JOIN t2 ON (t1.id = t2.id)"

Aggregate

Aggregate lets you select aggregates from your tables in a composable fashion (they dont need to be defined up front). It simply takes a table and a collection of aggregates. All aggregates in ClojureQL follow this syntax for singular fields: :function/field and this for multiple fields: :function/field1:field2:fieldn. If you need to get even fancier, use strings they are supported literally:

 

clojureql.core> (-> (table nil :users)
                    (aggregate [:count/* :avg/salary])
                    to-sql)
"SELECT count(*),avg(users.salary) FROM users"
clojureql.core> (-> (table nil :users)
                    (aggregate [:count/* :max/income:expenses])
                    to-sql)
"SELECT count(*),max(users.income,users.expenses) FROM users"
clojureql.core> (-> (table nil :users)
                    (aggregate ["corr(x + y)"])
                    to-sql)
"SELECT corr(x + y) FROM users"

Take/Drop

Just as with your regular collections in Clojure, you can use take/drop on tables and they act exactly the same way:

clojureql.core> (-> (table nil :users)
                    (take 4)
                    (drop 2)
                    to-sql)
"SELECT users.* FROM users   LIMIT 2,2"

Are you wondering why it says LIMIT 2,2 and not 2,4? Well consider this:

clojureql.core> (->> (range 10) (take 4) (drop 2))
(2 3)

ClojureQL handles sequences exactly as in Clojure (btw. that all works in the same namespace).

Sort

Sort takes a table, a column to ORDER BY and a direction (keyword, :asc|:desc):

clojureql.core> (-> (table nil :users)
                    (sort [:id#asc])
                    to-sql)
"SELECT users.* FROM users ORDER BY users.id ASC"

Conj!

Conj! inserts a record or records into the table. A record is simply a hashmap and you can put multiple hashmaps inside a vector:

clojureql.core> @(-> (table db :users)
                     (project [:id])
                     (conj! {:name "Frank"}))
({:id 1} {:id 4} {:id 5} {:id 6} {:id 7} {:id 8} {:id 9} {:id 10} {:id 11} {:id 12})

 

Disj!

disj! removes a rename which matches the supplied predicate:

clojureql.core> @(-> (table db :users)
                     (project [:id])
                     (disj!  (where (= :name "Frank"))))
({:id 1} {:id 4} {:id 5} {:id 6} {:id 7} {:id 8} {:id 9} {:id 10})

Update-In!

update-in! again does as advertised, it finds the row(s) matching the predicate and update the information on them. If no rows match the predicate a new one is created:

clojureql.core> @(-> (table db :users)
                     (project [:id])
                     (update-in! (where (= :name "Frank"))
                                 {:name "John"}))
({:id 1} {:id 4} {:id 5} {:id 6} {:id 7} {:id 8} {:id 9} {:id 10} {:id 13})

With-Results

With-results is an alternative to deref which lets you pass a body that will be lazily evaluated on the open resultset. It takes a table first, then a placeholder for the results, then the body of code: Updated: It takes a binding vector, where you bind results to a table:

clojureql.core> (with-results [rs users]
(doseq [r rs] (prn r)))
{:id 1} {:id 4} {:id 5} {:id 6} {:id 7} {:id 8} {:id 9} {:id 10} nil

If you pass the optional :fetch-size argument to the database object you code will be called on each chunk, allowing you to work with huge datasets.

With-Connection

If you've constructed a table with nil as the connection object, you can simply wrap any database interaction with that table in (with-connection db-obj ..).

 

Final words

I would like to thank both Meikel Brandmeyer and Nicolas Buduroi for helping me learn those tough lessons on the first, now scrapped, version of ClojureQL. I would also like to thank Christophe Grand who served as an inspiration to me in Frankfurt and finally a huge thank you goes out to Justin Balthrop who has fought the good design fight with me on every step of the way towards ClojureQL(2) 1.0!

Also my thanks go out to Chris Houser who helped review the screencast, provide feedback and even contributed a slogan:

"ClojureQL - bringing Relational Algebra back to Relational Databases"

I hope you all take this library out for a spin and send some feedback (good, bad, cruel, performance related, all is accepted) back to me. Thanks and have fun!


About the author:

Lau Jensen is the founder and owner of Best In Class a danish consultancy company which specializes in Clojure development.

Lau is also one of the instructors driving the Conj Labs initiative. If you would like to be notified once new blogposts are published, you can follow Lau on Twitter.

belun
2010-11-18 05:14:16
seeing works of art like this makes me happy i started learning clojure :)

although i don't fully understand the the complexity of clojureql... good job, mate (and his helping elfs) !
Glen Stampoultzis
2010-11-18 06:53:48
Love your work.  This looks really nice.  I was wondering.  What happens when a particular backend doesn't support some of the syntax?  For example I don't think all DB backends support Using.
tonyl
2010-11-18 11:12:07
this is a very interesting api. It will definitely help to construct complex queries and handle results much easier.

I'll take a look at it and use it in a project I have.

I have a question in your renaming the table example. Wouldn't the output query just query both tables, instead of aliasing to the new value? unless that is what you wanted to show with the hashmaps.
Alex Miller
2010-11-19 09:53:10
First, this looks great. From having working on a relational query planner, this matches my model of things very well.  Two things that seemed odd to me:

1) Using deref to get the table contents seemed a little weird.  Doesn't match my notion of deref very closely.

2) In the examples on this page you have project as something executed at the end of the chain, which matches my mental model of projection well.  In the video you often do the project *first* though which I found confusing.  I was curious what your comments were about that.  I haven't dug into the code itself yet.

Lau
2010-11-20 03:55:03
Thanks to all for your encouraging comments!

@Glen: In principle, if a backend doesnt support what you're doing you have 2 options. Use string-arguments where the syntax is not supported, ie. manually write the SQL. Or 2) Extend the compiler with your own multi-method. I'll have a tutorial up on this soon. It takes 2 lines to extend the compiler to handle a new function.

For your specific example about USING, simply supply a 'where' clause instead of a keyword, and CQL will compile without using USING.

@Tony: Its a goal of mine to make CQLs queries as effecient as possible - I dont want people disregarding this lib based on performance concerns. In the above example, I dont think MySQL queries the tables twice but I'll have to check.

@Alex
1) Deref simply means that you're holding an object which is a reference to a value and you want to get at that value. I view a Relational Table object as such, an object pointing to a value in a database.

2) Under the hood project merely modifies a field in the RTable object, so you can put it first, last or in the middle. Semantically it makes no difference, its not additive.