Friday, February 29, 2008

Linq to Objects vs Linq to SQL and custom predicates

One of the most interesting features of Linq is the ability to use arbitrarily complicated predicates inside linq expressions. You are probably also aware that there are two equivalent syntax possibilities:

List<int> l = new List<int> { 1, 2, 3, 4, 5 };
 
/* linq-style syntax */
foreach ( int v in
    from i in l
    where i > 2
    orderby -i
    select i )
    Console.Write( v );
 
/* is translated to method invocation sequence */
foreach ( int v in l.Where( i => i > 2 ).OrderBy( i => -i ) )
    Console.Write( v );

No matter which syntax you prefer, sooner or later you will realize that spelling out filtering and ordering clauses explicitly inside a linq expression is not a good way to write reusable code.


What if you actually have defined some filtering/sorting clauses in the business logic of the application and you wish to be able to use them both in application logic and in linq expressions?


Let's write a predicate and try to use it in linq-style syntax and method-invocation-sequence syntax.

static Func<int, bool> NumberIsGreaterThan( int K )
{
    return i => i > K;
}


The latter is easy:

foreach ( int v in l.Where( NumberIsGreaterThan( 2 ) ).OrderBy( i => -i ) )
    Console.Write( v );


The former can be expressed in two different ways. The first is a method invocation inside the linq-style expression:

foreach ( int v in
    from i in l.Where( NumberIsGreaterThan( 2 ) )
    orderby -i
    select i )
    Console.Write( v );


or the application of the predicate to the parameter:

foreach ( int v in
    from i in l
    where NumberIsGreaterThan( 2 )(i)
    orderby -i
    select i )
    Console.Write( v );


The last example deserves an explanation. NumberIsGreaterThan( 2 ) returns a value of type Func<int,bool>, while the where part of a linq expression expects a bool expression. I get one by applying the returned predicate to the parameter i.
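To make the double call less mysterious, here is a minimal sketch that splits it into two steps. The class name PredicateDemo and the local variable names are mine, chosen only for illustration.

```csharp
using System;

public static class PredicateDemo
{
    public static Func<int, bool> NumberIsGreaterThan( int K )
    {
        return i => i > K;
    }

    public static void Main()
    {
        // Step 1: calling the method returns a delegate, not a bool.
        Func<int, bool> greaterThanTwo = NumberIsGreaterThan( 2 );

        // Step 2: invoking the delegate with an argument yields the bool
        // that the where clause needs.
        bool result = greaterThanTwo( 5 );
        Console.WriteLine( result ); // True

        // NumberIsGreaterThan( 2 )( 1 ) is both steps in a single expression.
        Console.WriteLine( NumberIsGreaterThan( 2 )( 1 ) ); // False
    }
}
```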

The bubble bursts when you realize that predicates expressed the way we did above do not work in Linq to SQL. You'll just end up with a nice exception saying that the "method WhateverYouCallIt has no supported translation to SQL".

Why is this so? You see, there are in fact two different linq approaches.

The first one is built into .NET, works on IEnumerable collections and you know it as Linq to Objects. The other one operates on IQueryable collections, and .NET ships with only a few example implementations of it.

In the latter case, instead of emitting a sequence of method calls that run directly, the compiler builds an expression tree for the whole linq expression and passes the tree to the IQueryable implementation saying: "do whatever you like with this expression tree, it's up to you".

(One specific implementation, known as Linq to SQL, just traverses the expression tree and builds a SQL clause that corresponds to the tree.)
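You can watch this happen without a database. The sketch below assumes the in-memory provider you get from AsQueryable() as a stand-in for a real IQueryable implementation; the class name QueryableDemo and the method BuildQuery are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryableDemo
{
    public static IQueryable<int> BuildQuery( List<int> source )
    {
        // AsQueryable() wraps the list in the in-memory IQueryable provider.
        return source.AsQueryable().Where( i => i > 2 );
    }

    public static void Main()
    {
        List<int> l = new List<int> { 1, 2, 3, 4, 5 };
        IQueryable<int> q = BuildQuery( l );

        // Nothing has been filtered yet. The query only holds a description
        // of the call to Where, which the provider is free to interpret.
        MethodCallExpression call = (MethodCallExpression)q.Expression;
        Console.WriteLine( call.Method.Name ); // Where

        // Enumerating the query is what asks the provider to act on the tree.
        foreach ( int v in q )
            Console.Write( v ); // 345
    }
}
```

Linq to SQL would turn the same tree into a WHERE clause instead of running it in memory.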

What's the relation of this to our example with custom predicates in linq expressions?

Well, it seems that if, instead of passing Func<T, bool> as a filtering predicate, you pass its expression tree, Linq to SQL will be able to step into this tree and build a SQL clause for it.

And how do you build an expression tree from the function? Well, you'll be surprised:



static Expression<Func<int, bool>> NumberIsGreaterThan( int K )
{
    return i => i > K;
}


Compare it with the former definition:



static Func<int, bool> NumberIsGreaterThan( int K )
{
    return i => i > K;
}


Is that all? Does it really make such a huge difference to put Expression<...> over the method's signature?

Well, yes, it does. By declaring the return type as Expression<...> you tell the compiler to automatically build an expression tree for the lambda rather than compile it into an ordinary delegate.
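A quick way to convince yourself is to look inside the returned object. This is a minimal sketch; the class name ExpressionDemo is hypothetical.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionDemo
{
    public static Expression<Func<int, bool>> NumberIsGreaterThan( int K )
    {
        return i => i > K;
    }

    public static void Main()
    {
        Expression<Func<int, bool>> e = NumberIsGreaterThan( 2 );

        // The lambda is now data: it has a parameter list and a body
        // that can be inspected node by node.
        Console.WriteLine( e.Parameters[0].Name ); // i
        Console.WriteLine( e.Body.NodeType );      // GreaterThan

        // The left operand of the comparison is the parameter i;
        // the right operand refers to the captured value of K.
        BinaryExpression body = (BinaryExpression)e.Body;
        Console.WriteLine( body.Left.NodeType );   // Parameter
    }
}
```

A plain Func<int, bool> offers nothing comparable. You can only invoke it.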

(you can also build Expressions directly in the code but it's beyond the scope of this post).
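Just to give a taste of what the compiler saves you from, here is a rough sketch of the same predicate assembled by hand with the factory methods of the Expression class. The class name ManualExpressionDemo is hypothetical, and note one difference: K is embedded here as a constant, while the compiler-generated tree refers to the captured variable.

```csharp
using System;
using System.Linq.Expressions;

public static class ManualExpressionDemo
{
    public static Expression<Func<int, bool>> NumberIsGreaterThan( int K )
    {
        // The parameter "i" of the lambda.
        ParameterExpression i = Expression.Parameter( typeof( int ), "i" );

        // The body "i > K", with K embedded as a constant node.
        BinaryExpression body = Expression.GreaterThan( i, Expression.Constant( K ) );

        // Wrap the body and the parameter into a typed lambda expression.
        return Expression.Lambda<Func<int, bool>>( body, i );
    }

    public static void Main()
    {
        Func<int, bool> f = NumberIsGreaterThan( 2 ).Compile();
        Console.WriteLine( f( 3 ) ); // True
        Console.WriteLine( f( 2 ) ); // False
    }
}
```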

This tiny change makes a significant difference for Linq to SQL and other IQueryable implementations. Now that an expression tree is passed into the query, the tree walker will likely handle the structure of the predicate correctly (even though the predicate body is defined outside of the linq expression).
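Here is a minimal sketch of the usage on the IQueryable side. I am assuming the in-memory provider from AsQueryable() as a stand-in for a Linq to SQL table, so no SQL is generated here; the class name QueryablePredicateDemo and the method Run are hypothetical. The predicate is handed straight to Where, so the provider receives the tree itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryablePredicateDemo
{
    public static Expression<Func<int, bool>> NumberIsGreaterThan( int K )
    {
        return i => i > K;
    }

    public static List<int> Run( IQueryable<int> source )
    {
        // Where( Expression<Func<int, bool>> ) is the IQueryable overload,
        // so the predicate tree becomes part of the query's expression tree.
        return source
            .Where( NumberIsGreaterThan( 2 ) )
            .OrderBy( i => -i )
            .ToList();
    }

    public static void Main()
    {
        List<int> l = new List<int> { 1, 2, 3, 4, 5 };

        // Stand-in for a real IQueryable source such as a database table.
        foreach ( int v in Run( l.AsQueryable() ) )
            Console.Write( v ); // 543
    }
}
```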

Is that all?

Unfortunately, no. Defining predicates as Expression<...> helps IQueryable implementations but confuses Linq to Objects!

Why?! Well, Linq to Objects expects predicates to be defined as Func<T,bool>, not Expression<Func<T,bool>>! Is there a way to fix it?

Fortunately, yes, there is. You can "deexpression" the expression tree by compiling it using the Compile() method:



static Expression<Func<int, bool>> NumberIsGreaterThan( int K )
{
     return i => i > K;
}
 
...
 
/* all three syntax possibilities */
 
foreach ( int v in
     from i in l
     where NumberIsGreaterThan( 2 ).Compile()(i)
     orderby -i
     select i )
     Console.Write( v );
 
foreach ( int v in
     from i in l.Where( NumberIsGreaterThan( 2 ).Compile() )
     orderby -i
     select i )
     Console.Write( v );
 
foreach ( int v in l.Where( NumberIsGreaterThan( 2 ).Compile() ).OrderBy( i => -i ) )
     Console.Write( v );
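One caveat about the first of the three forms: the call to Compile() sits inside the where clause, so it runs again for every element of the list. A small sketch of how I would avoid that by compiling once up front (the class name CompileOnceDemo and the method Filter are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class CompileOnceDemo
{
    public static Expression<Func<int, bool>> NumberIsGreaterThan( int K )
    {
        return i => i > K;
    }

    public static List<int> Filter( List<int> l )
    {
        // Compile the expression tree into a delegate a single time...
        Func<int, bool> greaterThanTwo = NumberIsGreaterThan( 2 ).Compile();

        // ...and reuse that delegate for every element.
        return ( from i in l
                 where greaterThanTwo( i )
                 orderby -i
                 select i ).ToList();
    }

    public static void Main()
    {
        List<int> l = new List<int> { 1, 2, 3, 4, 5 };
        foreach ( int v in Filter( l ) )
            Console.Write( v ); // 543
    }
}
```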


Be aware of these differences between Linq to Objects and Linq to something-implementing-IQueryable and remember that in both cases you can use clauses defined as methods or properties in the business logic layer of your application.
