Skip to Content

Blog

The sdg Blog is designed to show you who we are and what we're all about. Whether you're looking to read up on the latest technologies, trying to improve your soft skills, or wondering what we've been up to, our blog offers something for everybody.

Back

Reasoning at Runtime: CLI Expression Trees

I confess. I used CLI expression trees without understanding them. I dove into every Linq-to-X, happy to treat it like a magic black box. Linq providers accept lambda expressions to filter, transform, and order results. I assumed these lambdas defined delegates and didn’t realize they were part of a new way to represent and bind instructions.

Later, I stumbled on expression trees without Linq and knew I missed something big. There was something special about them, but exactly what wasn’t immediately clear. Online examples didn’t agree. Some were impenetrable, while others solved problems that could easily be solved without expression trees. I knew there must be a simple task that’s easy with an expression tree but impossible with any other CLI language feature. This is what I found.

The Basics

An expression tree is a tree data structure that stores expressions in its nodes. What does that mean? Starting from the leaves and moving toward the root, each node holds a value or an operation that results in a value. (Actually, expression trees are not limited to expressions. Statements are also allowed. An expression may return void). This value is then available to its parent. When you reach the root, the value remaining is the value of the expression. Consider an expression that squares an integer:

Expression square = (x) => x * x;

Its tree would look like this:

We can build the same tree explicitly with code:

var parm = Expression.Parameter (typeof(int), "x");
var body = BinaryExpression.Multiply (parm, parm);
var square = Expression.Lambda (body, parm);

Conceptually, during evaluation an integer is bound to the ParameterExpression “x”. This integer is then used in the BinaryExpression via its Left and Right reference. The BinaryExpression multiplies Left and Right and the result works its way to the root as the final value of the tree.

In reality, expression trees aren’t evaluated as trees. They’re compiled to delegates at runtime and the delegates are executed. This is an important distinction. Expression trees are compiled at runtime. This provides flexibility that was previously only available via features like reflection. The question is, can we do things with this flexibility that we can’t with reflection?

More examples

A favorite non-Linq expression tree task is the dynamic sort. An implementation might look like this:

public static IEnumerable DynamicSort (
    IQueryable items, string propertyName, bool ascending)
    {
        var parm = Expression.Parameter (typeof(T), "i");
        var conversion = Expression.Convert (
Expression.Property(parm, propertyName), typeof(object));
        var getProperty = Expression.Lambda (conversion, parm);
 
        return ascending ? items.OrderBy (getProperty) :
items.OrderByDescending (getProperty);
    }

It doesn’t take long to imagine a non-expression tree alternative:

public static IEnumerable DynamicSort (
    IEnumerable items, string propertyName, bool ascending)
    {
        var propInfo = typeof(T).GetProperty (propertyName);
        return ascending ? items.OrderBy (i => propInfo.GetValue (i, null)) :
                items.OrderByDescending (i => propInfo.GetValue (i, null));
    }

Sure, the reflection-driven method will likely be slower since the expression tree can take advantage of compilation, but given enough time, there’s nothing the expression tree can do that reflection can’t.

Other examples emphasize the replacement of string constants with expressions or functional composition to increase code reuse. Both are interesting and valuable, but neither achieve behavior that isn’t possible without expression trees. We already minimize the danger of raw strings with constants and enums. In addition, functional composition is possible with Func and Action delegates. Expression trees aren’t required.

The problem may be that non-dynamic CLI developers are conditioned to design with a compile-time attitude. When we wire-up validators, map objects, or even sort on-the-fly, we know all of the possible outcomes at compile-time. Runtime behavior is convenient, but it’s not necessary. So, what sort of things can we only do when the program is running?

Analyzing Predicates

Imagine an API that manages albums. It provides a way for users to filter album results based on a predicate.

public IEnumerable Filter(Func predicate)
{
    Albums.All().Where(i => predicate(i));
}

This works great as long as your album count is manageable and your users write well-behaved predicates. What happens if they don’t? Imagine you discover that excessive or (||) conditions are killing your performance.


var items = Filter ((i) => i.Year > 2000 || i.Genre == Genre.GarageRock
                               || i.Artist.Equals ("Jens Lekman")
                               || i.Year < 1976 && i.Artist.IndexOf(“Oh Sees”) > -1
                               || i.Title.StartsWith ("Thee")).ToList ();

As Filter is written, there’s little you can do. You can warn users with documentation and blame them when they ignore it, but they’ll still believe you provided a slow API. If you use expression trees, however, you could enforce a runtime “contract” that prevents or conditions altogether.

public bool HasOrNode (Expression expression)
{
    var binary = expression as BinaryExpression;
    if (binary != null) {
        if (binary.NodeType == ExpressionType.OrElse)
            return true;
        else
            return HasOrNode (binary.Left) || HasOrNode (binary.Right);
    }
 
    return false;
}
 
public IEnumerable Filter (
    Expression predicateExpression)
{
    if (HasOrNode (predicateExpression.Body))
        throw new Exception ("No OrElse predicates please.");
 
    var predicate = predicateExpression.Compile ();
    Albums.All().Where(i => predicate(i));
}

HasOrNode is woefully underpowered to handle an actual expression tree, but it illustrates a point. We are responding to something we can only know at runtime. Not only that, we’re responding to code versus data. This, finally, is what makes expression trees unique. Manipulating both code and data is powerful. Lisp hackers have known this for decades. In the CLI, it’s what allows Linq providers to transform expression trees to SQL or XPath queries.

As these ideas sunk in, I imagined new possibilities. You could:

  • Log actions versus data
  • Change data strategies based on common actions versus the data itself
  • Rewrite complex trees into more efficient forms

Expression trees add a dynamic touch to C# and VB.NET. They’re not necessarily pretty or easy-to-use, but they solve a class of problems that were previously unsolvable in the language. Get to know them. Chances are you’re already using them.