select – Schneide Blog

name │ category │ value ------------------------- A │ X │ 52 B │ X │ 35 C │ X │ 52 D │ Y │ 27 E │ Y │ 31 F │ Y │ 20

Let’s assume we have a simple JavaScript “class” called Module. Each instance of the class has a name, a start() method and a stop() method to manage its lifecycle:

function Module(name) {
    this.name = name;
    console.log("Creating " + this.name);
}
Module.prototype.start = function() {
    console.log("Starting " + this.name);
};
Module.prototype.stop = function() {
    console.log("Stopping " + this.name);
};

We want to create a couple of instances with the names “a”, “b” and “c”. At the beginning of the program we want to start each module, and at the end of the program we want to stop each module. For the creation of the instances we use a map() function call on the names array:

var names = ["a", "b", "c"];
var modules = names.map(function(name) {
    return new Module(name);
});
modules.forEach(function(module) {
    module.start();
});
// do something
modules.forEach(function(module) {
    module.stop();
});

The output is as intended:

Creating a
Creating b
Creating c
Starting a
Starting b
Starting c
Stopping a
Stopping b
Stopping c

Now we want to port this code to C#. The definition of the class is straight-forward:

class Module
{
    private readonly String name;

    public Module(string name)
    {
        this.name = name;
        Console.WriteLine("Creating " + name);
    }

    public void Start()
    {
        Console.WriteLine("Starting " + name);
    }

    public void Stop()
    {
        Console.WriteLine("Stopping " + name);
    }
}

The map() function is called Select() in .NET:

var names = new List<string>{"a", "b", "c"};
var modules = names.Select(
                 name => new Module(name));

foreach (var module in modules)
{
    module.Start();
}

foreach (var module in modules)
{
    module.Stop();
}

But when we run this program, we get a completely different output:

Creating a
Starting a
Creating b
Starting b
Creating c
Starting c
Creating a
Stopping a
Creating b
Stopping b
Creating c
Stopping c

Each module is created twice, and the creation calls are interleaved with the start() and stop() calls.

What has happened?

The answer is that .NET’s Select() method does lazy evaluation. It does not return a new list with the mapped elements. It returns an IEnumerable instead, which evaluates each mapping operation only when needed. This is a very useful concept. It allows for the chaining of multiple operations without creating an intermediate list each time. It also allows for operations on infinite sequences.

But in our case it’s not what we want. The stopped instances are not the same as the started instances.

How can we fix it?

By appending a .ToList() call after the .Select() call:

var modules = names.Select(
        name => new Module(name)).ToList();

Now the IEnumerable gets evaluated and collected into a list before the assignment to the modules variable.

So be aware of whether your programming language or framework uses lazy or eager evaluation for functional collection operations to avoid running into subtle bugs. Other examples of tools based on the concept of lazy evaluation are the Java stream API or the Haskell programming language. Some languages support both, for example Ruby since version 2.0:

range.collect { |x| x*x }
range.lazy.collect { |x| x*x }

	Writing Integration… on Every Unit Test Is a Stage Pla…
	mariuselvert on C# is very strict about modify…
	Anonymous on C# is very strict about modify…
	Anonymous on Cache configuration with WildF…
	Miq on Nested queries like N+1 in pra…

Tag: select

PostgreSQL’s “DISTINCT ON” clause

The complicated way

The easy way

Be(a)ware of Laziness