Working functionally with third party libraries

Fri Jul 17 08:41:20 PDT 2015

On Friday, 17 July 2015 at 09:07:29 UTC, Jarl André Hübenthal 
wrote:
> Thanks. Its a lot more cleaner and syntactically readable 
> having .array at the end. But about laziness the same applies 
> to clojure and scala. In clojure you must force evaluate the 
> list, in scala you must to mostly the same as in D, put a 
> toList or something at the end. Or loop it. But its pretty nice 
> to know that there is laziness in D, but when I query mongo I 
> expect all docs to be retrieved, since there are no paging in 
> the underlying queries? Thus, having a lazy functionality on 
> top of non lazy db queries seem a bit off dont you think?

I'm almost certain that the D database driver eagerly returns all 
the results that you've requested. The lazy stuff should happen 
when you start doing range operations after the results are 
returned from the database. It's not impossible to lazily query 
the database, but I think that the developers have chosen the 
eager approach, since it's more straightforward.
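
To illustrate the split between eager data and lazy range operations, here is a minimal sketch. The Doc struct and the hard-coded array are made up for the example and only stand in for a result set that has already been fetched; no real driver API is used.

```d
import std.algorithm.iteration : filter, map;
import std.array : array;
import std.stdio : writeln;

// Hypothetical document type, standing in for whatever the driver returns.
struct Doc { string name; int age; }

void main()
{
    // Pretend this array came back from the database in one go.
    Doc[] docs = [Doc("Ann", 30), Doc("Bob", 12), Doc("Cid", 45)];

    int mapCalls = 0;

    // Building the pipeline only creates small wrapper structs.
    // The mapping function has not been called for any element yet.
    auto adultNames = docs
        .filter!(d => d.age > 18)
        .map!((d) { ++mapCalls; return d.name; });
    assert(mapCalls == 0);

    // .array walks the pipeline and forces every element.
    string[] names = adultNames.array;
    assert(names == ["Ann", "Cid"]);
    assert(mapCalls > 0);

    writeln(names);
}
```

So the data is already in memory, but the filtering and mapping only run when something consumes the range (foreach, .array, and so on).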

Currently, in D most of the laziness is a convention, rather than 
something directly built into the language. There are many 
features (templates, auto type deduction, compile-time reflection, 
etc.) that indirectly enable effective and easy-to-use lazy 
algorithms, but these features have many other uses as well.

The only two direct features are:
1) foreach can iterate over ranges (objects of structs or classes 
for which isInputRange is true). Here's an example:

import std.algorithm.iteration : map, filter;
import std.stdio : writeln;

foreach (name; persons.filter!(p => p.age > 18).map!(p => p.name))
     writeln(name);

import std.range.primitives : isInputRange;

static assert (
     isInputRange!(
         typeof(
            persons.filter!(p => p.age > 18).map!(p => p.name)
         )
     ) == true
);

See http://dlang.org/phobos/std_range_primitives.html#isInputRange
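
To make "isInputRange is true" a bit more concrete, here is a small sketch of a hand-written range. Countdown is a hypothetical type invented for this example; the three members it defines (empty, front, popFront) are what isInputRange checks for.

```d
import std.range.primitives : isInputRange;
import std.stdio : writeln;

// Hypothetical example type: counts down from `current` to 1.
struct Countdown
{
    int current;

    @property bool empty() const { return current <= 0; }
    @property int front() const { return current; }
    void popFront() { --current; }
}

static assert(isInputRange!Countdown);

void main()
{
    // For a range, the compiler rewrites foreach roughly as:
    //   for (auto r = Countdown(3); !r.empty; r.popFront())
    //   {
    //       auto n = r.front;
    //       ...
    //   }
    foreach (n; Countdown(3))
        writeln(n);   // prints 3, then 2, then 1
}
```

Nothing here inherits from an interface; any struct with those three members works with foreach and with the Phobos algorithms.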

2) The lazy keyword - when you annotate function parameters with 
lazy, they are evaluated not at the call site, but only when 
needed, like in other more traditional functional languages. For 
example:

void calculate(int[] numbers)
{
     import std.format : format;
     // ...

     logErrorIf(numbers[3] < 5,
         format("Expected value < 5, but got %s !", numbers[3]));
     //  ^~~~~~~~~~~~ this is only evaluated

     // ...
}

void logErrorIf(bool condition, lazy string error_message)
{
     import std.stdio : writeln;

     if (condition)
         writeln(error_message);
     //          ^~~~~~~~~~~~ here, if the condition is true
}

( In D string is just an alias to immutable(char)[], so the above 
signature is identical to this:
void logErrorIf(bool condition, lazy immutable(char)[] 
error_message) )

You can think of lazy parameters as implicit lambdas that evaluate 
the expression passed as an argument only when called.
Here you can learn more about the lazy keyword: 
http://dlang.org/lazy-evaluation.html
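
The "implicit lambda" view can be sketched by writing the same function twice, once with lazy and once with an explicit delegate. All the names below (expensiveMessage, logIfLazy, logIfDelegate, the evaluations counter) are made up for the illustration.

```d
import std.stdio : writeln;

int evaluations = 0;

// Hypothetical helper with a visible side effect, so calls can be counted.
string expensiveMessage()
{
    ++evaluations;
    return "something went wrong";
}

void logIfLazy(bool condition, lazy string message)
{
    if (condition)
        writeln(message);     // the argument expression is evaluated here
}

// Roughly the same thing, with the lambda written out by hand.
void logIfDelegate(bool condition, string delegate() message)
{
    if (condition)
        writeln(message());
}

void main()
{
    logIfLazy(false, expensiveMessage());
    assert(evaluations == 0);   // condition was false: never evaluated

    logIfLazy(true, expensiveMessage());
    assert(evaluations == 1);   // evaluated once, inside logIfLazy

    logIfDelegate(false, () => expensiveMessage());
    assert(evaluations == 1);   // the explicit lambda was never called
}
```

The lazy version just saves the caller from typing the `() =>` part.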

Even though we have 'lazy' built into the language, most of the 
lazy algorithms do not use it. I just made a quick search through 
druntime and phobos for 'lazy' and 'range' (I don't know how 
correct it was - I admit I'm a unix noob) and here's what I got:

// (I have DMD v2.067.1 installed)

// lazy at the head of the function parameter list or in the tail
$ find /usr/include/dmd/ -name '*.d' -exec cat {} \; | grep -c '(lazy \|, lazy '
74

// just containing lazy
$ find /usr/include/dmd/ -name '*.d' -exec cat {} \; | grep -c 'lazy'
138

// just containing range
$ find /usr/include/dmd/ -name '*.d' -exec cat {} \; | grep -c 'range'
3548

I think that this is because ranges are a more generic, flexible 
and powerful abstraction, and maybe more efficient, because 
they're easier to optimize into simple loops (e.g. I've seen that 
the ldc compiler handles them very well).
'lazy' is still useful, but generally I have seen it used for 
simpler stuff (like the above 'lazy' example), and not for 
propagating state through range pipelines (or more simply - 
function chaining).

So you'll see both functions that are lazy and functions that are 
not throughout Phobos (and most use ranges, as you can see from 
the results).

Generally you can distinguish range functions from others by 
their signatures. Since most ranges in D are templated structs 
and not classes inheriting some interface
(though there are some, see 
http://dlang.org/phobos/std_range_interfaces#InputRange),
functions that operate on ranges are templated at least on one 
range type:

// Check if the function 'fun' is really a predicate
enum isUnaryPredicate(alias fun, T) =
     is( typeof( fun(T.init) ) : bool);

import std.range.primitives: isInputRange, ElementType;

// templated on predicate and range type
//                ~~~~v~~~~  ~~v~~
auto filter1(alias predicate, Range)(Range range)
     if (isInputRange!Range &&   //       <- some template
         isUnaryPredicate!(predicate, ElementType!Range))
     //  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  constraints
{
     return ...
}

// Even if you can't look at the function body, you
// can guess that it can't be lazy, because it must
// have the whole result before it returns it.
T[] filter2(alias predicate, T)(const(T)[] arrayToFilter)
     if (isUnaryPredicate!(predicate, T))
     //  ^~~~~~ fewer template constraints
{
{
     return ...
}

Here's a full example:
http://dpaste.dzfl.pl/9023c63f9393
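
In case the paste is not reachable, here is one way the two elided bodies could be filled in. This is my own minimal sketch, not the code from the link and not how Phobos implements filter; the LazyFilter struct is a name invented for the example.

```d
import std.range.primitives : isInputRange, ElementType,
                              empty, front, popFront;

// Check if the function 'fun' is really a predicate
enum isUnaryPredicate(alias fun, T) =
    is(typeof(fun(T.init)) : bool);

// Hypothetical wrapper: holds the source range and skips
// non-matching elements on demand.
struct LazyFilter(alias predicate, Range)
{
    private Range source;

    this(Range r)
    {
        source = r;
        skipNonMatching();
    }

    private void skipNonMatching()
    {
        while (!source.empty && !predicate(source.front))
            source.popFront();
    }

    @property bool empty() { return source.empty; }
    @property auto front() { return source.front; }

    void popFront()
    {
        source.popFront();
        skipNonMatching();
    }
}

// Lazy: returns immediately with a wrapper around the input.
auto filter1(alias predicate, Range)(Range range)
    if (isInputRange!Range &&
        isUnaryPredicate!(predicate, ElementType!Range))
{
    return LazyFilter!(predicate, Range)(range);
}

// Eager: must visit every element before it can return.
// (Sketch only: copying out of const(T)[] like this works for
// value types such as int.)
T[] filter2(alias predicate, T)(const(T)[] arrayToFilter)
    if (isUnaryPredicate!(predicate, T))
{
    T[] result;
    foreach (element; arrayToFilter)
        if (predicate(element))
            result ~= element;
    return result;
}

void main()
{
    import std.algorithm.comparison : equal;

    int[] numbers = [1, 2, 3, 4, 5, 6];

    auto lazyEvens = filter1!(x => x % 2 == 0)(numbers);
    static assert(isInputRange!(typeof(lazyEvens)));
    assert(equal(lazyEvens, [2, 4, 6]));

    int[] eagerEvens = filter2!(x => x % 2 == 0)(numbers);
    assert(eagerEvens == [2, 4, 6]);
}
```

Note how the return types give it away: filter1 returns some range struct (hence auto), while filter2 has to return a finished T[].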