Java streams Vs LINQ Vs D
bearophile
bearophileHUGS at lycos.com
Wed Mar 27 15:19:00 PDT 2013
Linked on Reddit I've seen a nice comparison of Java streams Vs
dotnet LINQ:
http://blog.informatech.cr/2013/03/24/java-streams-preview-vs-net-linq/
Although the list isn't complete, those little challenges are
well chosen: they are commonly performed operations. So I have
translated them to D with Phobos. For most of them I have found
a nice D translation, but a few of them uncover holes in Phobos
that I already knew about. (Maybe some of them are not really
Phobos holes, but just my lack of knowledge about Phobos and D,
so your better solutions are welcome.)
If you want to read the whole list of my translations:
http://codepad.org/0KtXu7nh
Below I list just the five troubled challenges, with the LINQ
solution followed by one or more D solutions.
For all the solutions I import several modules:
import std.stdio, std.algorithm, std.range, std.typecons,
       std.traits, std.array, std.string;
- - - - - - - - - - - -
Challenge 2: Indexed Filtering
Find all the names in the array "names" where the length of the
name is less than or equal to the index of the element + 1.
string[] names = { "Sam", "Pamela", "Dave", "Pascal", "Erik" };
var nameList = names.Where((c, index) => c.Length <= index + 1).ToList();
In D:
auto names2 = ["Sam", "Pamela", "Dave", "Pascal", "Erik"];
auto nameRange = iota(size_t.max)
                 .zip(names2)
                 .filter!q{ a[1].length <= a[0] + 1 }
                 .map!q{ a[1] };
nameRange.writeln;
On Bugzilla I have proposed to add an enumerate():
http://d.puremagic.com/issues/show_bug.cgi?id=5550
With it the D code improves:
auto nameRange2 = names2
                  .enumerate
                  .filter!q{ a[1].length <= a[0] + 1 }
                  .map!q{ a[1] };
nameRange2.writeln;
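For readers on a newer compiler: later Phobos releases do provide an
enumerate in std.range, whose elements are tuples with named index and
value fields. A minimal sketch, assuming such a release (the shortNames
wrapper is just an illustrative name, not something from the post):

```d
import std.algorithm : filter, map;
import std.array : array;
import std.range : enumerate;
import std.stdio : writeln;

// Illustrative wrapper (assumed name): keeps the names whose length
// is less than or equal to their zero-based index + 1.
string[] shortNames(string[] names)
{
    return names
        .enumerate                                      // (index, value) tuples
        .filter!(t => t.value.length <= t.index + 1)
        .map!(t => t.value)
        .array;
}

void main()
{
    auto names2 = ["Sam", "Pamela", "Dave", "Pascal", "Erik"];
    writeln(shortNames(names2)); // prints ["Erik"]
}
```

The named tuple fields avoid the a[0]/a[1] indexing of the string
lambdas, at the cost of slightly longer predicates.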
If D gains a syntax to unpack tuples in function signatures the
code becomes (untested):
auto nameRange2 = names2
                  .enumerate
                  .filter!((i, n) => n.length <= i + 1)
                  .map!q{ a[1] };
nameRange2.writeln;
Besides adding enumerate(), which is useful in many other
situations, another (not alternative!) idea is to add
iFilter/iMap (meaning indexed filter and indexed map), where
the filtering or mapping function receives an index+item
2-tuple:
auto nameRange2 = names2.iFilter!((i, a) => a.length <= i + 1);
Or equivalently:
auto nameRange2 = names2.iFilter!q{ a.length <= i + 1 };
Indexed functions of this kind (for example mapi) are present in
the standard library of the F# language.
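To make the iFilter idea concrete, here is a rough sketch of how such a
helper could be layered on top of zip and iota. iFilter is hypothetical
(it is not in Phobos), and this version only supports lambda predicates
taking (index, item), not the string form:

```d
import std.algorithm : filter, map;
import std.array : array;
import std.range : iota, zip;
import std.stdio : writeln;

// Hypothetical helper, not part of Phobos: lazily keeps the items
// for which pred(index, item) is true. Indexes start at 0.
auto iFilter(alias pred, R)(R range)
{
    return zip(iota(size_t.max), range)
        .filter!(t => pred(t[0], t[1]))
        .map!(t => t[1]);
}

void main()
{
    auto names2 = ["Sam", "Pamela", "Dave", "Pascal", "Erik"];
    // Same condition as the challenge: length <= index + 1.
    auto result = names2.iFilter!((i, a) => a.length <= i + 1).array;
    writeln(result); // prints ["Erik"]
}
```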
- - - - - - - - - - - -
Challenge 3: Selecting/Mapping
Say we have a list of names and we would like to print “Hello” in
front of all the names:
List<string> nameList1 = new List<string>() { "Anders", "David", "James",
                                              "Jeff", "Joe", "Erik" };
nameList1.Select(c => "Hello! " + c).ToList()
         .ForEach(c => Console.WriteLine(c));
In Phobos there is no forEach(), so you have to use foreach:
auto nameList1 = ["Anders", "David", "James", "Jeff", "Joe", "Erik"];
foreach (name; nameList1)
    writeln("Hello! ", name);
The only advantage I see of a forEach() over a foreach statement is
that it's usable at the end of a UFCS chain.
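For reference, later Phobos releases add each to std.algorithm, which
can sit at the end of a UFCS chain in this way. A small sketch, assuming
such a release; the greetings function is an illustrative name introduced
only to keep the mapping step separate from the printing:

```d
import std.algorithm : each, map;
import std.array : array;
import std.stdio : writeln;

// Illustrative helper (assumed name): prefixes every name with "Hello! ".
string[] greetings(string[] names)
{
    return names.map!(n => "Hello! " ~ n).array;
}

void main()
{
    auto nameList1 = ["Anders", "David", "James", "Jeff", "Joe", "Erik"];
    // each applies writeln to every element, one per line.
    nameList1.greetings.each!writeln;
}
```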
- - - - - - - - - - - -
Challenge 12: Grouping by a Criterion
Group the elements of a collection of strings by their length.
string[] names = {"Sam", "Samuel", "Samu", "Ravi", "Ratna",
"Barsha"};
var groups = names.GroupBy(c => c.Length);
In Phobos there is a group() but it can't be used here because it
returns just one of the equivalent grouped items. And I can't use
std.array.assocArray for similar reasons.
auto names3 = ["Sam", "Samuel", "Samu", "Ravi", "Ratna", "Barsha"];
string[][size_t] groups;
foreach (name; names3)
    groups[name.length] ~= name;
groups.byValue.writeln;
Andrei has recently written a groupBy, not yet merged:
https://github.com/D-Programming-Language/phobos/pull/1186
Using that future groupBy the D code improves a little (untested;
in DMD 2.063 schwartzSort accepts a string literal too):
auto names3 = ["Sam", "Samuel", "Samu", "Ravi", "Ratna", "Barsha"];
auto groups = names3
              .schwartzSort!q{ a.length }
              .groupBy!q{ a.length == b.length };
groups.writeln;
By the way, I like Python for having a free len() function that's
usable for higher order functions like map and filter. In Phobos
there is walkLength():
auto names3 = ["Sam", "Samuel", "Samu", "Ravi", "Ratna", "Barsha"];
auto groups = names3
              .schwartzSort!walkLength
              .groupBy!q{ a.walkLength == b.walkLength };
groups.writeln;
Unlike schwartzSort, the Phobos group/groupBy take a comparison
function like "a.length == b.length" instead of a less flexible
but handier single-argument key function like "c => c.Length". So
I'd like something like a keyGroup/keyGroupBy that accepts a
single-argument function, as schwartzSort does. (And I'd like
schwartzSort to be renamed "keySort".)
auto names3 = ["Sam", "Samuel", "Samu", "Ravi", "Ratna", "Barsha"];
auto groups = names3
              .schwartzSort!walkLength
              .keyGroupBy!walkLength;
groups.writeln;
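For what it's worth, later Phobos releases ship a chunkBy in
std.algorithm that accepts a single-argument key function and yields
(key, chunk) tuples, which is close to the keyGroupBy wished for above.
A sketch assuming such a release; groupByLength is a name invented here,
and the stable sort is chosen only so that names keep their original
order inside each group:

```d
import std.algorithm : chunkBy, map, sort, SwapStrategy;
import std.array : array;
import std.stdio : writeln;

// Illustrative helper (assumed name): groups names by length.
// Works on a copy so the caller's array is not reordered.
string[][] groupByLength(string[] names)
{
    return names.dup
        .sort!((a, b) => a.length < b.length, SwapStrategy.stable)
        .chunkBy!(a => a.length)   // range of (key, chunk) tuples
        .map!(g => g[1].array)     // keep only the chunk
        .array;
}

void main()
{
    auto names3 = ["Sam", "Samuel", "Samu", "Ravi", "Ratna", "Barsha"];
    writeln(groupByLength(names3));
}
```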
Another problem with group/groupBy is that they only merge adjacent
items, so the input must be sorted first. But a hash-based O(n)
group/groupBy is also conceivable, potentially faster, and leading
to simpler code, because you don't need to sort the items first:
auto names3 = ["Sam", "Samuel", "Samu", "Ravi", "Ratna", "Barsha"];
auto groups = names3.hashKeyGroupBy!walkLength;
groups.writeln;
Uhm. The name "hashKeyGroupBy" is becoming a bit too complex :-)
So maybe it's better not to go there.
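Whatever it ends up being called, the hash-based idea is little more
than the associative array loop shown earlier, wrapped in a template. A
rough sketch follows; hashGroupBy is a hypothetical name, nothing like
it is claimed to exist in Phobos, and it is eager rather than lazy:

```d
import std.range : ElementType;
import std.stdio : writeln;

// Hypothetical helper (not in Phobos): groups the items of `range`
// into an associative array indexed by key(item). One pass over the
// input and no sorting needed; the order of the keys is unspecified.
auto hashGroupBy(alias key, R)(R range)
{
    alias E = ElementType!R;
    alias K = typeof(key(E.init));
    E[][K] groups;
    foreach (item; range)
        groups[key(item)] ~= item;
    return groups;
}

void main()
{
    auto names3 = ["Sam", "Samuel", "Samu", "Ravi", "Ratna", "Barsha"];
    auto groups = names3.hashGroupBy!(s => s.length);
    writeln(groups[6]); // prints ["Samuel", "Barsha"]
}
```

Within each group the items keep their input order, which a sort-based
approach only guarantees with a stable sort.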
- - - - - - - - - - - -
Challenge 13: Filter Distinct Elements
Obtain all the distinct elements from a collection.
string[] songIds = {"Song#1", "Song#2", "Song#2", "Song#2",
"Song#3", "Song#1"};
var uniqueSongIds = songIds.Distinct();
This is not too bad in D: there is uniq(), but first you
need to .sort or .dup.sort or .array.sort the original
array/range:
auto songIds = ["Song#1", "Song#2", "Song#2", "Song#2",
                "Song#3", "Song#1"];
auto uniqueSongIds = songIds.sort().uniq;
uniqueSongIds.writeln;
A hash-based uniq that doesn't need a previous sorting is
conceivable. But see also below.
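A hash-based version could look roughly like this. hashUniq is a
hypothetical name, and this sketch is eager and array-only, using an
associative array as the "seen" set:

```d
import std.stdio : writeln;

// Hypothetical helper (not in Phobos): returns the distinct items of
// `items` in order of first appearance, without sorting the input.
T[] hashUniq(T)(T[] items)
{
    bool[T] seen;
    T[] result;
    foreach (item; items)
    {
        if (item !in seen)
        {
            seen[item] = true;
            result ~= item;
        }
    }
    return result;
}

void main()
{
    auto songIds = ["Song#1", "Song#2", "Song#2", "Song#2",
                    "Song#3", "Song#1"];
    writeln(hashUniq(songIds)); // prints ["Song#1", "Song#2", "Song#3"]
}
```

Unlike sort().uniq, this leaves the original array untouched and keeps
the first-seen order.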
- - - - - - - - - - - -
Challenge 14: Union of Two Sets
Join together two sets of items.
LINQ
List<string> friends1 = new List<string>() { "Anders", "David", "James",
                                             "Jeff", "Joe", "Erik" };
List<string> friends2 = new List<string>() { "Erik", "David", "Derik" };
var allMyFriends = friends1.Union(friends2);
This seems a bit too complex to do in D+Phobos:
auto friends1 = ["Anders", "David", "James", "Jeff", "Joe", "Erik"];
auto friends2 = ["Erik", "David", "Derik"];
auto allMyFriends =
    friends1.sort().setUnion(friends2.sort()).uniq;
allMyFriends.writeln;
Note that you have to call uniq at the end because that's not a
set union; the function is badly named. A better name for it is
"bagUnion", because it doesn't remove duplicates, and a set
operation should.
For Challenges 13 and 14 I suggest not adding more functions
to std.algorithm, and instead just relying on a set data structure,
as in Python:
>>> song_ids = ["Song#1", "Song#2", "Song#2", "Song#2", "Song#3", "Song#1"]
>>> set(song_ids)
set(['Song#1', 'Song#2', 'Song#3'])
>>> friends1 = ["Anders", "David","James", "Jeff", "Joe", "Erik"]
>>> friends2 = ["Erik", "David", "Derik"]
>>> set(friends1).union(friends2)
set(['Erik', 'Joe', 'Jeff', 'Derik', 'James', 'Anders', 'David'])
In my D1 dlibs I had a Set!T data structure (with a set() helper
function) that offered a similar syntax (here I use D2 UFCS):
auto songIds = ["Song#1", "Song#2", "Song#2", "Song#2",
                "Song#3", "Song#1"];
auto uniqueSongIds = songIds.set;
auto friends1 = ["Anders", "David", "James", "Jeff", "Joe", "Erik"];
auto friends2 = ["Erik", "David", "Derik"];
auto allMyFriends = friends1.set.united(friends2);
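The dlibs code itself is not shown here, so the following is only an
illustrative stand-in for what a minimal Set!T with a set() helper and a
united() method might look like on top of an associative array. All the
member names other than set and united (insert, contains, length) are
assumptions made for this sketch:

```d
import std.stdio : writeln;

// Illustrative stand-in, NOT the actual dlibs Set!T: a thin wrapper
// around an associative array whose keys are the set items.
struct Set(T)
{
    private bool[T] items;

    void insert(T item) { items[item] = true; }

    bool contains(T item) const { return (item in items) !is null; }

    @property size_t length() const { return items.length; }

    // Returns a new set holding the items of this set plus `other`.
    Set united(R)(R other)
    {
        Set result;
        foreach (item; items.byKey)
            result.insert(item);
        foreach (item; other)
            result.insert(item);
        return result;
    }
}

// Helper that builds a set from an array, usable through UFCS.
Set!T set(T)(T[] arr)
{
    Set!T result;
    foreach (item; arr)
        result.insert(item);
    return result;
}

void main()
{
    auto friends1 = ["Anders", "David", "James", "Jeff", "Joe", "Erik"];
    auto friends2 = ["Erik", "David", "Derik"];
    auto allMyFriends = friends1.set.united(friends2);
    writeln(allMyFriends.length); // prints 7
}
```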
- - - - - - - - - - - -
Bye,
bearophile
More information about the Digitalmars-d
mailing list