[Feature Request] Adding to D lang "set" built-in

Russel Winder russel at winder.org.uk
Mon May 7 06:59:45 PDT 2012


On Sun, 2012-05-06 at 23:23 -0700, Jonathan M Davis wrote:

<Also, effectively, replying to David Nadlinger's email here as well
since his thoughts echoed much of what is here...>

> There are multiple ways to implement a set or a map. If a programmer knows 
> their data structures (as one would hope that they would), then they know the 
> difference between a hash set and a tree set (or hash map and tree map), and 
> they'll pick the one that's most appropriate for what they're doing. And 
> because the containers are very similar, it should be quite possible in many 
> cases to replace one with the other quite easily - especially if you're using 
> templates. That's one of the reasons that Java has the Set interface with 
> HashSet and TreeSet implenting it.

Indeed. And it is clear that Java's (and C++, but to a lesser extent)
obsession with coding only to interfaces with no concern for the
properties of the realization underneath can rapidly lead to appalling
performance of programs.  Classic example is sorting and the List
interface -- though the switch to Mergesort and modified Timsort with
hidden reflection did ameliorate most of the problems for Java 7.

> std.container is designed with the idea that containers will be named after 
> their data structures and _not_ what they're used for. Programmers can then 
> select the data structure that best serves their purposes. And I'd be worried 
> about any professional programmer who didn't know that you can use a red-black 
> tree as a set or a map.

Works for me, but...

Having done more training of middle of the road programmers recently, I
am appalled at how poor the average programmer turns out to be. In
particular, their knowledge of data structures is at a level where, were
I still in academia, they would fail and be ejected from the computer
science programme.

The level seems to be:

	array
	map == associative array == hash table
	list == sequence == singly-linked list

anything else is SEP. Sadly very much a "I just know enough to not get
sacked and pick up my pay packet" attitude. Mention red-black tree, 2-3
tree, trie, b-tree, and they glaze over.

My earlier comments we flavoured by the need for a language to have an
API that can cope with these folks as well as the folks who, like
everyone on lists such as this one, appreciate that there is always much
more to data structures than you think at first sight. Big-O, omega,
theta, etc. analysis is important and interesting but somehow there
needs to be a layer that gets away from this to provide a default, not
poor, implementation.

If I go in, I shall just rant. Hopefully I have now made the point in
this email that I should have made in the earlier email

> It's clearly a hash map. If they want a tree map, then we have RedBlackTree 
> (though we should probably provide a wrapper that makes it easier to use 
> RedBlackTree as a map, since it's a bit unwieldy to do that at the moment). I 
> don't see the problem.

I think it would be good for there to be façades to the implementing
data structure to provide a functional indication.

Perhaps the observation is that where a data structure type such as map
or set is realized in the core language there is a single
implementation. Where things are library based then you get the naming
to show function and realization. The Python experience is that more
programmers do more reasonable work with list, tuple, map, set all
having a single realization. Where more care is needed then there are
well crafted libraries to work with, not least of which is NumPy.

 

-- 
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder at ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel at winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
URL: <http://lists.puremagic.com/pipermail/digitalmars-d/attachments/20120507/863a7642/attachment.pgp>


More information about the Digitalmars-d mailing list