Sets
Bill Baxter
dnewsgroup at billbaxter.com
Mon Feb 5 17:08:51 PST 2007
Michiel wrote:
>> The lack of a set type and the use of void[KeyType] for sets has been
>> discussed a few times in the past, I think. It makes sense to me too,
>> at first glance. But I suspect it would require a whole different
>> codepath in the compiler to deal with because you can't have arrays of
>> voids etc. You really want a different data structure to store a set.
>
> I think creating a whole different data structure just for the set isn't worth the
> trouble.
Maybe not for casual use, but if for some reason you are creating huge
sets, you'll probably care whether it uses 1GB or 2GB of memory. It's
not a "whole different" data structure that's required, just a modified
version of the AA data structure that doesn't store values. But still
that makes it a different data structure that will require different
code in D.
I'm kinda just playing devil's advocate here. I was actually one of the
people who previously commented that void[KeyType] should just work.
But if it's going into the core language it better be a slam dunk.
> For now, I've created the following simple code, which I will use for now. It
> still uses dummy values, but hides this behind a template and an array function:
>
> ---------------------------------
> template Set(T) {
> alias bit[T] Set;
> }
>
> void add(T)(inout bit[T] set, T element) {
> set[element] = 1;
> }
> ---------------------------------
> Set!(char[]) set;
>
> set.add("one");
> set.add("two");
>
> assert("one" in set);
> assert("two" in set);
> assert(!("fortytwo" in set));
> ---------------------------------
I think the type 'bit' is deprecated. Anyway, it's just an alias for
'bool' (see phobos/object.d) so you might as well use bool rather than a
type that isn't even listed in the spec:
http://www.digitalmars.com/d/type.html
>
>> And as you mentioned it should have an 'add' method which would be
>> meaningless for ordinary AA's.
>
> For ordinary AA's, another parameter could be passed for the value, which could
> default to the type's init value.
True.
>
>> Following AA syntax could also lead to
>> annoying special cases for generic templates that would have to be
>> prepared for the V==void case:
>> foo(K,V)(V[K] aa) { ...V val; // whoops V could be void! }
>
> Isn't the same thing true for many situations? What if you wanted to compare a V
> object with an integer? The compiler would generate an error for any V for which
> no such comparison is defined. I would simply expect the compiler to complain in
> your example.
Yeh, despite what I said, you could also see sets having AA syntax as
being a *good* thing for templates, since it would let any template that
only cares about keys to match both sets and AA's.
But really it's kind of a bogus argument anyway. Templates should use
duck typing wherever possible. I.e. the template should be written such
that it supports any object with the necessary interface, like .keys and
.values methods or a[x] syntax.
> I'm not saying it's easy. But it should be worth it for the same reason that
> associative arrays are included into the core language.
One issue with sets as AAs is that really one typically thinks of sets
as a collection of "values" not a collection of "keys". If you make a
set type from scratch, you'd be more likely to have a "set.values"
member than a "set.keys" member.
Here's a thought - one way to deal with that would be for the
implementation of void[KeyType] to just return the key itself for any
attempts to access an existing key. So myset["key"] would return "key"
if key was in the array, or throw an exception if not. That way you
could also use ".values" on a void[KeyType] set instead of ".keys", and
handle the issue of what auto = foo in myset; should return.
It looks like a number of things in dmd/src/phobos/internal/aaA.d would
have to be rewritten with special versions for handling void. But it
doesn't look like it would be a huge amount of code. In particular the
main data structure already seems to allocate nodes as a big void* with
size of keysize+valuesize, so that should still be ok with valuesize==0.
Then you'd also need some conditionals added to the code that calls
into the aaA.d code.
--bb
More information about the Digitalmars-d
mailing list