hashed array?
Dmitry Olshansky
dmitry.olsh at gmail.com
Mon Nov 12 09:36:18 PST 2012
11/12/2012 8:52 PM, Joseph Rushton Wakeling пишет:
> On 11/12/2012 05:27 PM, Dmitry Olshansky wrote:
>>> The current "workaround" is to just dummy it.
>>> alias bool Dummy;
>>> Dummy[string] names;
>>> names["Paul"] = true; //Ew.
>>
>> Make a wrapper?
>
> The problem with wrapped versions of associative arrays is that they
> just don't scale with number of keys. If I want to store (say) a set of
> 3 million size_t's, then it costs a lot more than 3_000_000 *
> size_t.sizeof() to do so this way.
>
> Do dedicated set implementations do better than this?
Yes, sure they can. Starting with e.g. a bitset.
If the distribution is sparce then inversion lists can do wonders. They
work especially well if your distribution has contiguous intervals,
searching would be logN though.
I use one for unicode character sets and it saves a LOT of space.
Otherwise if it's sparce but doesn't have contiguous interval it seems
entirely doable to combine the two in a universal datastructure for
integral sets. Say a sorted array of 32/64-element bitsets (depending on
architecture):
//denotes a small bit set covering interval of [start..start+32/64)
struct node{
size_t start;
size_t bits;
};
Should be quite compact & fast.
>
> I should say that my own interests in an implementation of set are
> twofold -- first, efficient storage; and second, being able to do an
> efficient foreach across the stored values, without any concern for order.
>
>> Or rather call them sets and have them in the library.
>
> The whole builtin-vs.-library thing seems a big distraction -- if
> something needs to be implemented, the library seems the natural place
> unless there's a very good reason otherwise.
--
Dmitry Olshansky
More information about the Digitalmars-d
mailing list