[OT] Best algorithm for extremely large hashtable?
qznc
qznc at web.de
Fri Nov 15 12:16:11 PST 2013
On Friday, 15 November 2013 at 19:21:35 UTC, H. S. Teoh wrote:
> On Fri, Nov 15, 2013 at 10:52:15PM +0400, Dmitry Olshansky
> wrote:
>> 15-Nov-2013 22:41, H. S. Teoh пишет:
>> >This isn't directly related to D (though the code will be in
>> >D), and
>> >I thought this would be a good place to ask.
>> >
>> >I'm trying to implement an algorithm that traverses a very
>> >large
>> >graph, and I need some kind of data structure to keep track
>> >of which
>> >nodes have been visited, that (1) allows reasonably fast
>> >lookups
>> >(preferably O(1)),
>>
>> Store as a bit-flag in whatever structure that serves as node.
>> It's
>> going to be the fastest anyway and you indeed get to use only
>> 1 bit
>> per node (and given padding and whatnot you may already have a
>> couple
>> of bytes to spare per node).
> [...]
>
> There are too many nodes to keep in memory, actually. The
> algorithm also
> doesn't need to store the nodes; it can easily generate them
> on-the-fly
> when it needs them. The challenge is when it needs to know
> whether a
> particular generated node has been visited before. I'd like to
> be able
> to have a fast lookup (hopefully O(1)) that tells me whether
> node X has
> been seen before. The problem is, there may have been a very
> large
> number of previously-seen nodes, so I need some way of storing
> this
> information in a space-efficient way.
>
> I'm hoping for a kind of compressed representation where
> multiple
> adjacent nodes can be represented in a very compact way, by
> somehow
> taking advantage of the fact that they lie on an n-dimensional
> grid and
> the fact that graph edges are always a subset of grid edges (as
> opposed
> to edges between two arbitrary nodes).
A Trie might be an idea, especially the bitwise variant.
http://en.wikipedia.org/wiki/Trie
More information about the Digitalmars-d
mailing list