[GSOC] Persistent Data Structures detailed information

Stefanos Baziotis sdi1600105 at di.uoa.gr
Sun Apr 7 18:19:48 UTC 2019


On Sunday, 7 April 2019 at 17:33:10 UTC, Jay wrote:
> Yeah, I have already put by GitHub id and the projects, which I 
> have performed. To support by algorithms and data structure 
> knowledge, I am attaching the links to my Codechef and 
> Hackerrank ids. Would that be good to go??

Hello Jay,

Since I'm also applying for the persistent data structures, I may 
be
able to offer some help. Here's my take on the matter:
I should point out first that a lot of this understanding emerged
from a response from Andrei. However, I'm still waiting for a
review in the proposal and a response for some clarification,
so take this with a grain of salt, this is only my current 
understanding.

So, the main question in implementing persistent data structures
in D is not "How do I implement persistent data structures?". I 
think
this is well-studied and more of a general matter.
Rather, the problem is more specific to D and it is
"How do these data structures handle memory?".
Somewhat quoting Andrei:
> The thing is that most people would not want these DSs to use
> the garbage collector, i.e. employ some sort of reference 
> counting.
> However, changing the reference count destroys the concept of 
> immutability.

Another way to look at it (for me) is that the DS has to allocate 
memory,
but if this memory comes from GC, then the GC might at some point
free a previous version of the DS. Now, one could reason that this
breaks only the theoretical model (i.e. we lost one version, so 
the
DS is not persistent) but not the practical since the GC will only
free unreachable code. But the problem is that if we take
a mark and sweep GC (I really don't know what techniques the D GC
employs), the GC will actually compact the reachable memory 
pages, i.e.
it will move memory around. That means that if you have a direct 
pointer
to some of this memory then access, pointer arithmetic etc. goes
out of the window. I think that this will be a problem since
from what I have seen, in D the GC never works with direct 
pointers.

All this results in that you have to allocate memory out of the 
type system. To
be honest, this is where I start to lose it... I proposed that a 
simple
strongly-typed allocator (i.e. not C Standard Library malloc 
returning void*)
might help to do these allocations but doesn't seem like a good 
solution (both to me
and I guess everyone else).

Good luck!




More information about the Digitalmars-d mailing list