Call for arms: associative array reimplementation

H. S. Teoh hsteoh at quickfur.ath.cx
Tue Feb 26 11:05:07 PST 2013


On Thu, Feb 21, 2013 at 09:33:53PM +0400, Dmitry Olshansky wrote:
> 21-Feb-2013 14:49, Dicebot пишет:
> >I have had a small e-mail conversation with H. S. Teoh recently
> >regarding his old attempt to reimplement built-in associative arrays
> >and fix various issues they create. His experience and memories are
> >written down in wiki now:
> >
> >http://wiki.dlang.org/AA_Implementation_Issues
> >
> >I still do want to continue this task, but this happened to be much
> >more tricky and I just been walking circles around so far. So if
> >anyone has some spare time and wants to show his D Kung Fu - this
> >wiki article is probably worth checking out :) As far as I know this
> >is kind of "pre-approved" change.
> 
> I'd love to help but I've got my hands full already.
> 
> In essence the only way I see this ball rolling:
> 
> a) make a hacked version of compiler that actually **cking gets rid
> of V[K] by replacing it with __AA!(V, K) the moment it sees it.
> Literals are structs of type __AALiteral!(V,K), both could be
> defined anywhere but are expected in object.d.
> 
> Yes, that means error messages, is(...) matching and so on all see
> and expose this template. NO reverse conversion or special casing to
> have it as built-in type must be done anywhere, period. Technically
> advanced users should be able to replace built-in __AA template with
> something custom.

Hmm that's a good idea. Hack the compiler to kill off all instances of
the old code first, then work on it until things start to work again.
Better than trying to sort out the current mess, which is not only split
across object.d, aaA.d, and dmd internals, but within dmd is also
sprinkled everywhere like powder. Better to rip the whole thing out and
start over.

The thing that would be needed to pull this off, though, is to have a
large-enough corpus of D code that uses AA's, hopefully in non-trivial
ways, so that we can cover all the use cases (i.e. ferret out hidden
compiler paths that produce references to the old code). The same corpus
can then be used to test the new implementation by failing to compile or
work until everything is done. We can incrementally work on the new
stuff, say reduce compile errors one by one, until everything works
again.


[...]
> The tricky part about immutable values isn't tricky at all. At the
> moment you just blast your way through with casts. Since AA
> implementation is in sole control of memory and it's definitely on
> HEAP there is no risk of writing to ROM/protected stuff. It's more
> an act of courage then trickery ;)

The problem is that if the key is an array, it's *not* in the sole
control of the AA implementation. User code may have outside slices that
reference the same memory. That's the problem.

I just got an idea about how to solve immutable key problem. We can
require that POD AA keys must be immutable, but user-defined
structs/classes are allowed to be non-immutable. The rationale is that
POD's hash values are computed over their entire value, so we know for
sure that if their value changes, the hash will change, so this is not
allowed. But user types can define their own toHash that only depends on
a subset of the fields, and we need to allow that, so in that case the
onus is on the user to not change things externally. D is a systems
programming language, so we shouldn't need to be babysitting the users.
If they can be trusted to write a sane toHash function (because the AA
will malfunction badly if toHash misbehaves), then they can also be
trusted not to shoot themselves in the foot by externally changing an
object being used as an AA key.


> P.S. The new AA implementation IMHO would fix as much as it would
> break simply because the current one is half-magic, half-bug. That's
> one of reasons it should be done by reaping it off, then doing it from
> scratch in the clean room at the side and then re-introduced it again.
[...]

I like this approach. I'd go ahead with it, except that so far I haven't
touched the DMD code yet, and I'm unfamiliar with how it works. It would
be nice if somebody familiar with DMD internals can chip in too.


T

-- 
The peace of mind---from knowing that viruses which exploit Microsoft
system vulnerabilities cannot touch Linux---is priceless. -- Frustrated
system administrator.


More information about the Digitalmars-d mailing list