AA key conversion woes

Steven Schveighoffer schveiguy at yahoo.com
Wed Apr 18 04:36:50 PDT 2012


On Tue, 17 Apr 2012 14:41:42 -0400, H. S. Teoh <hsteoh at quickfur.ath.cx>  
wrote:

> I'm having some major roadblocks with the AA implementation w.r.t. how
> to store/convert AA key types correctly. I've been working on this for a
> while now but still cannot find a satisfactory solution; hopefully y'all
> can help.
>
> First, in order to take advantage of the compiler's auto-conversion
> magic (such as using literal [1,2,3] for ubyte[] keys, etc.) and to
> avoid template bloat, we must take const(Key) as argument for all key
> lookup methods. So Key must at least be stored as const. But since we
> have to do this already, might as well do it right: Keys should be
> stored as immutable.

This is probably the best solution.  I will note that dcollections does  
not require immutable keys for maps, and likely would need work to be able  
to do immutable keys anyway.

The one part where I think immutable keys can be hindering/confusing is  
for classes.  A coder may write something like:

class Foo
{
    private int _x;
    @property int x() { return _x;}
}

In effect, Foo is immutable, since there is no valid way to change Foo.   
But having Foo not be actually marked as immutable is nice in that you  
don't have to deal with the lack of tail-immutable for classes.

In such a case, effectively, int[Foo] is really never going to be a  
problem.  But int[immutable(Foo)] will not allow you to access the x  
property.

Yes, there are better ways to mark Foo, but it still poses an unnecessary  
restriction.

> 1) For value types like int, there's no problem: immutable(int)
> interconverts freely with int via assignment, so we can store int keys
> however we like, and we can always hand back unqualified ints to the
> user. So if the user declares int[int], .keys will give int[], and if
> the user declares const(int)[int], then .keys will give back
> const(int)[]. No problem.

Sounds good (think you meant int[const(int)])

> 2) Where things get hairy is when non-trivial keys are stored. Take
> arrays for example. If the user declares int[int[]], then we need an
> .idup so that we can store the key as immutable(int)[] (or should that
> be immutable(int[])?). But now if the user passes in an int[] key, we
> will need to .idup it in order to store it. This is acceptable, but what
> should .keys return?  If I were the user, I'd expect int[][], but since
> we can't implicitly convert immutable(int)[] to int[], we need a .dup
> for each key.  Which introduces unnecessary overhead, since for the most
> part the user doesn't need to change the keys anyway. But handing back
> immutable(int)[][] from .keys will break a lot of existing code.

My first reaction is, just don't allow it.  If you are going to store  
immutable keys, require immutable keys.

BTW, immutable(int[]) is likely going to be required to be stored as a  
key, since this can also be a key:

struct S
{
    int[]
}

and there's no way to turn that into immutable(int)[].  So you will have  
to deal with this situation, might as well do it now.

> 3) With arrays, things are still not too bad because we have .dup and
> .idup. But what about structs or classes? We do not have .idup if the
> key type is specified as non-immutable, so how should the keys be
> stored? (Besides, do we *want* to clone objects used as AA keys in the
> first place?) And what type should .keys return?

.dup and .idup should not be used lightly.  Specifically, setting a hash  
value should be an amortized O(1) operation.  Using idup will make it  
O(unbounded) :)

> 4) What should be done if the key type is shared or inout? (I'm tempted
> to say outright prohibit shared, but people may not like their code
> breaking if they're actually using that in existing code.)

I think with the current implementation, inout is disallowed on a member  
field.  And for good reason.  inout only really makes sense inside an  
inout function.

> I'm tempted to propose that the language should be changed so that AA
> keys are *always* immutable implicitly. That is, writing int[int] is
> exactly the same as writing int[immutable(int)], and .keys will always
> return immutable. This is the "correct" approach IMO, and existing code
> should be fixed if this breaks them. This will simplify a lot of the
> mess above (we can still support .idup for when the user wants to lookup
> mutable keys, etc., but there will be no concessions for .keys to hand
> back mutable keys -- .dup them yourself).

I think it already does this.  But I don't like it.  I think it's better  
to just require immutable and be done with it.

People will say we can't do that because int[int[]] currently compiles,  
but somehow it was ok to make int[int[]] silently turn into  
int[immutable(int[])], which broke quite a bit of code.  I see this as a  
similar scenario.

> But I'm expecting some negative reactions to this hardline approach. :-)
>
> But even then, I'm considering if .keys should return a mutable array if
> the key is a value type (e.g., it's harmless for int[int].keys to return
> int[] because the int's are a copy anyway, and this way user code won't
> be unnecessarily straitjacketed to propagate immutable throughout).
>
> I'd like to hear how people think this should work, so that the new AA
> implementation will be more acceptable.
>
>
> T


More information about the Digitalmars-d mailing list