AA key conversion woes
Michael Rynn
michaelrynn at optusnet.com.au
Thu Apr 19 01:46:14 PDT 2012
On Wed, 18 Apr 2012 15:06:11 +0200, SomeDude wrote:
> On Tuesday, 17 April 2012 at 18:51:07 UTC, H. S. Teoh wrote:
>> On Tue, Apr 17, 2012 at 08:44:42PM +0200, Andrej Mitrovic wrote:
>>> On 4/17/12, H. S. Teoh <hsteoh at quickfur.ath.cx> wrote:
>>> > But even then, I'm considering if .keys should return a mutable
>>> > array if the key is a value type
>>>
>>> How about having .keys for mutable and .ikeys for immutable returns?
>>
>> That's one possibility. Then .keys can come with the caveat that it may
>> involve hidden performance costs in the form of internal .dup's. But
>> I'm not sure this is a good idea, because there are too many
>> string-keyed AA's in existing code that will now suddenly become very
>> slow because they use .keys instead of .ikeys, and the strings are
>> getting .dup'd all over.
>>
>>
>> T
>
> Hidden duplications are not acceptable in my opinion.
First of all, distinguish between back-end implementation of AA, and
template interface(s), which may or may not, add restrictions and "extra"
information and functions, to please one's personal idealogical D purity
and immutability.
Backend rt.aaA, currently is generic pointer / TypeInfo based, which is
good for performance and reducing code bloat. Current version does not
handle D types as well as it could. Actual BB struct, only holds TypeInfo
reference for key type, which has to be initialised on adding first key.
By generally returning a pointer to node value slot, aaA allows the
compiler to supply the missing value TypeInfo work, which works for a
good part of the functionality (but not on key removal/node deletion, for
instance).
aaA does not handle postblit for keys correctly.
Structs with postblit and/or complex destructors are not properly done
either.
This can be seen by using a reference counted struct. Normal D code
handles the postblit, destructors properly, (meaning I haven't tested
really enough code). Entanble them in AA as key or value, and counting
will not work as expected. I know I'm eccentric, so please ignore the
question of why reference counting. Immutable keys would ban postblit
and destructors for struct.
However it is possible to have aaA implementation, with only a small cost,
that handles struct postblit and destructors correctly, and refcounting
(as an example, please), comes out correctly.
I have working code in folder rt, viewable at
http://bazaar.launchpad.net/~michael-rynn-500/d2-xml/d2-xml-dev/files/
head:/rt/
There is an IAA, defining a low level interface (really, real interface).
Maybe dmd could learn to call interface directly instead of indirect C
functions in aaA. However the overhead is very slight.
Not possible to have aaA integrated with this without using template, as
dmd might not supply initilization TypeInfo for value type. Integration
with current aaA C functions almost worked, but not quite, after I found
I needed to initialise value TypeInfo.
Need to declare with the aaI.AssociativeArray template.
The template below will support non-immutable keys for lookups for string
types, but not for replace or insert of keys. This is why inX method is
const void* key, believe it or not, so it is easily persuaded to take
string or char[]. delX should be const as well, may have to fix this.
In order to demonstrate IAA is more or less complete and working
interface for an aaA, I have made 2 different implementations, which use
IAA
http://bazaar.launchpad.net/~michael-rynn-500/d2-xml/d2-xml-dev/view/
head:/rt/aaSLink.d, a direct rip off, with above mentioned few warts
erased, of the current rt.aaA implementation.
and
http://bazaar.launchpad.net/~michael-rynn-500/d2-xml/d2-xml-dev/view/
head:/rt/aaArrayMap.d
which uses entirely arrays, (even a double linked list of integer indexes
is an array).
I'm telling another lie here, it manages a partial implementation of
untyped arrays, rt.unarray, which uses TypeInfo to manage memory content
for keys and values, including postblit annoyances, to
implement the genericity implied by IAA.
Its a little bit slower, which shows a bit with over a million keys, but
acceptable. arrayMap maintains insertion order, as long as no key
deletions followed by insertions.
Now that there is 2 , or maybe more potential implementations, need a
factory method, to create a default, if no deliberate initialisation.
Thats currently done in rt.aadefault,
which can use a version flag, to initialise the global factory.
alias IAA function(TypeInfo keyti,TypeInfo valueti) createDefaultAA;
shared createDefaultAA gAAFactory;
interface IKeyRange {
void popFront();
AAKeyRef front();
bool empty();
}
interface IValueRange {
void popFront();
AAValueRef front();
bool empty();
}
interface IAA {
/// equals checks same implementation.
bool equals(IAA other, TypeInfo_AssociativeArray rawTi);
/// setup types. Won't work
/// void init(TypeInfo_AssociativeArray aati);
/// setup types
void init(TypeInfo kti,TypeInfo vti);
/// implement initialisation from a literal va_list. Won't work.
///void init(TypeInfo_AssociativeArray ti, size_t length, ...);
/// implement initialisation from a literal. Won't work.
/// void init(TypeInfo_AssociativeArray ti, void[] keys, void[]
values);
/// return number of stored key-value pairs as a
property
size_t length();
/// append all the values from array index pairs, matching key
and value types.
/// return increase in size
void append(void[] keys, void[] values);
/// append or replace single key value pair. return 0 if
successful (for apply2 delegate)
//int append(void* key, void* value);
dg2_t getAppendDg(); // return a int append(void* key, void*
value);
/// reference counting , or not
void release();
void addref();
/// rehash. Return IAA
IAA rehash();
/// Range interface for keys
IKeyRange byKey();
/// Range interface for values
IValueRange byValue();
/// remove key value pair
bool delX(void* pkey);
/// get, and return if location was created or was existing
void* getX(void* pkey, out bool isNew);
/// Return existing or null
void* inX(const void* pkey);
/// return unique copy of array of values
ArrayRet_t values();
/// return unique copy of array of keys
ArrayRet_t keys();
/** At least length 2, 0 = length of hash map. 1 = count of nodes
which are empty.
2 = count of nodes with chain length 1, and so on, to length of
array.
*/
uint[] statistics();
/// return new instance with copy of each key-value pair
//IAA dup();
/// Delegate for each key
int apply(dg_t dg);
/// Delegate for each key and value
int apply2(dg2_t dg);
/// Typeinfo for key.
TypeInfo keyType();
/// Typeinfo for value.
TypeInfo valueType();
/// remove all pairs
void clear();
/// Get new interface with same configuration, but empty
IAA emptyClone();
}
// how D will not see it
struct AA
{
IAA a;
}
alias IAA function(TypeInfo keyti,TypeInfo valueti) createDefaultAA;
shared createDefaultAA gAAFactory;
struct AssociativeArray(Key, Value)
{
private:
IAA p;
public:
alias AssociativeArray!(Key,Value) MyAAType;
/// Support non-immutable lookups for immutable string types.
/// TODO: make this more concise and broader?
static if (isSomeString!Key)
{
static if (is(Key==string))
{
alias const(char)[] CKey;
}
else static if (is(Key==wstring))
{
alias const(wchar)[] CKey;
}
else static if (is(Key==dstring))
{
alias const(dchar)[] CKey;
}
}
else {
alias Key CKey;
}
@property size_t length() { return (p !is null) ? p.length : 0; }
void init(createDefaultAA factory)
{
if (p)
{
p.release();
p = null;
}
p = factory(typeid(Key), typeid(Value));
}
/// Post-Blits
this(this)
{
if (p)
{
p.addref();
}
}
~this()
{
if (p)
{
p.release();
p = null;
}
}
MyAAType rehash() @property
{
p = p.rehash();
return MyAAType(p);
}
Value[] values() @property
{
return (p !is null) ? *cast(Value[]*) p.values() : null;
}
Key[] keys() @property
{
return (p !is null) ? *cast(Key[]*) p.keys() : null;
}
int opApply(scope int delegate(ref Key, ref Value) dg)
{
return p !is null ? p.apply2(cast(_dg2_t)dg) : 0;
}
int opApply(scope int delegate(ref Value) dg)
{
return p !is null ? p.apply(cast(_dg_t)dg) : 0;
}
void opAssign(Value[Key] op)
{
if (p is null)
init(gAAFactory);
auto dg = p.getAppendDg();
foreach(k,v ; op)
{
dg(&k,&v);
}
}
void opCatAssign(Value[Key] op)
{
if (p is null)
init(gAAFactory);
auto dg = p.getAppendDg();
foreach(k,v ; op)
{
dg(&k,&v);
}
}
void opCatAssign(MyAAType appendMe)
{
auto q = appendMe.p;
if (q is null)
// nothing to do
return;
if (p is null)
p = q.emptyClone();
q.apply2(p.getAppendDg());
}
Value opIndex(CKey key)
{
auto pvalue = (p !is null) ? p.inX(&key) : null;
if (pvalue is null)
throw AAError.error(AAError.OPINDEX_FAIL);
return *(cast(Value*) pvalue);
}
Value opIndex(CKey key, lazy Value defaultValue)
{
auto pvalue = (p !is null) ? p.inX(&key) : null;
return p ? *(cast(Value*) pvalue) : defaultValue;
}
Value get(CKey key, lazy Value defaultValue)
{
auto pval = (p !is null) ? p.inX(&key) : null;
return pval ? *(cast(Value*) pval) : defaultValue;
}
Value* opIn_r(CKey key)
{
return ( p !is null) ? cast(Value*) p.inX(&key) : null;
}
// passed value is stored if it didn't exist, returned if it did.
bool getOrPut(Key key, ref Value val)
{
if (p is null)
init(gAAFactory);
bool existed = void;
auto spot = cast(Value*) p.getX(&key, existed);
if (!existed)
*spot = val;
else
val = *spot;
return existed;
}
void put(Key key, Value val)
{
if (p is null)
init(gAAFactory);
bool existed = void;
auto spot = cast(Value*) p.getX(&key, existed);
*spot = val;
}
uint[] chainLengths()
{
if (p is null)
return [0,0];
return p.statistics();
}
void opIndexAssign(Value val, Key key)
{
if (p is null)
init(gAAFactory);
bool existed = void;
auto spot = cast(Value*) p.getX(&key, existed);
*spot = val;
}
MyAAType dup()
{
MyAAType result;
if (p !is null)
{
auto q = p.emptyClone();
p.apply2(q.getAppendDg());
result.p = q;
}
return result;
}
void clear()
{
if (p !is null)
{
p.clear();
}
}
bool remove(CKey key)
{
return (p !is null) ? p.delX(&key) : false;
}
@property auto byKey()
{
return (p !is null) ? p.byKey() : null;
}
@property auto byValue()
{
return (p !is null) ? p.byValue() : null;
}
}
More information about the Digitalmars-d
mailing list