2.011 invariant question

Sat Feb 23 08:25:31 PST 2008

On Sat, 23 Feb 2008 13:37:10 +0000, Janice Caron wrote:

> On 23/02/2008, Sönke Ludwig <ludwig at informatik_dot_uni-luebeck.de> wrote:
>> I think that if it's fixed, it has to be done transparently or it my be
>>  difficult to explain why s = s ~ "x"; is allowed but s ~= "x"; is not.
> 
> You know, the more I think about it, the more I come to the conclusion
> that you're right, and that
> 
>     x ~= y;
> 
> should behave /exactly/ like
> 
>     x = x ~ y;
> 
> and cause a reallocation every time. Incremental string building is
> the usual reason given why this isn't the case, but I think there's
> another way round that problem.
> 
> Think about what happens at runtime when x ~= y; is executed. First,
> the compiler has to decide whether or not x is resizable (that is,
> does it point to the start of a resizable block?). Well - how does it
> know that? Answer: it has to ask the garbage collector. OK, so how
> does the garbage collector know that? Well, presumably, the gc has to
> have some sort of map or table, so it can look up the address x, and
> if it's in the table, then the answer is yes. But that lookup
> operation has to be slower that O(1). No matter how you look at it,
> there has to be at least O(log N) CPU time being consumed there, where
> N is the number of gc allocated blocks.
> 
> It seems to me that a better approach would be to ditch that strategy
> altogether, let ~= have predictable behavior even when the array is
> non-unique, and use a /different/ strategy for incremental array
> building. I would suggest something as simple as:
> 
>     struct Buffer!(T)
>     {
>         private T[] array;
>         private uint reserved;
>         void opCatAssign() {...}
>         T[] toArray() {...}
>         const(T)[] toCArray() {...}
>         invariant(T)[] toIArray() {...}
>         /*etc*/
>     }
> 
> would do the job. Then instead of doing
> 
>     void buildString()
>     {
>         string r;
>         r ~= something;
>         r ~= somethingElse;
>         return r;
>     }
> 
> you would do
> 
>     void buildString()
>     {
>         Buffer!(char) r;
>         r ~= something;
>         r ~= somethingElse;
>         return r.toIArray;
>     }
> 
> toIArray() would have to call either assumeUnique or idup under the
> hood. We could argue about which was the most sensible choice. But
> either way, you saved a lot of CPU time because you no longer have to
> decide at runtime whether or not this is a resizable buffer - that's
> now something you know at compile time. And even when you do know
> that, finding the reserved size is O(1), not O(log N). It's gotta be a
> winner all round.

Is it possible for you to actually benchmark the performance of the two
approaches? (We don't know what else happens behind the scenes, so
analysis alone isn't enough)

I don't think Walter would be able to turn down your approach if you have
evidence to back it up.