2.011 invariant question
Sönke Ludwig
ludwig at informatik_dot_uni-luebeck.de
Sat Feb 23 08:04:20 PST 2008
Janice Caron wrote:
> On 23/02/2008, Sönke Ludwig <ludwig at informatik_dot_uni-luebeck.de> wrote:
>> I think that if it's fixed, it has to be done transparently or it my be
>> difficult to explain why s = s ~ "x"; is allowed but s ~= "x"; is not.
>
> You know, the more I think about it, the more I come to the conclusion
> that you're right, and that
>
> x ~= y;
>
> should behave /exactly/ like
>
> x = x ~ y;
>
> and cause a reallocation every time. Incremental string building is
> the usual reason given why this isn't the case, but I think there's
> another way round that problem.
>
> Think about what happens at runtime when x ~= y; is executed. First,
> the compiler has to decide whether or not x is resizable (that is,
> does it point to the start of a resizable block?). Well - how does it
> know that? Answer: it has to ask the garbage collector. OK, so how
> does the garbage collector know that? Well, presumably, the gc has to
> have some sort of map or table, so it can look up the address x, and
> if it's in the table, then the answer is yes. But that lookup
> operation has to be slower that O(1). No matter how you look at it,
> there has to be at least O(log N) CPU time being consumed there, where
> N is the number of gc allocated blocks.
>
> It seems to me that a better approach would be to ditch that strategy
> altogether, let ~= have predictable behavior even when the array is
> non-unique, and use a /different/ strategy for incremental array
> building. I would suggest something as simple as:
>
> struct Buffer!(T)
> {
> private T[] array;
> private uint reserved;
> void opCatAssign() {...}
> T[] toArray() {...}
> const(T)[] toCArray() {...}
> invariant(T)[] toIArray() {...}
> /*etc*/
> }
>
> would do the job. Then instead of doing
>
> void buildString()
> {
> string r;
> r ~= something;
> r ~= somethingElse;
> return r;
> }
>
> you would do
>
> void buildString()
> {
> Buffer!(char) r;
> r ~= something;
> r ~= somethingElse;
> return r.toIArray;
> }
>
> toIArray() would have to call either assumeUnique or idup under the
> hood. We could argue about which was the most sensible choice. But
> either way, you saved a lot of CPU time because you no longer have to
> decide at runtime whether or not this is a resizable buffer - that's
> now something you know at compile time. And even when you do know
> that, finding the reserved size is O(1), not O(log N). It's gotta be a
> winner all round.
>
Sounds like a good approach. The rare cases where the in-place resizing
behavior of the build-in arrays is needed for performance could very
well be handled with a standard Buffer/vector/whatever. To me, it feels a bit
strange to use such an optimization without an explicit reserve()/capacity()
interface anyway... and the lookup performance argument is sound, too.
More information about the Digitalmars-d
mailing list