D array expansion and non-deterministic re-allocation

Steven Schveighoffer schveiguy at yahoo.com
Tue Nov 24 19:16:36 PST 2009


On Tue, 24 Nov 2009 17:35:43 -0500, Bartosz Milewski  
<bartosz-nospam at relisoft.com> wrote:

> Steven Schveighoffer Wrote:
>
>> On Tue, 24 Nov 2009 16:52:52 -0500, Bartosz Milewski
>> <bartosz-nospam at relisoft.com> wrote:
>>
>> > Steven Schveighoffer Wrote:
>> >> > Imagine you're reviewing this code written by a relative newbee:
>> >> >
>> >> > char[] quote(char[] word) {
>> >> >   word.length += 2;
>> >> >   word[length-1] = '"';
>> >> >   for(int i = word.length-2; i > 0; --i)
>> >> >     word[i] = word[i - 1];
>> >> >   word[0] = '"';
>> >> >   return word;
>> >> > }
>> >> >
>> >> > void storeFoos(char[] buffer, string pattern, Container c)
>> >> > {
>> >> > 	char[] res = find(buffer, pattern);
>> >> > 	c.store(quote(res[0..pattern.len]);
>> >> > }
>> >> >
>> >> > Is this buggy? If so, how and when will the bug manifest itself?
>> >>
>> >> In my mind it is not buggy.  If quote or storeFoos is not meant to be
>> >> able
>> >> to molest the data, it should require a const(char)[].  If this was  
>> D1
>> >> code, I'd say it depends on the purpose of storeFoos, if storeFoos is
>> >> expecting to own the input buffer, it's not buggy.  If it's not, then
>> >> storeFoos is buggy (in all cases).  But we aren't talking about D1.
>> > Yes, a lot depends on what's expected of storeFoos. Suppose that the
>> > surrounding code expects the buffer not to be modified. The newbee
>> > didn't want to idup the buffer because in most buffers the pattern is
>> > not found.
>>
>> Then he forgot the const decoration for the parameter.  No duping is
>> necessary:
>>
>> void storeFoos(const(char)[] buffer, string pattern, Container c)
>> {
>>     ... // will result in compiler errors for given quote function, fix
>> that.
>> }
>
> Well, he tried that, but then his quote function wouldn't work and he  
> needs it in other places as well. You know how they are ;-). Besides he  
> thinks "const" is for wimps. He won't stop arguing until you prove to  
> him that there is an actual bug in his code.

I think this warrants more code reviews of this particular developer's  
code :)

Bottom line: if a function isn't supposed to change the buffer, the  
signature should be const for that parameter.  It's one of the principles  
of const, and why it's in D2 in the first place.  I'd explain to the coder  
that he is wrong to expect that modifying a buffer in a function that  
isn't supposed to modify a buffer is acceptable (and hopefully he gets it,  
or else I don't have time to deal with people who insist on being right  
when they are not).

BTW, in my experience, the newbie expectaction of ~= is usually that it  
modifies the original even when it doesn't, not the other way around.

>
>> >
>> >> > Or another story. An enthusiastic programmer comes to you with a  
>> very
>> >> > clever rewrite of a function. This is the original:
>> >> >
>> >> > T[] duplicate(T)(T[] src) {
>> >> >    static if (is(T == invariant))
>> >> >       return src.idup;
>> >> >    else
>> >> >       return src.dup;
>> >> > }

>> >> >
>> >> > This is his improved version:
>> >> >
>> >> > T[] duplicate(T)(T[] src) {
>> >> >    T tmp = src[$-1];
>> >> >    src.length -= 1;
>> >> >    src ~= tmp;
>> >> >    return src;
>> >> > }
>> >> >
>> >> > No need for static if! The extension is bound to re-allocate  
>> because
>> >> of
>> >> > stomping rules. Will this always work, or is there a gotcha?
>> >>
>> >> He forgot to check for length 0.
>> >>
>> >
>> > That he did. So now he goes back to the drawing board and adds an  
>> early
>> > return for length 0. Is his code correct now?
>>
>> Yes.
>>
>> I have a feeling you have some trick up your sleeve that you think makes
>> it incorrect :)
>
> Well, I was wondering if it could be rewritten as:
> T[] duplicate(T)(T[] src) {
>     assert(src.length != 0);
>     T tmp = src[$-1];
>     src.length -= 1;
>     src.length += 1;
>     src[$-1] = tmp;
>     return src;
> }
> and whether it's okay for the optimizer to do some simple arithmetic  
> optimization.

First, this code isn't equivalent to the original, because it doesn't work  
on const and immutable arrays.
Second, the optimizer cannot do simple math optimization on src.length -=  
1 and src.length += 1 because length is a property, not a field.  Setting  
the length calls a function, which cannot be optimized out.

I also thought of some problems with the original code with static if's.   
Assuming that version, why must this code fail?:

class C {}
const(C)[] arr;
const(C)[] arr2 = arr.duplicate();

Oops, just tried it and no problems!  I'll file a bug if it still happens  
on the latest compiler.

-Steve



More information about the Digitalmars-d mailing list