"" gives an empty string, while "".idup gives null

Dmitry Olshansky dmitry.olsh at gmail.com
Wed Aug 3 12:01:11 PDT 2011


On 03.08.2011 22:26, simendsjo wrote:
> On 03.08.2011 19:15, Jonathan M Davis wrote:
>>> On 03.08.2011 18:18, Jonathan M Davis wrote:
>>>> On Thursday 04 August 2011 00:27:12 Mike Parker wrote:
>>>>> On 8/3/2011 11:23 PM, simendsjo wrote:
>>>>>> On 03.08.2011 15:49, bearophile wrote:
>>>>>>> simendsjo:
>>>>>>>> void main() {
>>>>>>>> assert(is(typeof("") == typeof("".idup))); // both are immutable(char)[]
>>>>>>>>
>>>>>>>> assert("" !is null);
>>>>>>>> assert("".idup !is null); // fails - s is null. Why?
>>>>>>>> }
>>>>>>>
>>>>>>> I think someone has even suggested to statically forbid "is null"
>>>>>>> on strings :-)
>>>>>>>
>>>>>>> Bye,
>>>>>>> bearophile
>>>>>>
>>>>>> How should I test for null if not with "is null"? There is a
>>>>>> difference between null and empty, and avoiding this is not
>>>>>> necessarily easy or even wanted.
>>>>>> I couldn't find anything in the specification stating this
>>>>>> difference. So... Is it a bug?
>>>>>
>>>>> This is apparently a bug. Somehow, the idup is clobbering the
>>>>> pointer. You can see it more clearly here:
>>>>>
>>>>> void main()
>>>>> {
>>>>>
>>>>> assert("".ptr);
>>>>>
>>>>> auto s = "".idup;
>>>>> assert(s.ptr); // boom!
>>>>>
>>>>> }
>>>>
>>>> I don't know if it's a bug or not. The string _was_ duped.
>>>> assert(s == "") passes. So, as far as equality goes, they're equal,
>>>> and they don't point to the same memory. Now, you'd think that the new
>>>> string would be just empty rather than null, but whether it's a bug or
>>>> not depends exactly on what dup and idup are supposed to do with
>>>> regards to null. It's probably just a side effect of how dup and idup
>>>> are implemented rather than it being planned one way or the other. I
>>>> don't know if it matters or not though. In general, I don't like the
>>>> conflation of null and empty, but in this particular case, you _do_
>>>> get a string which is equal to the original and which doesn't point to
>>>> the same memory. So, I don't know whether this should be considered a
>>>> bug or not. It depends on what dup and idup are ultimately supposed to
>>>> do.
>>>>
>>>> - Jonathan M Davis
>>>
>>> I would think it's a bug, but strings don't quite behave as regular
>>> references anyway...
>>> But why should dup/idup change the semantics of the array?
>>>
>>> void main() {
>>> // A null string or empty string works as expected
>>> string s1;
>>> assert(s1 is null);
>>> assert(s1.ptr is null);
>>> assert(s1 == ""); // We can check for empty even if it's null, and it's equal to ""
>>> assert(s1.length == 0); // ...and length even if it's null
>>> s1 = "";
>>> assert(s1 !is null);
>>> assert(s1.ptr !is null);
>>> assert(s1.length == 0);
>>> assert(s1 == "");
>>>
>>> // the same applies to null mutable arrays
>>> char[] s2;
>>> assert(s2 is null);
>>> assert(s2.ptr is null);
>>> assert(s2 == "");
>>> assert(s2.length == 0);
>>> // but with .dup/.idup things are different!
>>> s2 = "".dup;
>>> //assert(s2 !is null); // fails
>>> //assert(s2.ptr !is null); // fails
>>> assert(s2.length == 0); // but... s2 is null..?
>>> assert(s2 == "");
>>> assert(s2 == s1);
>>> }
>>
>> If you look at the spec ( http://d-programming-language.org/arrays.html ),
>> it says:
>>
>> dup: Create a dynamic array of the same size and copy the contents of
>> the array into it.
>>
>> idup: Create a dynamic array of the same size and copy the contents of
>> the array into it. The copy is typed as being immutable. D 2.0 only
>>
>>
>> This is _exactly_ what dup and idup are doing. You get a new array with
>> the exact same size and contents. null doesn't factor into it at all.
>> So, per the spec, there's no bug here at all. dup and idup promise
>> _nothing_ with regards to null.
>>
>> It may be that it would be better if dup and idup returned an array
>> which was null if the original was null, and that would also be within
>> the spec, but what dup and idup do at the moment _does_ follow the spec.
>>
>> So, feel free to file a bug report on it. Maybe it'll get changed, but
>> the current behavior follows the spec. And given how arrays don't
>> generally treat empty and null as being different, I wouldn't really
>> expect an array to stay null if you do _anything_ to it other than
>> simply pass it around or check its value. In this case, you're creating
>> a new array, and D just doesn't generally care about null vs empty when
>> it comes to arrays. I wouldn't argue that that's a good thing (because I
>> don't really think that it is), but because of that, you can't really
>> expect much to treat null and empty as being different. And in this
>> particular case, it's not only debatable as to whether it matters, but
>> the current behavior is completely within the spec.
>>
>> - Jonathan M Davis
>
> Schveighoffer also states it is as designed.
> But it really doesn't behave as one (at least I) would expect.
> So in essence (as bearophile says), "is null" should not be used on 
> arrays.
>
> I was bitten by a bug because of this, and used "" instead of "".idup 
> to avoid this, but given D doesn't distinguish between empty and null 
> arrays, this doesn't feel very safe now..
>
> In the code in question I have a lazy initialized string. The problem 
> is that I would see if it has been initialized, but an empty string is 
> also a valid value. Because I shouldn't check for null, I now have to 
> add another field to the struct to see if the array has been 
> initialized. This feels like a really suboptimal solution.

length works even for "null" arrays and returns 0. An even cleaner way is
to use std.array.empty:
import std.array;
char[] abc = null;
assert(abc.empty);
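
A slightly fuller sketch of the same point, assuming the behaviour reported earlier in this thread (a null slice has a null ptr, while the "" literal has a non-null ptr). The variable names are only for illustration:

```d
import std.array; // brings `empty` into scope for arrays

void main()
{
    char[] a = null; // null slice: ptr is null, length is 0
    string b = "";   // empty literal: ptr is non-null, length is 0

    // Length and emptiness do not care which kind of empty slice this is.
    assert(a.length == 0 && b.length == 0);
    assert(a.empty && b.empty);

    // Equality compares contents only, so the two are equal.
    assert(a == b);

    // Identity also looks at the pointer, so only here do they differ.
    assert(a is null);
    assert(b !is null);
}
```

Only the last two checks can tell the slices apart, which is why testing length or empty is the more robust habit.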

So there are no uninitialized arrays, just different versions of empty
slices.
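
For the lazy-initialization case quoted above, one possible sketch of the extra-field approach follows. The struct name LazyName, the calls counter and the compute function are hypothetical and only stand in for whatever the real code does; the point is that the flag, not the slice, records whether initialization has happened:

```d
// Hypothetical example: LazyName, calls and compute are illustrative names.
struct LazyName
{
    private string value;
    private bool loaded; // tracks initialization separately from the contents
    int calls;           // counts how often the value was actually computed

    string get()
    {
        if (!loaded)
        {
            value = compute();
            loaded = true;
        }
        return value;
    }

    private string compute()
    {
        ++calls;
        return "".idup; // may come back as a null slice; the flag does not care
    }
}

void main()
{
    LazyName n;
    assert(n.get() == ""); // first call computes the value
    assert(n.get() == ""); // second call reuses it
    assert(n.calls == 1);  // an empty result still counts as initialized
}
```

It costs one extra bool per struct, but an empty string stays a valid value and nothing depends on null vs empty.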

-- 
Dmitry Olshansky


