std.string.chomp error

Tue Aug 10 02:34:33 PDT 2010

On Tue, 10 Aug 2010 01:48:17 -0700, Jonathan M Davis wrote:

> On Tuesday 10 August 2010 00:30:37 Lars T. Kyllingstad wrote:
>> No, using 'is' won't work.  Check this out:
>> 
>>   int[] a;
>>   assert (a == null);
>>   assert (a is null);
>> 
>>   a = new int[10];
>>   a.length = 0;
>>   assert (a == null);
>>   assert (a !is null);
>> 
>> The thing is, '==' tests whether two arrays are equal, that is, that
>> they are equally long and that their elements are equal.  Any empty
>> array is equal to null -- in fact, in this context 'null' is just a way
>> of denoting an empty array that doesn't point to any particular memory
>> block (i.e. hasn't been initialised yet).
>> 
>>   // This is what '==' does
>>   bool mimicEquals(int[] a, int[] b)
>>   {
>>       if (a.length != b.length) return false;
>>       foreach (i; 0 .. a.length)
>>           if (a[i] != b[i]) return false;
>>       return true;
>>   }
>> 
>> 'is', on the other hand, tests whether two arrays are identical, i.e.
>> that they have the same length and *refer to the same piece of memory*.
>> 
>>   // This is (sort of) what 'is' does
>>   bool mimicIs(int[] a, int[] b)
>>   {
>>      return (a.ptr == b.ptr  &&  a.length == b.length);
>>   }
>> 
>> -Lars
> 
> Actually, it looks to me like that's an argument for using 'is' rather
> than == to check for null. Since == doesn't care whether an array is
> null, it can't be used to check whether an array is null.

I guess it depends on what behaviour you're after.  In the present case, 
if you want chomp(a, null) and chomp(a, "") to do the same thing, then 
you should use '=='.  If you want chomp(a, "") to simply do nothing, use 
'is'.  I just figured that the former was the desired behaviour here.  If 
it isn't, I agree with you. :)
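
To make the two options concrete, here is a minimal sketch.  The helper
names are hypothetical and this is not the actual Phobos code; it only
shows how a function like chomp could decide that no delimiter was given,
assuming that decision comes down to comparing the delimiter with null.

```d
// Hypothetical helpers, not taken from std.string.

// Content comparison: true for null and for any empty string.
bool noDelimiterByEquals(string delimiter)
{
    return delimiter == null;
}

// Identity comparison: true only for a delimiter that really is null.
bool noDelimiterByIs(string delimiter)
{
    return delimiter is null;
}

unittest
{
    string none = null;
    string empty = "";   // zero length, but the literal has a non-null pointer

    // With '==', chomp(a, null) and chomp(a, "") would take the same path.
    assert(noDelimiterByEquals(none));
    assert(noDelimiterByEquals(empty));

    // With 'is', only null takes the "no delimiter" path.
    assert(noDelimiterByIs(none));
    assert(!noDelimiterByIs(empty));
}
```

The unittest block runs when the file is compiled with -unittest (adding
-main if there is no main function).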

> 1. As I understand it, using 'is' instead of == applies to all
> references, not just arrays and their bizarre pseudo-null state. Using
> 'is' with a class avoids calling opEquals() and does exactly what you
> want when checking whether a class reference is null.

Fun fact: 'is' actually works for any type.

  assert (1 is 1);

As I understand it, 'a is b' is true if the variables a and b contain 
the exact same bits.  If a and b are value types, this means they have 
the same value, and if they are references (including arrays), it means 
they refer to the same data (for arrays: the same memory and the same 
length).
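
A small sketch of how that plays out for a few kinds of types.  The Box 
class is made up for the illustration; the rest uses only built-in 
types.

```d
// Hypothetical class with a content-based opEquals.
class Box
{
    int value;
    this(int value) { this.value = value; }

    // Used by '==', never consulted by 'is'.
    override bool opEquals(Object other)
    {
        auto rhs = cast(Box) other;
        return rhs !is null && rhs.value == value;
    }
}

unittest
{
    // Value types: same bits means same value.
    int x = 1, y = 1;
    assert(x is y);

    // A NaN is never '==' to itself, but its bits are identical to themselves.
    double n = double.nan;
    assert(n != n);
    assert(n is n);

    // Arrays: a copy is equal but not identical; an alias is identical.
    int[] a = [1, 2, 3];
    int[] b = a.dup;
    int[] c = a;
    assert(a == b && a !is b);
    assert(a is c);

    // Class references: 'is' compares the references and skips opEquals.
    auto p = new Box(5);
    auto q = new Box(5);
    assert(p == q && p !is q);

    Box r;
    assert(r is null);
}
```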

> 2. For arrays, if you want to check whether it really is null, then you
> _must_ use 'is', because == obviously isn't going to tell you. It'll
> just lump empty arrays in with null ones. For instance, if you want to
> check that an array has never been initialized or that it has been set
> to null and never set to something else, then you need to use 'is'.
> 
> 3. On the other hand, if what you really care about is checking whether
> an array has any elements and you don't care about whether it's null or
> not, then the empty function/property would be the better way to go.
> It's quite explicit, and it's more generic, doing things the way that
> ranges are done.

I totally agree with you.  Lately, I have started using "empty" (as well 
as the other range primitives) for arrays myself.  I just disagreed that 
'is' would produce what I perceived to be the right behaviour for the 
function in question.  But that perception may well be wrong. ;)
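
For reference, a minimal sketch of the three states an array can be in 
and which check separates them.  The describe helper is hypothetical, 
and the import assumes a Phobos version that provides 
std.range.primitives.

```d
import std.range.primitives : empty;

// Hypothetical helper: names the state of an array.
string describe(const int[] a)
{
    if (a is null) return "null";             // no memory block at all
    if (a.empty)   return "empty, not null";  // zero length, non-null pointer
    return "has elements";
}

unittest
{
    int[] neverAssigned;            // default-initialised, so null
    int[] full = [1, 2, 3];
    int[] emptied = full[0 .. 0];   // zero-length slice of existing memory

    // 'empty' does not care whether the array is null...
    assert(neverAssigned.empty);
    assert(emptied.empty);
    assert(!full.empty);

    // ...while 'is null' tells the two empty cases apart.
    assert(describe(neverAssigned) == "null");
    assert(describe(emptied) == "empty, not null");
    assert(describe(full) == "has elements");
}
```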

> Personally, I think that the way that null is handled with arrays and
> associative arrays is a poor design choice (if they're null, they should
> be null until you assign to them with new rather than this whole null
> but not null nonsense), but we're stuck with it I guess.

There, I don't agree with you.  Arrays are a sort of pseudo-reference 
type, so I don't mind 'null' being a sort of pseudo-null in that 
context.  Actually, I find it to be quite elegant.  It's a matter of 
taste, I guess.

-Lars