constness for arrays

Thu Jul 20 04:40:36 PDT 2006

Andrew Fedoniouk wrote:
> "Reiner Pope" <reiner.pope at gmail.com> wrote in message 
> news:e9kunq$qli$1 at digitaldaemon.com...
>> You get the speed gains from avoiding all unnecessary duplications, a feat 
>> which simple (a la C++) static const-checking can't achieve. Imagine that 
>> we had a static const-checking system in D:
>>
>> const char[] tolower(const char[] input)
>> // the input must be const, because we agree with CoW, so we won't
>> // change it
>> // Because of below, we also declare the output of the function const
>> {
>>   // do some stuff
>>   if ( a write is necessary )
>>   { // copy it into another variable, since we can't change input
>>     // (it's const)
>>   }
>>   return something;
>> // This something could possibly be input, so it also needs to be
>> // declared const. So we go back and make the return value of the
>> // function also a const.
>> }
>>
>> // Now, since the return value is const, we *must* dup it whenever we call 
>> it. This is *very* inefficient if we own the string, because we get two 
>> unnecessary dups. This is a big price to pay just to keep static 
>> const-checking.
>>
>>
>>> c) can be implemented now by defining:
>>> struct vector
>>> {
>>>     bool readonly;
>>>     T*  data;
>>>     uint length;
>>> }
>> Yes and no. It can be implemented like that because that would effectively 
>> copy exactly what an array does already, but a) it takes up more memory 
>> than what xs0 proposed, and b) it isn't supported natively by the 
>> language's arrays, so it is less likely to be used.
>>
> 
> proposed readonly solves one particular pretty narrow case of COW
> (only for arrays and only in functions aware of this flag)
You have to be aware of CoW if you are writing a CoW function. It's like 
saying that the opIndexAssign property of arrays is limited because it 
can only be used by the functions that know about it. I see this 
proposal as an alternative to C++-style const, and with regard to 
functions being aware of the feature, xs0's solution is better because 
it avoids const propagation throughout the code.
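
To make the runtime-flag idea concrete, here is a rough sketch of how I picture it, written in C++ only so that it sits next to the C++ examples in this thread. Every name in it (CowStr, make_writable, to_upper_cow, to_lower_cow, dup_count) is made up for illustration; it is not xs0's actual proposal, and the shared_ptr is just a stand-in for a garbage-collected array.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical type: a character buffer plus a runtime "readonly" flag.
// std::shared_ptr stands in for a garbage-collected array reference.
struct CowStr {
    std::shared_ptr<std::string> data;
    bool readonly;  // true: whoever holds this must copy before writing
};

static int dup_count = 0;  // counts duplications so the cost is visible

// Make s safe to write to, copying the buffer only when the flag demands it.
void make_writable(CowStr& s) {
    if (!s.readonly) return;  // already ours: no copy
    ++dup_count;
    s.data = std::make_shared<std::string>(*s.data);
    s.readonly = false;
}

CowStr to_upper_cow(CowStr s) {
    for (std::size_t i = 0; i < s.data->size(); ++i) {
        unsigned char c = static_cast<unsigned char>((*s.data)[i]);
        if (std::islower(c)) {
            make_writable(s);  // copies at most once per call
            (*s.data)[i] = static_cast<char>(std::toupper(c));
        }
    }
    return s;  // may be the input itself, flag and all
}

CowStr to_lower_cow(CowStr s) {
    for (std::size_t i = 0; i < s.data->size(); ++i) {
        unsigned char c = static_cast<unsigned char>((*s.data)[i]);
        if (std::isupper(c)) {
            make_writable(s);
            (*s.data)[i] = static_cast<char>(std::tolower(c));
        }
    }
    return s;
}
```

In this sketch, to_lower_cow(to_upper_cow(x)) copies once if x is flagged readonly and not at all if x is already ours, and no const qualifier has to be threaded through the callers.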

> 
> C++ has better and more universal mechanism for this.
> 
> inline string &
>     string::operator= ( const string &s )
>   {
>     release_data();
>     set_data ( s.data );
>     return *this;
>   }
> 
This appears to be copying the contents of s into this. In D terms, this 
is a duplication, which is exactly the runtime cost we are trying to avoid.

> inline string & string::operator += ( const string &s )
>   {
>     mutate(*this);
>     resize( length() + s.length() );
>     .....
>     return *this;
>   }
> 

The other point to make is that this seems to be a library feature 
rather than a C++ language feature. I'm probably not understanding your 
examples, but can you, say, provide C++ code that matches the 
functionality of the following D code while avoiding unnecessary 
duplications _and having const safety_:

char[] foo = "foo";
foo = tolower(toupper(foo));

I don't see how you can manage that with static const-checking. Please 
explain, and maybe then I can understand how the C++ solution is 'better'.
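
To spell out what worries me, this is the straightforward version I have in mind when I say static const-checking forces copies. It is only a sketch under my own assumptions: to_upper_const, to_lower_const and copy_count are made-up names, and there may well be a smarter C++ idiom that I am missing, which is exactly what I am asking about.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

static int copy_count = 0;  // counts duplications so the cost is visible

// The input is const, so the result has to be built in a fresh copy,
// even when the caller owns the string and would allow in-place writes.
std::string to_upper_const(const std::string& in) {
    std::string out(in);
    ++copy_count;
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(out[i])));
    return out;
}

std::string to_lower_const(const std::string& in) {
    std::string out(in);
    ++copy_count;
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(out[i])));
    return out;
}
```

Written this way, foo = to_lower_const(to_upper_const(foo)) makes two explicit copies even though foo is ours to modify.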

> I believe that COW arrays (strings in particular), if they are needed,
> cannot be made without operator= in structures in D.
This seems to be a tangential issue. xs0's solution appears to work, and 
you haven't outlined a technical reason for it not working. If Walter 
integrates it into D, then that isn't going to cause any problems.

> Reference counting cannot be made in D with the same elegance as in C++.
I don't see why it can't, but ignoring that, I also don't see why we 
need ref-counting for CoW strings. Doesn't mark-and-sweep manage it better?

> But in pure GC world COW strings are not used.
> Strings in Java, C#, JavaScript, etc. are immutable character ranges -
But Java, C# and JavaScript are not fast languages, so the importance of 
fast string processing is largely diminished. However, D's string 
processing capabilities are good, and since it is possible to keep them, 
why shouldn't we?
> string as a type simply has no such things as str[i] = 'o';
> There are strong reasons for that.
I'm not aware of them. In my experience with Java and C#, it is extremely 
cumbersome to process strings, with all the calls to foo.substring(0, 2); 
and so on. The other downside is that the processing is *slow* *as*.

Cheers,

Reiner