To Walter, about char[] initialization by FF

Serg Kovrov user at domain.invalid
Mon Jul 31 04:35:51 PDT 2006


* Oskar Linde:
> Serg Kovrov wrote:
>> * Oskar Linde:
>>> Having char[].length return something other than the actual number
>>> of char-units would break its array semantics.
>>
>> Yes, I see. That's why I don't much like char[] as a substitute for a
>> string type.
>>
>>> It is actually not very often that you need to count the number
>>> of characters as opposed to the number of (UTF-8) code units.
>>
>> Why not use separate properties for that?
>>
>>> Counting the number of characters is also a rather expensive
>>> operation. 
>>
>> Indeed. Storing it once as a property (and updating it as needed) is
>> better than calculating it each time you need it.
> 
> The question is, how often do you need it? Especially if you are not 
> indexing by character.
> 
>>> All the ordinary operations (searching, slicing, concatenation,
>>> substring search, etc.) operate on code units rather than
>>> characters.
>>
>> Yes, that's a tough one. If you want to slice an array, use the array
>> unit count for that. But if you want to slice a *string* (substring,
>> search, etc.), use the character count for that.
> 
> Why? Code unit indices will work equally well for substrings, searching 
> etc.
> 
>> Maybe there should be interchangeable types, string and char[], with
>> different length, slice, find, etc. behaviors? I mean it could be the
>> same actual type, but with different contexts for properties.
> 
> Indexing a UTF-8 encoded string by character rather than code unit is
> expensive in either time or memory. If you for some reason need
> character indexing, use a dchar[].
> 
>> And besides, string, as opposed to char[], is more pleasant to my eyes =)
> 
> There is always alias.

You've got some valid points; I just showed mine.
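
For anyone following along, here is a rough sketch of the distinction we are arguing about. It is written in Python rather than D, purely as an illustration, and the sample string and helper names are made up for this example. A UTF-8 bytes value stands in for a char[]: its length is the code unit count, counting characters needs a pass over the data, and substring search works directly on code units.

```python
# Illustrative sketch only (not D): a UTF-8 `bytes` value plays the role
# of a char[], where each element is one code unit.

def code_unit_count(text: str) -> int:
    """Number of UTF-8 code units, analogous to char[].length."""
    return len(text.encode("utf-8"))

def character_count(text: str) -> int:
    """Count code points by scanning the UTF-8 data once.

    Every byte that is not a continuation byte (10xxxxxx) starts a new
    code point, so this is a linear pass over the whole array.
    """
    data = text.encode("utf-8")
    return sum(1 for unit in data if (unit & 0xC0) != 0x80)

def find_code_unit_index(haystack: str, needle: str) -> int:
    """Substring search done purely on code units; returns a byte offset."""
    return haystack.encode("utf-8").find(needle.encode("utf-8"))

# Hypothetical sample text: "naive cafe" with two accented letters,
# each of which takes two code units in UTF-8.
sample = "na\u00efve caf\u00e9"
units = sample.encode("utf-8")

print(code_unit_count(sample))   # 12
print(character_count(sample))   # 10

# The code unit index returned by the search is directly usable for
# slicing, without ever counting characters.
start = find_code_unit_index(sample, "caf\u00e9")
print(start)                     # 7
print(units[start:].decode("utf-8") == "caf\u00e9")  # True
```

The search returns 7 as a code unit offset, while the same position counted in characters would be 6. Both are valid ways to address the substring; the disagreement is only about which one the default length and slice operations should use.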



More information about the Digitalmars-d mailing list