Inconsistency

Chris wendlec at tcd.ie
Mon Oct 14 01:58:13 PDT 2013


On Sunday, 13 October 2013 at 13:40:21 UTC, Sönke Ludwig wrote:
> Am 13.10.2013 15:25, schrieb nickles:
>> Ok, if my understanding is wrong, how do YOU measure the length 
>> of a string?
>> Do you always use count(), or is there an alternative?
>>
>>
>
> The thing is that even count(), which gives you the number of 
> *code points*, doesn't necessarily give you what you want, namely 
> the number of actual display characters. UTF is quite a complex 
> beast and doing any operations on it _correctly_ generally 
> requires a lot of care. If you need to do these kinds of 
> operations, I would highly recommend reading up on the basics of 
> UTF and Unicode first (quick overview on Wikipedia: 
> <http://en.wikipedia.org/wiki/Unicode#Mapping_and_encodings>).
>
> arr.length is meant to be used in conjunction with array 
> indexing and slicing (arr[...]) and its value is consistent for 
> all string and array types for this purpose.

I recently discovered a bug of this kind in my program. Take the 
letter "é", for example (Linux, Ubuntu 12.04): std.utf.count() 
returns 1 (one code point), while .length returns 2 (two UTF-8 
code units). I needed the length to slice the string at a given 
point. Using .length instead of std.utf.count() fixed the bug.
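
To make the difference concrete, here is a minimal sketch, not taken 
from the program mentioned above. It assumes a precomposed "é" 
(U+00E9) and uses a hypothetical helper, takeCodePoints, to show one 
way of turning a code point position into a valid slice index with 
std.utf.toUTFindex:

```d
import std.range : walkLength;
import std.stdio : writeln;
import std.uni : byGrapheme;
import std.utf : count, toUTFindex;

// Hypothetical helper (not part of Phobos): returns the first n code
// points of a UTF-8 string.
string takeCodePoints(string s, size_t n)
{
    // Slicing works on code units, so the code point index n is first
    // converted into a code unit index.
    return s[0 .. toUTFindex(s, n)];
}

void main()
{
    // "été" written with escapes so the encoding is unambiguous:
    // each é is the single code point U+00E9, two code units in UTF-8.
    string s = "\u00E9t\u00E9";

    writeln(s.length);                 // 5: UTF-8 code units
    writeln(count(s));                 // 3: code points
    writeln(s.byGrapheme.walkLength);  // 3: display characters (graphemes)

    // Slicing with a raw code point count, e.g. s[0 .. 2], would cut
    // after the first é only; converting the index keeps "ét" intact.
    writeln(takeCodePoints(s, 2));
}
```

The point of the sketch is that values passed to the slice operator 
have to be code unit offsets, which is why .length worked for slicing 
and the result of count() did not.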

