Making all strings UTF ranges has some risk of WTF
Trass3r
un at known.com
Thu Feb 4 03:25:50 PST 2010
On 04.02.2010 at 04:05, grauzone <none at example.net> wrote:
> Andrei Alexandrescu wrote:
>> What can be done about that? I see a number of solutions:
>> (a) Do not operate the change at all.
>> (b) Operate the change and mention that in range algorithms you should
>> check hasLength and only then use "length" under the assumption that it
>> really means "elements count".
>> (c) Deprecate the name .length for UTF-8 and UTF-16 strings, and
>> define a different name for that. Any other name (codeUnits, codes
>> etc.) would do. The entire point is to not make algorithms believe
>> strings have a .length property.
>> (d) Have std.range define a distinct property called e.g. "count" and
>> then specialize it appropriately. Then change all references to .length
>> in std.algorithm and elsewhere to .count.
>> What would you do? Any ideas are welcome.
>
Definitely against (c)+(d).
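
To make the .length issue concrete, here is a minimal sketch of what the quoted options are about. It assumes a reasonably current Phobos, where (as far as I know) hasLength reports false for narrow strings, roughly along the lines of (b). The helper names codeUnitCount and codePointCount are made up for illustration only.

```d
import std.range : walkLength;
import std.range.primitives : hasLength;

// Hypothetical helper: the built-in .length of a UTF-8 string counts code units.
size_t codeUnitCount(string s)
{
    return s.length;
}

// Hypothetical helper: walking the string as a range counts decoded code points.
size_t codePointCount(string s)
{
    return s.walkLength;
}

// Assumption: in current Phobos narrow strings do not advertise a length,
// so generic algorithms that check hasLength fall back to walking the range.
static assert(!hasLength!string);

void main()
{
    string s = "h\u00E9llo"; // U+00E9 takes two code units in UTF-8
    assert(codeUnitCount(s) == 6);
    assert(codePointCount(s) == 5);
}
```

The two counts disagree as soon as a non-ASCII character appears, which is exactly what a generic algorithm relying on .length would trip over.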
> Change the type of string literals from char[] (or whatever the string
> type is in D2) to a wrapper struct defined in object.d:
>
> struct string {
>     char[] raw;
> }
>
That sounds like a really reasonable approach to me.
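
A rough sketch of how such a wrapper might behave as a range, under my own assumptions: the name Utf8String is hypothetical (the proposal above reuses the name string), raw is immutable(char)[] here rather than char[], and the range primitives are my guess at what the wrapper would expose. Nothing here is an actual object.d definition.

```d
import std.range : walkLength;
import std.range.primitives : hasLength, isInputRange;
import std.utf : decode, stride;

// Hypothetical wrapper: iterates code points and deliberately has no .length.
struct Utf8String
{
    immutable(char)[] raw; // code units remain reachable, but only explicitly

    @property bool empty()
    {
        return raw.length == 0;
    }

    // Decode the first code point without consuming it.
    @property dchar front()
    {
        size_t index = 0;
        return decode(raw, index);
    }

    // Skip all code units of the first code point.
    void popFront()
    {
        raw = raw[stride(raw, 0) .. $];
    }
}

static assert(isInputRange!Utf8String);
static assert(!hasLength!Utf8String); // algorithms cannot mistake it for an element count

void main()
{
    auto s = Utf8String("h\u00E9llo");
    assert(s.raw.length == 6); // code units, requested explicitly via .raw
    assert(s.walkLength == 5); // code points, counted by walking the range
}
```

With this shape, anyone who really wants the code unit count has to write .raw.length, so nothing in std.algorithm can pick it up by accident.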