To Walter, about char[] initialization by FF
Oskar Linde
oskar.lindeREM at OVEgmail.com
Mon Jul 31 02:50:29 PDT 2006
Serg Kovrov wrote:
> Maybe I missed the point here, correct me if I misunderstood.
You have understood correctly.
> This is how I see the problem with char[] as utf-8 *string*. The length
> of array of chars is not always count of characters, but rather size of
> array in bytes. Which makes no sense for me. For that purpose I would
> like to see separate properties.
Having char[].length return something other than the actual number of
char-units would break its array semantics.
> For example,
> char[] str = "тест";
> word "test" in russian - 4 cyrillic characters, would give you
> str.length 8, which make no use of this length property if you not sure
> that string is latin characters only.
You actually rarely need to count characters as opposed to (UTF-8) code
units, and counting characters is a comparatively expensive operation,
since the whole string has to be decoded. All the ordinary operations
(slicing, concatenation, sub-string search, etc.) operate on code units
rather than characters.
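To illustrate the point about code units, here is a minimal sketch. It assumes a current D compiler, where string literals are typed string rather than char[] as in the D of this thread, and it uses only built-in array operations:

```d
void main()
{
    string s = "тест";               // four Cyrillic letters, two UTF-8 code units each

    assert(s.length == 8);           // length counts code units, not characters
    assert(s[0 .. 2] == "т");        // slice bounds are code-unit offsets
    assert(s[2 .. $] == "ест");      // the remaining three letters occupy six code units
    assert((s ~ "!").length == 9);   // concatenation simply appends code units
}
```

None of these operations has to decode the string, which is why they stay cheap.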
It is easy to implement your own character count though:
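// Counts characters (code points) rather than UTF-8 code units.
size_t count(char[] arr) {
    size_t n = 0;
    // Iterating with a dchar loop variable decodes the UTF-8 on the fly.
    foreach (dchar c; arr)
        n++;
    return n;
}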
assert("тест".count() == 4);
Also note that:
assert("тест"d.length == 4);
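As a rough sketch of how .length relates to the three string encodings, the following compares the same literals with and without the w and d suffixes. The second example character (U+1D11E) is my own choice for illustration and does not come from the original discussion:

```d
void main()
{
    assert("тест".length  == 8);   // char:  UTF-8 code units
    assert("тест"w.length == 4);   // wchar: UTF-16 code units
    assert("тест"d.length == 4);   // dchar: UTF-32 code units

    // U+1D11E (musical symbol G clef) lies outside the Basic Multilingual Plane.
    assert("\U0001D11E".length  == 4);   // four UTF-8 code units
    assert("\U0001D11E"w.length == 2);   // a UTF-16 surrogate pair
    assert("\U0001D11E"d.length == 1);   // one UTF-32 code unit
}
```

Only the dchar form has exactly one code unit per code point, so only there does .length match the code-point count for every string.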
/Oskar