To Walter, about char[] initialization by FF
Thomas Kuehne
thomas-dloop at kuehne.cn
Mon Jul 31 12:33:35 PDT 2006
Oskar Linde wrote on 2006-07-31:
> Serg Kovrov wrote:
>> For example,
>> char[] str = "тест";
>> the word "test" in Russian, 4 Cyrillic characters, would give you
>> str.length 8, which makes the length property useless unless you are
>> sure that the string contains Latin characters only.
>
> It is actually not very often that you need to count the number of
> characters as opposed to the number of (UTF-8) code units. Counting the
> number of characters is also a rather expensive operation. All the
> ordinary operations (searching, slicing, concatenation, sub-string
> search, etc) operate on code units rather than characters.
>
> It is easy to implement your own character count though:
>
> size_t count(char[] arr) {
>     size_t n = 0;
>     // foreach with a dchar loop variable decodes the UTF-8 code units
>     foreach (dchar c; arr)
>         n++;
>     return n;
> }
>
> assert("тест".count() == 4);
>
> Also note that:
>
> assert("тест"d.length == 4);
I hate to be pedantic, but dchar[] can only be used to count code
points, not characters. A "character" can be composed of more than
one code point/dchar. This feature is frequently used for accents, marks
and some Asian scripts.
-> http://www.unicode.org
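
To make the distinction concrete, here is a minimal sketch of the three
different counts. It assumes a current D compiler and a Phobos that
provides std.uni.byGrapheme, which was not available when this thread was
written. The function names codeUnits, codePoints and graphemes are made
up for illustration only.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

// Number of UTF-8 code units; this is what .length reports for char[].
size_t codeUnits(const(char)[] s)
{
    return s.length;
}

// Number of code points; a dchar loop variable makes foreach decode UTF-8.
size_t codePoints(const(char)[] s)
{
    size_t n = 0;
    foreach (dchar c; s)
        n++;
    return n;
}

// Number of grapheme clusters (user-perceived characters).
// Assumption: std.uni.byGrapheme from current Phobos is available.
size_t graphemes(const(char)[] s)
{
    return s.byGrapheme.walkLength;
}
```

For "тест" the three functions should give 8, 4 and 4. For "e\u0301"
(an 'e' followed by a combining acute accent) they should give 3, 2 and 1,
which is the case where counting dchars does not count characters.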
Thomas