std.stringbuffer

Sean Kelly sean at invisibleduck.org
Wed Apr 30 07:25:45 PDT 2008


== Quote from Janice Caron (caron800 at googlemail.com)'s article
> 2008/4/30 Me Here <p9e883002 at sneakemail.com>:
> >     char[] a = ...2000 chars from somewhere.
> >
> >     char[] field1 = a[ 312 .. 357 ];
> >     field1.toUpper();
> I've kind of lost track of the number of times I've said this in
> recent days, but...
> You cannot uppercase in place, because for any given dchar, c, the
> number of UTF-8 bytes required to express c may be different from the
> number of UTF-8 bytes required to express toupper(c).
> If any of you have plans to uppercase or lowercase UTF-8 in place,
> forget that now. It just ain't possible. (You can uppercase ASCII,
> UTF-16, or UTF-32 in place. But not UTF-8, and char[], by definition,
> is UTF-8).

In all fairness, you can uppercase UTF-8 in place so long as every
character's capital encodes to the same number of bytes as the
original.  Thus one questionable strategy would be to uppercase in
place until the first conversion that changes the byte length.  The
obvious downside is that the original buffer may end up partially
capitalized, with the fully capitalized result returned in a new
buffer.  I'm sure people processing ASCII text would love this, but I
can see it causing problems elsewhere.
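
The strategy above could be sketched roughly as follows. This is only an
illustration, assuming D2 and current Phobos (std.uni.toUpper,
std.utf.decode, std.utf.encode); the names upperInPlacePrefix and
toUpperMaybeInPlace are made up for the example and are not part of any
library. It also assumes per-code-point case mapping, so expansions to
more than one character are not handled.

```d
import std.uni : toUpper;
import std.utf : decode, encode;

/// Hypothetical helper: uppercases `s` in place for as long as each
/// capital encodes to the same number of UTF-8 bytes as its original.
/// Returns the byte index where it stopped; a value less than s.length
/// means the rest cannot be converted in place.
size_t upperInPlacePrefix(char[] s)
{
    size_t i = 0;
    while (i < s.length)
    {
        size_t next = i;
        immutable dchar c = decode(s, next); // advances `next` past c
        char[4] buf;
        immutable size_t n = encode(buf, toUpper(c));
        if (n != next - i)
            return i;                        // byte length would change: stop
        s[i .. next] = buf[0 .. n];          // same length: overwrite in place
        i = next;
    }
    return i;
}

/// Hypothetical wrapper: returns `s` itself if everything fit in place,
/// otherwise a new buffer holding the fully capitalized text. In the
/// second case `s` is left partially capitalized, which is exactly the
/// downside mentioned above.
char[] toUpperMaybeInPlace(char[] s)
{
    immutable size_t done = upperInPlacePrefix(s);
    if (done == s.length)
        return s;

    char[] result = s[0 .. done].dup;        // prefix is already capitalized
    size_t i = done;
    while (i < s.length)
    {
        char[4] buf;
        immutable size_t n = encode(buf, toUpper(decode(s, i)));
        result ~= buf[0 .. n];
    }
    return result;
}
```

A caller can tell which case occurred by checking whether the returned
slice `is` the original one. Text whose capitals all keep their byte
length, ASCII included, never allocates.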


Sean


