C strings - byte, ubyte or char? (Discussion from Bugzilla)
Stewart Gordon
smjg_1998 at yahoo.com
Thu Oct 4 09:17:59 PDT 2007
"Matti Niemenmaa" <see_signature at for.real.address> wrote in message
news:fe2t70$2eka$1 at digitalmars.com...
<snip>
> Good idea. But note that I'm not talking only about C string-processing
> functions: in general, any functions which process strings without regard to
> their encoding should use ubytes.
>
> Just about all of std.string are such, for instance.
Looks like I'll have to investigate....
> The Tango situation is better, since tango.text.Util is already templated
> for char/wchar/dchar: ubyte would need to be added to the mix.
<snip>
> One problem with toStringz is efficiency. Its current implementation performs
> a string concatenation every time. If you know the string is zero-terminated
> and ASCII (or you just want it to be handled as encoding-agnostic), you should
> just be able to pass it through.
I had no idea that the implementation had changed.
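To make the efficiency point concrete, here is a minimal sketch in C of the two strategies being contrasted. The slice struct and both function names are hypothetical: the struct merely stands in for a length-delimited D array, and neither function is meant to reproduce the actual toStringz implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a D char[]: a pointer plus a length,
   with no guarantee of a trailing NUL. */
typedef struct {
    const char *ptr;
    size_t len;
} slice;

/* Copying strategy: always allocate len + 1 bytes, copy the data and
   append the terminator. Safe for any slice, but costs an allocation
   and a copy on every call. The caller must free() the result. */
char *to_stringz_copy(slice s)
{
    char *z = malloc(s.len + 1);
    if (z == NULL)
        return NULL;
    memcpy(z, s.ptr, s.len);
    z[s.len] = '\0';
    return z;
}

/* Pass-through strategy: the caller asserts that the byte following
   the slice is already NUL, so the pointer can be handed to C as-is.
   Nothing here checks that claim (assumption of this sketch). */
const char *to_stringz_known(slice s)
{
    return s.ptr;
}
```

The pass-through version is only correct when the caller really does know the data is terminated, which is the knowledge the quoted paragraph takes for granted.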
> But on second thought, having the cast (or a call to toStringz) be necessary
> might be better. If you want UTF-8 to be handled as encoding-agnostic, a
> necessary cast may be a good idea, as it implies you know what you're doing.
Why should I care that a function is encoding-agnostic if I know what
encoding my text is in? That sounds to me like suggesting that I should
have to cast class instances explicitly to Object to prove I know that the
function can use objects of any class.
Stewart.
--
My e-mail address is valid but not my primary mailbox. Please keep replies
on the 'group where everybody may benefit.