pu$�le

strtr strtr at sp.am
Sun Jul 18 06:16:09 PDT 2010


== Quote from Jonathan M Davis (jmdavisprog at gmail.com)'s article
> On Sunday 18 July 2010 04:13:03 bearophile wrote:
> > Jonathan M Davis:
> > > You should pretty much never deal with each individual char or wchar in a
> > > string or wstring. Do the conversion to dchar or dstring if you want to
> > > access individual characters. You can also use std.utf.stride() to
> > > iterate over to the next code unit which starts a code point, but you're
> > > still going to have to make sure that you convert it to a dchar to
> > > process it properly. Otherwise, only ASCII characters will work right
> > > (since they fit in a single code unit). Fortunately, foreach takes care
> > > of all this for us if we specify the element type as dchar.
> >
> > I am starting to think that for safety the foreach on a string has to yield
> > dchars by default, and to yield chars only on request: foreach(c; "hello")
> > => dchars
> > foreach(char c; "hello") => chars
> >
> > Bye,
> > bearophile
> That's probably a good idea, though for people to write safe string code in the
> general case, they're really going to have to understand the differences between
> char, wchar, and dchar as well as what that means for their code. It's just way
> too easy to shoot yourself in the foot once you start trying to manipulate
> single characters, and I don't think that there's really a way to fix that unless
> you forced dchar for everything, which definitely isn't the D way to do things
> (though IIRC, that's essentially what Java did). Still, this particular case
> might be better off defaulting to dchar since dchar is already handled specially
> in foreach anyhow. My only real problem with that is the fact that while dchar
> is handled specially, it's done with a conversion, and making foreach over a
> string default to dchar instead of char breaks how foreach works normally. It
> seems to me more like a warning would be a better idea. If they really want
> char, they can specify char, but the warning would warn them so that they'd be
> aware of the issue and specify the correct type (be it char or dchar or
> whatever) rather than leaving it blank. That way, foreach retains its normal
> semantics, and the problem is still averted.
> - Jonathan M Davis

I agree with the warning. A good warning would get people to read up on UTF.
And if you really want to have char you'll need to cast:
foreach(cast(char)c; chars)
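
To make the char/dchar difference discussed above concrete, here is a minimal sketch of the three ways of walking a string that came up in the thread. The function names (countCodeUnits, countCodePoints, countViaStride) are made up for illustration; only foreach and std.utf.stride come from the discussion. It assumes the compiler behaves as described above, i.e. an untyped or char-typed foreach yields code units and a dchar-typed foreach decodes.

```d
import std.utf : stride;

// Hypothetical helper: counts UTF-8 code units by iterating as char.
size_t countCodeUnits(string s)
{
    size_t n = 0;
    foreach (char c; s)   // no decoding: one iteration per code unit
        ++n;
    return n;
}

// Hypothetical helper: counts code points by iterating as dchar.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)  // foreach decodes each UTF-8 sequence into one dchar
        ++n;
    return n;
}

// Hypothetical helper: same count, but stepping manually with std.utf.stride,
// which returns the length in code units of the sequence starting at index i.
size_t countViaStride(string s)
{
    size_t n = 0;
    for (size_t i = 0; i < s.length; i += stride(s, i))
        ++n;
    return n;
}
```

For a string such as "h\u00E9llo", the char loop sees one more element than the dchar loop, because the accented letter takes two code units in UTF-8. That mismatch is exactly what a warning on an untyped foreach over a string would draw attention to.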

