Random string samples & unicode - Reprise

Jonathan M Davis jmdavisprog at gmail.com
Sun Sep 12 19:34:09 PDT 2010


On Sunday 12 September 2010 19:15:10 dsimcha wrote:
> == Quote from bearophile (bearophileHUGS at lycos.com)'s article
> 
> > Jonathan M Davis:
> > > Well, I don't think that I've ever seen a program that did that sort of
> > > thing.
> > 
> > It's common Python code (and maybe in future it will be common D2 code).
> > In another answer I have given a few examples to Andrei.
> 
> > > If your string processing doesn't require random access, then you
> > > avoid the problem, but as long as it needs random access, you're pretty
> > > much stuck.
> > 
> > I understand, this is probably the answer I was looking for, thank you
> > :-) Bye,
> > bearophile
> 
> I think what we need here is an AsciiString type.  Such a type would be a
> thin wrapper over char[], or maybe immutable(char)[] for added safety.  On
> construction it would enforce that the underlying string does not contain
> any multi-byte characters.  It would only allow appending of chars, not
> wchars or dchars.  If you appended a regular string to it, it would throw if the
> appended string contained any characters that couldn't be represented in a
> single byte.  It would be a random access range of chars with lvalue
> elements, and would provide a way of documenting the assumption that
> you're only working with ASCII, and a mechanism for verifying this
> assumption at runtime.

It's not necessarily a bad idea, but I'm not sure that we want to encourage code
that assumes ASCII. It's far too easy for English-speaking programmers to make
that assumption in their code and then run into problems later, when they
unexpectedly end up with Unicode characters in their input or have to change
their code to work with Unicode. I'm inclined to force the issue and keep the
status quo that _all_ strings in D are Unicode of some variety. There's far too
much code out there that is not Unicode-compliant when it should be.
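
For concreteness, here is a rough sketch of the kind of wrapper being proposed.
Only the name AsciiString comes from the quoted post; the append method, the
checked helper, and the choice of immutable storage (which gives up the lvalue
elements mentioned above) are assumptions made for illustration, and this is
not meant to describe any existing library type.

```d
import std.exception : assertThrown, enforce;

// Hypothetical AsciiString: a thin wrapper over immutable(char)[] that
// verifies at runtime that every code unit is plain ASCII (< 0x80).
struct AsciiString
{
    private string data;

    // Construction enforces that the string has no multi-byte characters.
    this(string s)
    {
        data = checked(s);
    }

    // Appending a single char is allowed only if it is an ASCII code unit.
    void append(char c)
    {
        enforce(c < 0x80, "non-ASCII code unit");
        data ~= c;
    }

    // Appending a regular string throws if it contains non-ASCII data.
    void append(string s)
    {
        data ~= checked(s);
    }

    // Random access is safe here: one code unit is always one character.
    char opIndex(size_t i) const
    {
        return data[i];
    }

    @property size_t length() const
    {
        return data.length;
    }

    // Returns s unchanged, or throws if any code unit is outside ASCII.
    private static string checked(string s)
    {
        foreach (char c; s)
            enforce(c < 0x80, "string contains a non-ASCII character");
        return s;
    }
}

void main()
{
    auto a = AsciiString("hello");
    a.append('!');
    a.append(" ok");
    assert(a.length == 9);
    assert(a[5] == '!');

    // The moment non-ASCII input shows up, the type refuses it.
    assertThrown(AsciiString("h\u00E9llo"));
    assertThrown(a.append("\u00E9"));
}
```

The last two lines are the situation I'm worried about: code built on a type
like this works until the input stops being ASCII, and then it throws instead
of handling the text.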

- Jonathan M Davis


More information about the Digitalmars-d mailing list