Rename std.ctype to std.ascii?
Jonathan M Davis
jmdavisProg at gmx.com
Tue Jun 14 12:41:33 PDT 2011
On 2011-06-14 11:53, Jouko Koski wrote:
> "Jonathan M Davis" <jmdavisProg at gmx.com> wrote:
> > So, yes I understood. It's just that as far as I can tell, locales don't
> > matter if you're completely restricting yourself to ASCII like std.ctype
> > does.
>
> I would not consider it being good idea to include this kind of ascii-only
> utilities in the standard-ish library. It might be best to rename the
> module to std.ascii_for_insular_yankees_others_keep_away so that nobody
> would use it by accident. This way the name would also remind us about the
> historical terms which were used quarter of a century ago when ascii-only
> <ctype.h> utilities were first suggested to the international C
> standardization committee.
For some classes of operations, it makes perfect sense to check for ASCII
characters only. For others, it's just people not worrying about
internationalization like they should. For instance, format strings don't
care about Unicode as far as their format specifiers go. %a, %d, etc. are all
pure ASCII, so worrying about Unicode with them just wouldn't make sense. In
most cases, isDigit working on the Arabic numerals 0 through 9 is _exactly_
what people want and need. But if you were to try to make it more
Unicode-friendly, would Greek or Chinese numbers count as digits? Maybe, maybe
not. It gets much more complicated. In some cases, all you care about with
isUpper or toUpper is ASCII. In others, you want it to deal with Unicode (and
probably locales as well) properly.
std.ctype/std.ascii deals with ASCII for those situations where you really do
only care about ASCII. Its functions accept Unicode characters, but every
function which returns a bool returns false for non-ASCII characters, and the
case functions never try to change their case. std.uni actually deals with
Unicode and worries about things like whether a Unicode character is uppercase
or not.
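To make the difference concrete, here is a minimal sketch. It assumes the
function names found in present-day Phobos (std.ascii.isDigit, isUpper and
toUpper; std.uni.isNumber, isUpper and toUpper); the names available at the
time of this thread may have differed, and the sample characters are chosen
only for illustration.

```d
static import std.ascii;
static import std.uni;

void main()
{
    // ASCII input: std.ascii behaves as you would expect.
    assert(std.ascii.isDigit('7'));
    assert(std.ascii.isUpper('A'));
    assert(std.ascii.toUpper('a') == 'A');

    // Non-ASCII input is accepted, but the predicates return false...
    assert(!std.ascii.isDigit('\u0663')); // ARABIC-INDIC DIGIT THREE
    assert(!std.ascii.isUpper('\u00C9')); // LATIN CAPITAL LETTER E WITH ACUTE

    // ...and the case functions hand the character back unchanged.
    assert(std.ascii.toUpper('\u00E9') == '\u00E9'); // e with acute stays lowercase

    // std.uni consults Unicode properties instead.
    assert(std.uni.isNumber('\u0663'));
    assert(std.uni.isUpper('\u00C9'));
    assert(std.uni.toUpper('\u00E9') == '\u00C9');
}
```

The static imports are used here only so that each call site shows which
module it comes from; ordinary code would more likely import just the one
module it needs.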
They're for two different use cases. Most of Phobos should be dealing with
Unicode (e.g. pretty much everything in std.string should be using the std.uni
functions rather than the std.ascii functions if a function exists in both),
but there are cases where Unicode doesn't matter, and you might as well have
the efficiency of dealing only with ASCII available. Ultimately, it's up to the
programmer to do the right thing.
- Jonathan M Davis