std.locale

Georg Wrede georg.wrede at iki.fi
Sun Mar 1 23:34:32 PST 2009


Andrei Alexandrescu wrote:
> Georg Wrede wrote:
>> Walter Bright wrote:
>>> Andrei Alexandrescu wrote:
>>>> There will be a global reference to a Locale class, e.g. 
>>>> defaultLocale. By default the reference will be null, implying the C 
>>>> locale should be in effect. Applications can assign to it as they 
>>>> find fit, and also pass around multiple locale variables.
>>>
>>> I disagree with being able to assign to the global defaultLocale. 
>>> This is going to cause endless problems. Just one is that any 
>>> function that uses locale can no longer be pure. defaultLocale should 
>>> be immutable.
>>
>> The two programs that are most "locale aware" are usually 
>> spreadsheets and word processors.
>>
>> It is usual that the user needs to write, say, in Swedish or in 
>> Russian, while in a Finnish setting. Or that one wants to use a 
>> decimal separator other than what is "proper" for the country.
>>
>> For example, a lot of people use "." instead of the official "," in 
>> Finland, and many use time as "18:23" instead of "18.23".
>>
>> For this purpose, these programs let the users define these any way 
>> they want.
> 
> That's exactly what my proposal is doing. People can start with the 
> defaults of the Finnish locale and then overwrite whichever parts they 
> want.
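
To make the "start from the defaults, then overwrite parts" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: LocaleSettings, its two fields and the FINNISH value are illustration only, not the proposed std.locale API. The settings object is immutable, so an override produces a new value instead of changing a shared one.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LocaleSettings:
    """Hypothetical bundle of formatting preferences (not a real API)."""
    decimal_sep: str
    time_sep: str

    def format_number(self, value: float, digits: int = 2) -> str:
        # Format with '.' first, then swap in the preferred separator.
        return f"{value:.{digits}f}".replace(".", self.decimal_sep)

    def format_time(self, hour: int, minute: int) -> str:
        return f"{hour:02d}{self.time_sep}{minute:02d}"


# The official Finnish conventions mentioned above: "," and "18.23".
FINNISH = LocaleSettings(decimal_sep=",", time_sep=".")

# A user who keeps the Finnish defaults but overrides two fields.
# replace() returns a new object; FINNISH itself is left untouched.
custom = replace(FINNISH, decimal_sep=".", time_sep=":")

print(FINNISH.format_number(3.5), FINNISH.format_time(18, 23))
print(custom.format_number(3.5), custom.format_time(18, 23))
```

Since nothing global is reassigned, functions that take such a value as a parameter do not depend on hidden state.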

From the java.util.Locale documentation (J2SE 1.4.2): "A Locale object 
represents a specific geographical, political, or cultural region."

Nice. If those three were orthogonal, then you'd choose each once and be 
done with it. Unfortunately, they blend. And they blend in a different 
way in every area. That creates "continuums" of needs for settings, and 
these can't really be predicted easily.

A GUI user can rely on the settings having been made at OS install, by 
himself or by the local vendor. But the console is different. (See below.)

>> I think the notion of locales is, slowly but steadily, going away.
> 
> Do you have any data backing this up?

For instance, in the old days, the operating system used to define the 
variable LC_LOCAL for the user. It signified the locale, usually the 
user's country.

Today, I see no such thing. The only variables related to such are for 
the GUI:

LANG=en_US.UTF-8
GDM_LANG=en_US.UTF-8

One is the console input language and the other is the GUI input 
language. No locale stuff anywhere.
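
To check what a given session actually has set, a small sketch along these lines would do. It assumes the POSIX lookup order (LC_ALL first, then the category variable such as LC_NUMERIC, then LANG) and falls back to "C" when nothing is set. The function name effective_locale is made up for this example.

```python
import os


def effective_locale(category="LC_NUMERIC", env=None):
    """Return the locale name that would apply to one category.

    Assumes the POSIX lookup order: LC_ALL, then the category
    variable, then LANG. Empty values count as unset.
    """
    env = os.environ if env is None else env
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"  # nothing set: the C locale is in effect


print(effective_locale())
```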

>> It was a nice idea at the time, but with two problems: users don't use 
>> it, and programmers don't use it.
> 
> Is it because it hasn't been properly packaged?

No. Imagine for a moment that we had a Perfect Locale Implementation 
(which I say is not even possible, but still).

If a programmer wanted to use locale dependent printing, then he'd have 
to get familiar with all the possible ways his string may get printed if 
someone uses his program in a far away country. And there are a few 
different ways, believe me.

Would you imagine anybody actually bothering to do that? Would you?? So 
what the programmer does, is, he prints things the way he wants, and 
caters only to the specific things he feels he needs to. And creates a 
solution that behaves *predictably*, from his point of view.

He may want folks in France and Finland to use his program. And since he 
doesn't write the UI strings in any other language, the program will be 
unusable to folks in Afghanistan anyway.

Or he writes an English UI, whereupon people accept that it may not 
cater for all kinds of exotic needs.

>> Of course, eventually we will want to "do something" about this. But 
>> that should be left to the day when real issues are all sorted out in 
>> D. This is a non-urgent, low-priority thing.

Had there been any need for locales, believe me, the "foreigners" in 
this NG would have asked for them.

> I guess. Now please tell me how I print arrays in D.

Think about it for a moment. We have two kinds of programs, those 
written for the console, and those written for a GUI. It's natural for 
the GUI programs to be locale aware, but with the console apps, it 
simply is not possible to do properly. I'll explain, but first:

Let's split this into two separate issues, the console and the GUI.

The GUI is aware of your preferences.
You don't use writefln with the GUI.
You use the GUI API for any I/O, right?

Now, wouldn't it be natural to assume that the GUI API takes care of all 
of this? Print a date, and it prints it with the user's preferred 
format. *The same with your array*.


And then let's look at the console.

A proper internationalisation would mean that the Chinese could use the 
console, and all character mode apps in Chinese. Problem is, there 
simply aren't enough pixels on many consoles to render the Chinese 
character set.

So we're off track already. And with the ubiquitous GUIs around, people 
are increasingly accepting that a GUI is for localised stuff, and the 
console is for "technical" stuff.

Haven't you noticed: in the last decade it has become ever more evident 
that the reason to write a non-GUI app is very specifically to get rid 
of all kinds of hassles, and simply concentrate on what the program is 
supposed to do!

(You know, a few years ago we had a major conversation here about 
whether non-ASCII variable names should be accepted in D. The end result 
is, yes. (I just tried it.) Now, how can an international team cowork on 
a project where variable names are written so the other folks can't even 
type them with their keyboards??? -- All very nice, but no cigar. That's 
about as smart as letting people define *unlimited* length variable names!)


*** How to print arrays ***

You print arrays in a predictable and expected way.

D array printing is for non-GUI stuff. Hence, you use the C locale, period.

A mathematician seriously doesn't want his arrays to have commas instead 
of decimal points. He sure as heck doesn't want the numbers to all of a 
sudden turn into Klingon-like hieroglyphs just because he is showing his 
results at an overseas seminar, on the local computer!


And what about the programmer who wants his array to go into another 
program? What do you think happens to parsing when the decimal point is 
suddenly a comma??
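
A minimal sketch of that failure, in Python for brevity. The helpers print_array and parse_array are hypothetical, not any library's API: the printer takes the decimal separator as a parameter, and the parser assumes C-locale numbers, as most consuming programs would.

```python
def print_array(values, decimal_sep="."):
    """Render floats as '[a, b, c]' with the given decimal separator."""
    items = (f"{v:.2f}".replace(".", decimal_sep) for v in values)
    return "[" + ", ".join(items) + "]"


def parse_array(text):
    """Parse '[a, b, c]', assuming '.' decimals and ',' between items."""
    body = text.strip()[1:-1]
    return [float(tok) for tok in body.split(",") if tok.strip()]


data = [1.5, 2.25]
c_style = print_array(data)        # "[1.50, 2.25]"
fi_style = print_array(data, ",")  # "[1,50, 2,25]"

print(parse_array(c_style))   # [1.5, 2.25]
print(parse_array(fi_style))  # [1.0, 50.0, 2.0, 25.0]
```

The second parse does not even fail. It quietly returns four numbers where two went in, which is worse than an error.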

We've had Walter make nice features to D that were laborious to create, 
only to see nobody use them. It's happened, ask him. *Now* is not the 
time to do that again.


