Character set conversions

Daniel Gibson metalcaedes at gmail.com
Mon May 30 13:35:48 PDT 2011


On 30.05.2011 22:20, Simen Kjaeraas wrote:
> On Mon, 30 May 2011 19:57:32 +0200, Jérôme M. Berger <jeberger at free.fr>
> wrote:
> 
>>     Fun fact about Excel generated CSV files: quite apart from encoding
>> issues, the separator used between cells depends on the locale: for
>> example, in English locales it uses a comma but in French locales it
>> uses a semicolon...
>>
>>     Just thought I'd point it out in case you did not know.
> 
> Fun? Gods, it's the most horrible idea I've witnessed in computing.
> If only they'd call it something other than CSV, at least - Comma Separated
> Values separated by semicolons? WTF?
> And the fantastic joy of opening one of those abominations in some other
> program... *shiver*
> 

CSV in Excel is totally misleading anyway. At least in the German
version, if you want to import a CSV file, the default separator is
tab, not comma. If you use File->Open, this is all you can get;
importing with custom separators is hidden somewhere else, IIRC.
(This refers to Office XP, dunno if newer versions are better in this
regard.)

In plain C (at least on Linux) you have fun with locale-dependent
input/output as well: printf and scanf are locale-dependent, so if you use
sprintf to generate a string that you then write to a file (or use fprintf
directly) under one locale, reading it back with the scanf functions under
another locale will fail.
Pretty fucking stupid IMHO.
This was/is(?) a bug in GtkRadiant, a level editor for Quake-like games,
which uses printf or something similar to write the map files. The map
compiler will reject them if decimals use a , instead of a . and stuff like
that.
(The workaround is to always use the standard C locale, i.e. start it with
"LC_ALL=C gtkradiant".)
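
To illustrate the printf/scanf problem, here is a minimal sketch of one way a
program could keep its file format independent of the user's locale: switch
LC_NUMERIC to "C" around the formatting or parsing call and restore it
afterwards. The helper names (enter_c_numeric_locale, format_double_portable
and so on) are made up for this example and have nothing to do with the actual
GtkRadiant code. It assumes the program has called setlocale() somewhere, as
GUI toolkits usually do, and it is not thread-safe because setlocale() changes
process-wide state.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: switch LC_NUMERIC to "C" and return a heap copy of
 * the previous setting (or NULL if it could not be saved). The copy is
 * needed because the string returned by setlocale() may be overwritten by
 * later setlocale() calls. */
static char *enter_c_numeric_locale(void)
{
    const char *current = setlocale(LC_NUMERIC, NULL);
    char *saved = NULL;
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }
    setlocale(LC_NUMERIC, "C");
    return saved;
}

/* Hypothetical helper: restore the setting saved by the function above.
 * If nothing was saved, LC_NUMERIC simply stays "C". */
static void leave_c_numeric_locale(char *saved)
{
    if (saved != NULL) {
        setlocale(LC_NUMERIC, saved);
        free(saved);
    }
}

/* Format a double with '.' as the decimal separator, whatever the user's
 * locale is. Returns the value returned by snprintf(). */
int format_double_portable(char *buf, size_t size, double value)
{
    char *saved = enter_c_numeric_locale();
    int written = snprintf(buf, size, "%.3f", value);
    leave_c_numeric_locale(saved);
    return written;
}

/* Parse a double that uses '.' as the decimal separator, whatever the
 * user's locale is. Returns 1 on success, like a single-item sscanf(). */
int parse_double_portable(const char *text, double *out)
{
    char *saved = enter_c_numeric_locale();
    int matched = sscanf(text, "%lf", out);
    leave_c_numeric_locale(saved);
    return matched == 1;
}
```

With these, a writer running under a German locale and a reader running under
an English one would both agree on "1.500". Setting LC_ALL=C from the outside,
as in the workaround above, has a similar effect on the numbers but also
switches off every other localization in the program.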


Cheers,
- Daniel


More information about the Digitalmars-d mailing list