The Case Against Autodecode

sarn via Digitalmars-d digitalmars-d at puremagic.com
Tue May 17 16:33:47 PDT 2016


On Tuesday, 17 May 2016 at 09:53:17 UTC, Kagamin wrote:
> With UTF-8 problems happened on a massive scale in LAMP setups: 
> mysql used latin1 as a default encoding and almost everything 
> worked fine.

^ latin-1 with Swedish collation rules.
And even if you set the encoding to "utf8", almost everything 
works fine until you discover that MySQL's "utf8" stores at most 
three bytes per character, and that you need to set the encoding 
to "utf8mb4" to get real utf8.  Also, MySQL has per-connection 
character encoding settings, so even if your application is 
properly set up to use utf8, you can break things by accidentally 
connecting with a client using the default pretty-much-latin1 
encoding.  With MySQL's "silently ram the square peg into the 
round hole" design philosophy, this can cause data corruption.

But, of course, almost everything works fine.
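
To make the two failure modes above concrete, here is a rough sketch 
in Python.  It does not talk to a MySQL server at all; it only 
imitates the byte-level effects.  The function names are made up for 
illustration, and Python's "latin-1" codec is an assumption standing 
in for MySQL's latin1 (which is closer to cp1252).

```python
# Simulation only: no MySQL server is involved. Function names are
# hypothetical, chosen for this illustration.

def fits_three_byte_utf8(text: str) -> bool:
    """True if every character needs at most 3 bytes in UTF-8.

    This mirrors the limit of MySQL's "utf8" (utf8mb3) character set;
    characters outside the Basic Multilingual Plane need 4 bytes.
    """
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


def round_trip_via_latin1_connection(text: str) -> str:
    """Imitate a UTF-8 client talking over a connection declared latin1.

    The client sends UTF-8 bytes, the server reads each byte as one
    latin1 character, then converts those characters to UTF-8 for a
    UTF-8 column. A correctly configured client later reads the result.
    """
    sent = text.encode("utf-8")        # what the client actually sends
    misread = sent.decode("latin-1")   # assumption: stand-in for MySQL latin1
    stored = misread.encode("utf-8")   # double-encoded bytes in the column
    return stored.decode("utf-8")      # what a proper utf8 client sees


if __name__ == "__main__":
    print(fits_three_byte_utf8("na\u00efve"))              # accented BMP text
    print(fits_three_byte_utf8("\U0001F600"))              # an emoji
    print(round_trip_via_latin1_connection("caf\u00e9"))   # mojibake
```

The second function is the classic double-encoding: nothing errors 
out, the row is stored, and the damage only shows up when something 
reads the data back with the right settings.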

Just some examples of why broken utf8 exists (and some venting of 
MySQL trauma).

