D Conference Tango Phobos

Regan Heath regan at netmail.co.nz
Fri Sep 14 08:24:15 PDT 2007


Steven Schveighoffer wrote:
> "Regan Heath" <regan at netmail.co.nz> wrote in message 
> news:fce6je$1qe2$1 at digitalmars.com...
>> Steven Schveighoffer wrote:
>>> To me, I think toString is more clear.  BTW, I think utf-8 has multi-byte 
>>> characters.  If that's the case, then isn't toASCII more appropriate?
>> Are you saying that the object to<whatever> routine should output ASCII 
>> and not UTF-8, 16, or 32?  If so, I doubt the Japanese D community would 
>> agree. <g>
> 
> I understand, but first, I KNOW that in my objects that I override toUtf8, 
> I'm outputting only ASCII.  

And therefore UTF-8, or rather a subset of UTF-8 :)
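
To make the subset point concrete, here is a rough sketch in Python (chosen only for brevity; nothing here is D, Tango or Phobos API).  The string `ascii_text` is a made-up stand-in for the kind of debug output a toUtf8 override might produce.  It only shows that pure ASCII text encodes to exactly the same bytes under ASCII and UTF-8, while non-ASCII text needs multi-byte sequences:

```python
# Hypothetical debug-style output containing only ASCII characters.
ascii_text = "Object@1f3a"

ascii_bytes = ascii_text.encode("ascii")
utf8_bytes = ascii_text.encode("utf-8")

# For pure ASCII input, both encodings yield identical bytes,
# so the ASCII output is already valid UTF-8.
same_bytes = ascii_bytes == utf8_bytes

# The reverse does not hold: these three characters each take
# three bytes in UTF-8 and cannot be encoded as ASCII at all.
japanese = "日本語"
utf8_len = len(japanese.encode("utf-8"))

try:
    japanese.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(same_bytes, utf8_len, ascii_ok)
```

So an override that only ever emits ASCII is still honouring a contract that says "returns UTF-8".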

> I'm not saying that all objects in the library 
> do, but I'm guessing that many people do this.

I would agree that most English-speaking people, and even speakers of 
languages whose special characters are all included in the ASCII set, will 
be doing this.  But it's likely many Japanese people aren't.  (I'm not 
singling Japanese out for any reason other than my own ignorance of 
which languages require more than the ASCII character set.)

> Then it becomes a question of how to interpret the fact that the name is 
> toUtf8.  Is it reasonable for a Japanese developer to assume that because 
> his locale is set differently, all toUtf8 methods should return strings in 
> Japanese, and to do otherwise is a bug?  The name implies that to me, and so 
> that is why I think it's misleading.  It also implies that it is the 
> responsibility of everyone who overrides that method to take care of i18n. 

The good thing about UTF-8 is that it does not depend on locale.  UTF-8 
in one locale is identical to UTF-8 in another.  It therefore does not 
imply any particular language either.

> Not something I wish to do when I'm probably only going to use toUtf8 as a 
> debugging mechanism :)

I can understand that, and that's the beauty of UTF-8, 16, and 32: they 
can all represent any character anyone could possibly want to use, 
independent of locale or language.
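
As a small illustration of that last point (again a Python sketch rather than D, with a sample string I picked arbitrarily), the same text survives a round trip through all three encodings unchanged; only the number of bytes differs:

```python
# Arbitrary sample mixing ASCII, Japanese and a symbol (5 characters).
text = "D言語 ✓"

# Little-endian variants are used so no byte-order mark is added.
encodings = ("utf-8", "utf-16-le", "utf-32-le")

# Encode and decode with each encoding; the text should come back intact.
round_trips = {enc: text.encode(enc).decode(enc) == text for enc in encodings}

# Byte counts differ per encoding even though the characters are the same.
sizes = {enc: len(text.encode(enc)) for enc in encodings}

print(round_trips)
print(sizes)
```

None of these calls consult the locale, which is the property I'm relying on above.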

Regan


