Internationalization vs. Unicode

H. S. Teoh hsteoh at quickfur.ath.cx
Fri Apr 26 15:58:31 PDT 2013


On Fri, Apr 26, 2013 at 06:09:48PM -0400, Tyro[17] wrote:
> There are myriad encoding schemes. D natively supports Unicode and
> provides functionality via Phobos. A byproduct of this is that since
> ASCII is a subset of Unicode, it also natively supports ASCII. This
> is a plus for the language but what of the other encoding schemes?
> What library functionality is provided to manipulate or convert
> between those encoding schemes and Unicode?
> 
> I have a need to convert from CJK encodings (presently EUC-JP and
> Shift-JIS) to Unicode. How do I accomplish this using D/Phobos? Is
> there a standalone library that does this? If so, can someone point
> me to it? If not, is there planned functionality for inclusion in
> Phobos or am I doomed to resort to Java or some other language to
> accomplish this task (or at least until I'm educated enough to do it
> myself)?
[...]

If you're on a POSIX system, you could look into the 'recode' utility
to convert the files from those legacy encodings to Unicode before
running your program on them. You may also be able to figure out how the
conversion works by looking at recode's source code. But AFAIK there is
currently no way to do it in D itself.
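
As a rough sketch of that pre-conversion step: the snippet below uses
'iconv' rather than 'recode', only because iconv is specified by POSIX
and tends to be installed by default. The helper name to_utf8 and the
file names are made up for illustration, and the exact spelling of the
encoding names can vary between iconv implementations ('iconv -l'
usually lists the ones available).

```shell
#!/bin/sh
# Hypothetical helper (not part of any library): convert one file from a
# legacy encoding to UTF-8 by shelling out to iconv.
# Usage: to_utf8 ENCODING INPUT OUTPUT
to_utf8() {
    # iconv exits non-zero if the input contains bytes that are invalid
    # in the source encoding, so the caller can detect a bad conversion.
    iconv -f "$1" -t UTF-8 "$2" > "$3"
}

# Example invocations (file names are placeholders):
#   to_utf8 EUC-JP    notes.euc.txt  notes.utf8.txt
#   to_utf8 SHIFT_JIS notes.sjis.txt notes.utf8.txt
```

Once the files are UTF-8, the D program can read them as ordinary
strings without any extra decoding step.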

Maybe someone should invent std.recode and submit it for inclusion into
Phobos. ;-)


T

-- 
People tell me that I'm paranoid, but they're just out to get me.


More information about the Digitalmars-d-learn mailing list