Unicode handling comparison

H. S. Teoh hsteoh at quickfur.ath.cx
Wed Nov 27 09:40:16 PST 2013


On Wed, Nov 27, 2013 at 06:22:41PM +0100, Jakob Ovrum wrote:
> On Wednesday, 27 November 2013 at 16:15:53 UTC, Wyatt wrote:
> >I don't remember if it was brought up before, but this makes me
> >wonder if something like an i18nString should exist for cases
> >where it IS important.  Making i18n stuff as simple as it looks
> >like it "should" be has merit, IMO.  (Maybe there's even room for
> >a std.string.i18n submodule?)
> >
> >-Wyatt
> 
> What would it do that std.uni doesn't already?
> 
> i18nString sounds like a range of graphemes to me.

Maybe it should be called graphemeString?

I'm not sure what this has to do with i18n, though. Properly done i18n
should use Unicode line-breaking algorithms and other such standardized
functions rather than manipulating graphemes directly. Direct grapheme
manipulation fails to take into account double-width characters,
language-specific decomposition rules, and many other gotchas, and it
performs poorly besides. AFAIK std.uni already provides a way to extract
graphemes when you need them (e.g., for rendering fonts), so there's
really no reason to default to graphemeString everywhere in your
program. *That* is a sign of poorly written code, IMNSHO.
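
To make the "extract graphemes only when you need them" point concrete,
here is a minimal sketch that counts the same string at three levels:
UTF-8 code units, code points, and graphemes. It assumes a Phobos
release that ships std.uni.byGrapheme; the helper name unicodeLengths is
made up for illustration and is not part of Phobos.

```d
import std.range : walkLength;
import std.stdio : writefln;
import std.typecons : tuple;
import std.uni : byGrapheme;

/// Hypothetical helper (not in Phobos): measures a string at three levels.
auto unicodeLengths(string s)
{
    return tuple!("codeUnits", "codePoints", "graphemes")(
        s.length,                 // UTF-8 code units, O(1)
        s.walkLength,             // code points, decodes the string once
        s.byGrapheme.walkLength); // graphemes, segmented only on request
}

void main()
{
    // "noel" with a combining diaeresis (U+0308) following the 'e'.
    auto r = unicodeLengths("noe\u0308l");

    assert(r.codeUnits == 6);  // U+0308 takes two bytes in UTF-8
    assert(r.codePoints == 5); // n, o, e, U+0308, l
    assert(r.graphemes == 4);  // 'e' + U+0308 form a single grapheme

    writefln("%s code units, %s code points, %s graphemes",
             r.codeUnits, r.codePoints, r.graphemes);
}
```

Only the last of the three counts pays for grapheme segmentation, so
code that just slices or searches can stay at the cheaper levels.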


> I would like a convenient function in std.uni to get such a range of
> graphemes from a range of code points, but I wouldn't want to elevate
> it to
> any particular status; that would be a knee-jerk reaction. D's
> granularity when it comes to Unicode is because there is an
> appropriate level of representation for each domain. Shoe-horning
> everything into a range of graphemes is something we should avoid.
> 
> In D, we can write code that is both Unicode-correct and highly
> performant, while still being simple and pleasant to read. To write
> such code, one must have a modicum of understanding of how Unicode
> works (in order to choose the right tools from the toolbox), but I
> think it's a novel compromise.

Agreed.


T

-- 
MASM = Mana Ada Sistem, Man!

