Major performance problem with std.array.front()

Sun Mar 9 04:34:30 PDT 2014

On Sunday, 9 March 2014 at 08:32:09 UTC, monarch_dodra wrote:
> On topic, I think D's implicit default decode to dchar is 
> *infinity* times better than C++'s char-based strings. While 
> imperfect in terms of grapheme, it was still a design decision 
> made of win.
>
> I'd be tempted to not ask "how do we back out", but rather, 
> "how can we take this further"? I'd love to ditch the whole 
> "char"/"dchar" thing altogether, and work with graphemes. But 
> that would be massive involvement.

Why do you think it is better?

Let's be clear here: if you are searching/iterating/comparing by 
code point then your program is either not correct, or no better 
than doing so by code unit. Graphemes don't really fix this 
either.

I think this is the main confusion: the belief that iterating by 
code point has utility.

If you care about normalization then none of by code unit, by 
code point, or by grapheme is correct (except in certain 
language subsets).

If you don't care about normalization then by code unit is just 
as good as by code point, and it has the advantage that you 
don't need to special-case strings everywhere in Phobos.
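
To illustrate the claim for substring search: UTF-8 is self-synchronizing, so a valid needle can only match a valid haystack on code point boundaries, and searching the raw code units gives the same answer as searching decoded code points. A minimal sketch, where the helper names containsByCodeUnit and containsByCodePoint are made up for this example and are not Phobos functions:

```d
import std.algorithm.searching : canFind;
import std.conv : to;
import std.string : representation;

/// Hypothetical helper: substring search on raw UTF-8 code units.
bool containsByCodeUnit(string haystack, string needle)
{
    // representation exposes the immutable(ubyte)[] behind the string,
    // so no decoding happens here.
    return haystack.representation.canFind(needle.representation);
}

/// Hypothetical helper: substring search on decoded code points.
bool containsByCodePoint(string haystack, string needle)
{
    // Converting to dstring decodes every code point up front.
    return haystack.to!dstring.canFind(needle.to!dstring);
}

unittest
{
    string text = "na\u00EFve caf\u00E9";

    // Both strategies agree, for non-ASCII and ASCII needles alike.
    assert(containsByCodeUnit(text, "caf\u00E9"));
    assert(containsByCodePoint(text, "caf\u00E9"));
    assert(!containsByCodeUnit(text, "cafe"));
    assert(!containsByCodePoint(text, "cafe"));
}
```

Compile with -unittest -main to run the checks. The sketch assumes both strings are valid UTF-8; it says nothing about normalization, which neither version handles.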

AFAIK, there is only one exception: code like 
s.all!(c => c == 'é'). But as Vladimir correctly points out: (a) 
by code point, this is still broken in the face of 
normalization, and (b) are there any real applications that 
search a string for a specific non-ASCII character?
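
Point (a) can be sketched as follows. The same visible text "café" can be stored with é as one precomposed code point (U+00E9) or as 'e' followed by a combining acute accent (U+0301), and a code point search for U+00E9 only finds the first form. The helper name containsCodePoint is made up for this example; normalize and NFC come from std.uni:

```d
import std.algorithm.searching : canFind;
import std.uni : normalize, NFC;

/// Hypothetical helper: does `s` contain the code point `c`?
bool containsCodePoint(string s, dchar c)
{
    // Treating a string as a range auto-decodes it to dchar,
    // so this compares code points, not code units.
    return s.canFind(c);
}

unittest
{
    string precomposed = "caf\u00E9";   // é as a single code point
    string decomposed  = "cafe\u0301";  // e + combining acute accent

    assert(containsCodePoint(precomposed, '\u00E9'));

    // Same text on screen, but the code point search misses it.
    assert(!containsCodePoint(decomposed, '\u00E9'));

    // Normalizing to NFC first composes e + U+0301 into U+00E9.
    assert(containsCodePoint(normalize!NFC(decomposed), '\u00E9'));
}
```

Normalizing both sides happens to fix this particular case, but that is an extra step the caller has to know about; decoding to code points does not provide it.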

To those that think the status quo is better, can you give an 
example of a real-life use case that demonstrates this?

I do think it's probably too late to change this, but I think 
there is value in at least getting everyone on the same page.