Major performance problem with std.array.front()
monarch_dodra
monarchdodra at gmail.com
Sun Mar 9 00:32:08 PST 2014
On Saturday, 8 March 2014 at 20:05:36 UTC, Andrei Alexandrescu
wrote:
>> The current approach is a cut above treating strings as arrays
>> of bytes
>> for some languages, and still utterly broken for others. If I'm
>> operating on a right to left language like Hebrew, what would
>> I expect
>> the result to be from something like countUntil?
>
> The entire string processing paraphernalia is left to right. I
> figure RTL languages are under-supported, but
> s.retro.countUntil comes to mind.
>
> Andrei
I'm pretty sure that all string operations are actually "front to
back". If I recall correctly, even languages that are read right
to left are stored front to back: e.g. string[0] would be the
right-most character. It is only a question of display, and
changes nothing in the code. As for "countUntil", it would
still work perfectly fine, as an RTL reader would expect the
counting to start at the "beginning", i.e. the right side.
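A minimal sketch of the storage-order point, assuming a recent
Phobos (countUntil in std.algorithm.searching, front in
std.range.primitives). The Hebrew word is my own illustration and
is written with \u escapes so the logical order stays visible
however a bidi-aware editor would display it:

```d
import std.algorithm.searching : countUntil;
import std.range : retro;
import std.range.primitives : front;

void main()
{
    // "shalom", stored in logical (reading) order:
    // shin, lamed, vav, final mem.
    string s = "\u05E9\u05DC\u05D5\u05DD";
    dchar shin = '\u05E9';
    dchar lamed = '\u05DC';

    // The front of the range is the first letter read, which a
    // renderer draws at the right-hand edge.
    assert(s.front == shin);

    // countUntil counts decoded code points from the front.
    assert(s.countUntil(lamed) == 1);

    // retro walks back to front, so the count is taken from the
    // last letter read instead.
    assert(s.retro.countUntil(lamed) == 2);
}
```

Nothing here depends on display direction: the same calls behave
identically on a left-to-right string.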
I'm pretty confident RTL is 100% supported. The only issue is the
"front"/"left" ambiguity, and the only case I know of is the oddly
named "stripLeft" function, which actually does a "stripFront"
anyway.
So I wouldn't worry about RTL.
But as mentioned, it is languages with complex graphemes, such as
the Indian ones, or languages with accented characters, e.g. most
European ones, that can have problems, such as canFind("cassé",
'e').
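A small sketch of that failure mode, under the assumption that
the problematic case is the decomposed spelling ('e' followed by
U+0301 COMBINING ACUTE ACCENT). hasBareE is a hypothetical helper
written for this illustration, not a Phobos function; it uses
std.uni.byGrapheme to compare whole grapheme clusters:

```d
import std.algorithm.searching : canFind;
import std.range.primitives : walkLength;
import std.uni : byGrapheme;

// Hypothetical helper: true only if some grapheme cluster is
// exactly the single code point 'e'.
bool hasBareE(string s)
{
    foreach (g; s.byGrapheme)
        if (g.length == 1 && g[0] == 'e')
            return true;
    return false;
}

void main()
{
    string precomposed = "cass\u00E9"; // é as one code point
    string decomposed = "casse\u0301"; // 'e' + combining acute

    // Code-point search: the two spellings disagree.
    assert(!precomposed.canFind('e'));
    assert(decomposed.canFind('e')); // matches inside the é

    // Grapheme-level search: both spellings agree.
    assert(!hasBareE(precomposed));
    assert(!hasBareE(decomposed));
    assert(hasBareE("case"));

    // Six code points, but five user-perceived characters.
    assert(decomposed.walkLength == 6);
    assert(decomposed.byGrapheme.walkLength == 5);
}
```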
On topic, I think D's implicit default decode to dchar is
*infinity* times better than C++'s char-based strings. While
imperfect in terms of grapheme, it was still a design decision
made of win.
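For reference, a minimal sketch of what the implicit decode means
in practice, again assuming a recent Phobos where front and
walkLength live in std.range.primitives:

```d
import std.range.primitives : front, walkLength;

void main()
{
    string s = "\u00E9t\u00E9"; // "été": 3 code points, 5 UTF-8 bytes

    // Indexing and .length work on UTF-8 code units (char).
    assert(s.length == 5);
    static assert(is(typeof(s[0]) == immutable(char)));

    // Range primitives decode on the fly and yield dchar.
    auto c = s.front;
    static assert(is(typeof(c) == dchar));
    assert(c == '\u00E9');

    // Walking the range counts code points, not bytes.
    assert(s.walkLength == 3);
}
```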
I'd be tempted not to ask "how do we back out", but rather "how
can we take this further"? I'd love to ditch the whole
"char"/"dchar" thing altogether, and work with graphemes. But
that would be a massive undertaking.