Major performance problem with std.array.front()

Vladimir Panteleev vladimir at thecybershadow.net
Fri Mar 7 09:04:29 PST 2014


On Friday, 7 March 2014 at 16:43:30 UTC, Dicebot wrote:
> On Friday, 7 March 2014 at 16:18:06 UTC, Vladimir Panteleev 
> wrote:
>> Can we look at some example situations that this will break?
>
> Any code that relies on countUntil to count dchars? Or, to 
> generalize, almost any code that uses std.algorithm functions 
> with strings?

This is a pretty fragile design in the first place, since we use 
the same basic type (integers) to count two different things 
(code units / code points). Code that relies on this behavior 
would need to be explicitly tested with Unicode data to be sure 
that it works correctly - otherwise, if it's only tested with 
ASCII, it will merely appear to work.
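
A minimal sketch of how this goes wrong, assuming Phobos's 
auto-decoding behavior for strings. The helper tailFrom is 
hypothetical and only serves the illustration: it takes the index 
returned by countUntil (a code point count) and uses it for 
slicing (which indexes code units).

```d
import std.algorithm : countUntil;

// Hypothetical helper: returns the part of `s` that starts at `needle`.
// Bug: countUntil counts code points for strings (auto-decoding),
// while slicing a string indexes UTF-8 code units.
string tailFrom(string s, string needle)
{
    auto i = s.countUntil(needle);
    return i < 0 ? null : s[i .. $];
}

void main()
{
    // ASCII only: code points and code units coincide, so the bug is hidden.
    assert(tailFrom("hello world", "world") == "world");

    // Non-ASCII: the index is 1 (code points), but the match starts at
    // code unit 3, so the slice begins in the middle of a character.
    assert(tailFrom("日本語", "本語") != "本語");
}
```

A test suite containing only the first assertion would pass and 
give no hint that the second case is broken.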

Correct code that only uses these indices as intermediate values, 
passing them straight back to other range functions, will not be 
affected, e.g.:

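import std.algorithm : countUntil;
import std.range : drop;

auto s = "日本語";
auto x = s.countUntil("本語"); // was 1 (code points), will be 3 (code units)
s = s.drop(x);                 // drop counts in the same unit as countUntil
assert(s == "本語"); // still OK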
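
For comparison, the code unit result can be approximated today by 
searching the raw bytes instead of the string. This is only a 
sketch, not the proposed change itself: codeUnitOffset is a 
hypothetical helper, and std.string.representation is used here 
just to sidestep decoding.

```d
import std.algorithm : countUntil;
import std.string : representation;

// Hypothetical helper: offset of `needle` in `haystack`, measured in
// UTF-8 code units. representation() reinterprets the string as
// immutable(ubyte)[], so no decoding takes place during the search.
ptrdiff_t codeUnitOffset(string haystack, string needle)
{
    return haystack.representation.countUntil(needle.representation);
}

void main()
{
    auto s = "日本語";
    auto x = codeUnitOffset(s, "本語");
    assert(x == 3);              // each of these characters is 3 code units
    assert(s[x .. $] == "本語"); // a code unit index is valid for slicing
}
```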

>> Thinking about dstrings as character arrays is less flawed 
>> only to a certain extent.
>
> Sure. But I find this extent practical enough to make the 
> difference. It is a good compromise between perfectly correct 
> (and very slow) string processing and having your program 
> unusable with anything but the basic Latin symbol set.

I think that if we are to draw a line somewhere on what to 
support and what not to, the decision should not be embedded so 
deeply in the language. Ideally, it would be clearly visible in 
the code that you are counting code points.
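
One way this could look, as a sketch only: the unit is named at 
the call site rather than implied by the type. It assumes the 
std.utf.byCodeUnit and std.utf.byDchar adapters found in later 
Phobos releases; the two helper names are hypothetical.

```d
import std.range : walkLength;
import std.utf : byCodeUnit, byDchar;

// Hypothetical helpers, named for illustration only. Each one states
// in its body which unit is being counted.
size_t countCodeUnits(string s)
{
    return s.byCodeUnit.walkLength; // UTF-8 code units, no decoding
}

size_t countCodePoints(string s)
{
    return s.byDchar.walkLength; // decodes to dchar and counts code points
}

void main()
{
    auto s = "日本語";
    assert(countCodeUnits(s) == 9);  // 3 characters, 3 code units each
    assert(countCodePoints(s) == 3);
    assert(countCodeUnits("abc") == countCodePoints("abc")); // ASCII hides it
}
```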