Major performance problem with std.array.front()

Peter Alexander peter.alexander.au at gmail.com
Sun Mar 9 11:19:50 PDT 2014


On Sunday, 9 March 2014 at 17:48:47 UTC, Andrei Alexandrescu 
wrote:
> On 3/9/14, 10:34 AM, Peter Alexander wrote:
>> If we assume strings are normalized then substring search, 
>> equality
>> testing, sorting all work the same with either code units or 
>> code points.
>
> But others such as edit distance or equal(some_string, 
> some_wstring) will not.

equal(string, wstring) should either not compile or be 
overloaded to do the right thing. In an ideal world, char, wchar, 
and dchar would not be comparable with one another.
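
A rough sketch of why comparing raw code units across encodings 
is meaningless unless an overload decodes first. Python is used 
here only for illustration, and the variable names are made up 
for this example:

```python
s = "héllo"

# 8-bit code units of the UTF-8 encoding.
utf8_units = list(s.encode("utf-8"))

# 16-bit code units of the UTF-16 encoding (little-endian, no BOM).
raw16 = s.encode("utf-16-le")
utf16_units = [int.from_bytes(raw16[i:i + 2], "little")
               for i in range(0, len(raw16), 2)]

# Comparing the code units directly gives the wrong answer,
# because "é" is two units in UTF-8 but one unit in UTF-16.
units_equal = utf8_units == utf16_units

# Decoding both sides to code points first gives the right answer.
points_equal = (s.encode("utf-8").decode("utf-8")
                == raw16.decode("utf-16-le"))

print(units_equal, points_equal)
```

The unit-by-unit comparison fails even though both sequences 
encode the same text, which is the case an overload would have to 
handle by transcoding.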

Edit distance on code points is of questionable utility. Like 
Vladimir says, its meaning is pretty philosophical, even in ASCII 
(is "\r\n" really two "edits"? What is an "edit"?)


>> I can't think of any case where you would want to count 
>> characters.
>
> wc

% echo € | wc -c
4

:-)
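
The same count outside the shell, as a minimal sketch (Python 
only for illustration, assuming the terminal sends UTF-8): 
wc -c counts bytes, so the three UTF-8 bytes of the euro sign plus 
the newline added by echo give 4, whereas there are only 2 code 
points.

```python
# What `echo €` writes: the euro sign plus a trailing newline.
text = "€\n"

# wc -c reports the number of bytes (UTF-8 code units here).
byte_count = len(text.encode("utf-8"))

# len() on a Python str counts code points, not bytes.
code_point_count = len(text)

print(byte_count, code_point_count)
```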


> (Generally: I've always been very very very doubtful about 
> arguments that start with "I can't think of..." because I've 
> historically tried them so many times, and with terrible 
> results.)

Fair point... but it's not as if we would be removing the ability 
(you could always do s.byCodePoint.count); we are talking about 
defaults. The argument that we shouldn't iterate by code unit by 
default because people might want to count code points is without 
substance. Also, with the proposal, string.count(dchar) would 
encode the dchar to a string first for performance, so it would 
still work.
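
A minimal sketch of that last idea, in Python rather than D and 
with a hypothetical helper name (count_by_code_units is not a 
Phobos function): encode the needle once, then search the UTF-8 
code units directly. Because UTF-8 is self-synchronizing, the 
encoded needle can only match at code point boundaries in valid 
UTF-8, so the result agrees with counting by code point.

```python
def count_by_code_units(haystack: bytes, needle: str) -> int:
    """Count occurrences of one code point in UTF-8 encoded data.

    Hypothetical helper for illustration: the needle is encoded
    once, and the search runs over bytes without decoding the
    haystack.
    """
    encoded = needle.encode("utf-8")
    return haystack.count(encoded)


s = "10 € + 5 € = 15 €"

# Count over code units (bytes), after encoding the needle.
unit_count = count_by_code_units(s.encode("utf-8"), "€")

# Count over code points, for comparison.
point_count = s.count("€")

print(unit_count, point_count)
```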

Anyway, I don't think this discussion is going anywhere, so I'll 
agree to disagree and retire.

