Major performance problem with std.array.front()
Sarath Kodali
sarath at dummy.com
Fri Mar 7 14:35:46 PST 2014
On Friday, 7 March 2014 at 20:43:45 UTC, Vladimir Panteleev wrote:
> On Friday, 7 March 2014 at 19:57:38 UTC, Andrei Alexandrescu
> wrote:
>> Allow me to enumerate the functions of std.algorithm and how
>> they work today and how they'd work with the proposed change.
>> Let s be a variable of some string type.
>
>> s.canFind('é') currently works as expected.
>
> No, it doesn't.
>
> import std.algorithm;
>
> void main()
> {
> auto s = "cassé";
> assert(s.canFind('é'));
> }
>
> That's the whole problem - all this hot steam and it still does
> not work properly. Because it can't - not without pulling in
> all of the Unicode algorithms implicitly, and that would be
> much worse.
>
>> I went down std.algorithm in the order listed in its
>> documentation and found pernicious issues with almost every
>> single algorithm.
>
> All of your examples are variations of one and the same case:
> searching for a non-ASCII dchar or dchar literal.
>
> How often does this pattern occur in real programs? I think the
> only real metric is to try the change and find out.
>
>> Clearly one might argue that their app has no business dealing
>> with diacriticals or Asian characters. But that's the typical
>> provincial view that marred many languages' approach to UTF
>> and internationalization.
>
> So is yours, if you think that making everything magically a
> dchar is going to solve all problems.
>
> The TDPL example only showcases the problem. Yes, it works with
> Swedish. Now try it again with Sanskrit.
+1
In Indian languages, a character can consist of one or more Unicode
code points. For example, in Sanskrit "ddhrya"
http://en.wikipedia.org/wiki/File:JanaSanskritSans_ddhrya.svg
consists of 7 Unicode code points. So to search for this character
I have to use a string search; no single dchar can represent it.
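
A minimal sketch of the point, assuming the usual Devanagari encoding of
"ddhrya" (da, virama, dha, virama, ra, virama, ya). The variable names and
the second sample word are made up for illustration; only canFind and
walkLength come from Phobos.

```d
import std.algorithm : canFind;
import std.range : walkLength;

void main()
{
    // "ddhrya" written out as its 7 code points:
    // da, virama, dha, virama, ra, virama, ya.
    string ddhrya = "\u0926\u094D\u0927\u094D\u0930\u094D\u092F";

    // Decoded as a range of dchar, the single written character has 7 elements.
    assert(ddhrya.walkLength == 7);
    // As UTF-8 it occupies 21 code units (3 bytes per code point here).
    assert(ddhrya.length == 21);

    // Illustrative second word (an assumption for this sketch): da, aa-sign, na.
    string other = "\u0926\u093E\u0928";

    // Searching for one dchar only finds the first code point, so it
    // matches both words and cannot tell the conjunct apart.
    dchar da = '\u0926';
    assert(ddhrya.canFind(da));
    assert(other.canFind(da));

    // Substring search over the full code point sequence does distinguish them.
    string text = other ~ " " ~ ddhrya;
    assert(text.canFind(ddhrya));
    assert(!other.canFind(ddhrya));
}
```

So decoding to dchar does not help here: the search has to be a substring
search whether the string is iterated by code unit or by code point.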
- Sarath