Major performance problem with std.array.front()

Mon Mar 10 03:52:01 PDT 2014

I'm not sure I understand the point of this (long) thread.
Is the main problem that decode() is called even when it isn't needed?
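
To check that I'm reading it right, here is a minimal sketch of the behaviour I think is being discussed. It assumes a Phobos where narrow strings are auto-decoded by the range primitives. The sample string and the use of std.string.representation to get at the raw code units are my own choices for illustration, not something proposed in the thread.

```d
import std.range : ElementType, front, walkLength;
import std.string : representation;

void main()
{
    // "été": two 2-byte code points plus one ASCII letter = 5 UTF-8 code units.
    string s = "\u00E9t\u00E9";

    // Through the range primitives, a string is a range of dchar:
    // front() decodes a whole code point on every call.
    static assert(is(ElementType!string == dchar));
    static assert(is(typeof(s.front) == dchar));
    assert(s.front == '\u00E9');

    // Plain indexing and .length still work on raw code units.
    assert(s[0] == 0xC3);
    assert(s.length == 5);

    // walkLength goes through front/popFront, so it counts code points.
    assert(s.walkLength == 3);

    // Viewing the same memory as immutable(ubyte)[] sidesteps the decoding.
    auto raw = s.representation;
    static assert(is(ElementType!(typeof(raw)) == immutable(ubyte)));
    assert(raw.front == 0xC3);
}
```

So any generic algorithm written against front/popFront pays for the decode unless the caller opts out by hand, as with representation above.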

Well, in that case it's not a problem only for strings. I ran into
the same problem when writing other ranges, for example when
reading binary data from a db stream: front represents a single
row, and I decode it on every call, even when it isn't needed.
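
A rough sketch of what I mean, with made-up names (Row, decodeRow, LazyRows and CachedRows are all hypothetical, and the "decoding" is a stand-in for real work). The first range decodes on every call to front. The second caches the decoded row until popFront, which is one possible workaround but still decodes rows that nobody looks at in decoded form.

```d
import std.range : isInputRange;

// Hypothetical decoded row; stands in for whatever the db layer produces.
struct Row { int id; }

// Counter so the number of decode calls is visible.
int decodeCalls;

// Hypothetical decoder: here it just reads the first byte.
Row decodeRow(const(ubyte)[] raw)
{
    ++decodeCalls;
    return Row(raw[0]);
}

// Decodes every time front is called.
struct LazyRows
{
    const(ubyte[])[] rawRows;

    @property bool empty() const { return rawRows.length == 0; }
    @property Row front() { return decodeRow(rawRows[0]); }
    void popFront() { rawRows = rawRows[1 .. $]; }
}

// Decodes at most once per row and reuses the result until popFront.
struct CachedRows
{
    const(ubyte[])[] rawRows;
    private Row cached;
    private bool decoded;

    @property bool empty() const { return rawRows.length == 0; }
    @property Row front()
    {
        if (!decoded)
        {
            cached = decodeRow(rawRows[0]);
            decoded = true;
        }
        return cached;
    }
    void popFront() { rawRows = rawRows[1 .. $]; decoded = false; }
}

static assert(isInputRange!LazyRows);
static assert(isInputRange!CachedRows);

void main()
{
    ubyte[][] data = [[cast(ubyte) 1], [cast(ubyte) 2], [cast(ubyte) 3]];

    // Two front calls per row: the uncached range decodes 6 times.
    decodeCalls = 0;
    for (auto r = LazyRows(data); !r.empty; r.popFront())
    {
        auto a = r.front;
        auto b = r.front;
        assert(a.id == b.id);
    }
    assert(decodeCalls == 6);

    // Same access pattern: the cached range decodes 3 times.
    decodeCalls = 0;
    for (auto r = CachedRows(data); !r.empty; r.popFront())
    {
        auto a = r.front;
        auto b = r.front;
        assert(a.id == b.id);
    }
    assert(decodeCalls == 3);
}
```

Caching only hides the repeated cost. An algorithm that just moves rows around, like copy, never needs the decoded form at all, which I think is the same complaint as for strings.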

On Friday, 7 March 2014 at 02:37:11 UTC, Walter Bright wrote:
> In "Lots of low hanging fruit in Phobos" the issue came up 
> about the automatic encoding and decoding of char ranges.
>
> Throughout D's history, there are regular and repeated 
> proposals to redesign D's view of char[] to pretend it is not 
> UTF-8, but UTF-32. I.e. so D will automatically generate code 
> to decode and encode on every attempt to index char[].
>
> I have strongly objected to these proposals on the grounds that:
>
> 1. It is a MAJOR performance problem to do this.
>
> 2. Very, very few manipulations of strings ever actually need 
> decoded values.
>
> 3. D is a systems/native programming language, and 
> systems/native programming languages must not hide the 
> underlying representation (I make similar arguments about 
> proposals to make ints issue errors on overflow, etc.).
>
> 4. Users should choose when decode/encode happens, not the 
> language.
>
> and I have been successful at heading these off. But one 
> slipped by me. See this in std.array:
>
>   @property dchar front(T)(T[] a) @safe pure if (isNarrowString!(T[]))
>   {
>     assert(a.length, "Attempting to fetch the front of an empty array of " ~
>            T.stringof);
>     size_t i = 0;
>     return decode(a, i);
>   }
>
> What that means is that if I implement an algorithm that 
> accepts, as input, an InputRange of char's, it will ALWAYS try 
> to decode it. This means that even:
>
>    from.copy(to)
>
> will decode 'from', and then re-encode it for 'to'. And it will 
> do it SILENTLY. The user won't notice, and he'll just assume 
> that D performance sux. Even if he does notice, his options to 
> make his code run faster are poor.
>
> If the user wants decoding, it should be explicit, as in:
>
>     from.decode.copy(encode!to)
>
> The USER should decide where and when the decoding goes. 
> 'decode' should be just another algorithm.
>
> (Yes, I know that std.algorithm.copy() has some specializations 
> to take care of this. But these specializations would have to 
> be written for EVERY algorithm, which is thoroughly 
> unreasonable. Furthermore, copy()'s specializations only apply 
> if BOTH source and destination are arrays. If just one is, the 
> decode/encode penalty applies.)
>
> Is there any hope of fixing this?