Major performance problem with std.array.front()
Andrea Fontana
nospam at example.com
Mon Mar 10 03:52:01 PDT 2014
I'm not sure I understand the point of this (long) thread.
Is the main problem that decode() is called even when it isn't needed?
If so, that's not a problem only for strings. I've run into the
same thing when writing other ranges. For example, when I read
binary data from a db stream, front represents a single row, and
I decode it on every call even when that isn't needed.
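
One way to avoid the repeated work is to decode lazily and cache the result. This is only a minimal sketch: RowRange, Row, decodeCount and the one-byte "decode" are hypothetical names made up for illustration, not code from any real db library.

```d
// Hypothetical decoded row; a real one would hold the parsed columns.
struct Row { int id; }

struct RowRange
{
    private ubyte[][] raw;      // undecoded rows, as read from the stream
    private Row cached;         // last decoded row
    private bool haveCached;    // true if 'cached' matches raw[0]
    size_t decodeCount;         // how many times decoding actually ran

    this(ubyte[][] raw) { this.raw = raw; }

    @property bool empty() const { return raw.length == 0; }

    // Decode on first access only; later calls reuse the cached value.
    @property Row front()
    {
        assert(!empty, "front of an empty RowRange");
        if (!haveCached)
        {
            cached = Row(raw[0][0]);   // stand-in for an expensive decode
            haveCached = true;
            ++decodeCount;
        }
        return cached;
    }

    // Skipping a row never decodes it.
    void popFront()
    {
        assert(!empty, "popFront on an empty RowRange");
        raw = raw[1 .. $];
        haveCached = false;
    }
}

void main()
{
    ubyte[][] data = [[1], [2], [3]];
    auto r = RowRange(data);
    r.popFront();          // skip the first row without decoding it
    auto a = r.front;      // decodes the second row
    auto b = r.front;      // served from the cache
    assert(a.id == 2 && b.id == 2);
    assert(r.decodeCount == 1);
}
```

The cost is one extra flag and a cached value per range, and a caller that only skips rows pays nothing for decoding.
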
On Friday, 7 March 2014 at 02:37:11 UTC, Walter Bright wrote:
> In "Lots of low hanging fruit in Phobos" the issue came up
> about the automatic encoding and decoding of char ranges.
>
> Throughout D's history, there are regular and repeated
> proposals to redesign D's view of char[] to pretend it is not
> UTF-8, but UTF-32. I.e. so D will automatically generate code
> to decode and encode on every attempt to index char[].
>
> I have strongly objected to these proposals on the grounds that:
>
> 1. It is a MAJOR performance problem to do this.
>
> 2. Very, very few manipulations of strings ever actually need
> decoded values.
>
> 3. D is a systems/native programming language, and
> systems/native programming languages must not hide the
> underlying representation (I make similar arguments about
> proposals to make ints issue errors on overflow, etc.).
>
> 4. Users should choose when decode/encode happens, not the
> language.
>
> and I have been successful at heading these off. But one
> slipped by me. See this in std.array:
>
>     @property dchar front(T)(T[] a) @safe pure
>         if (isNarrowString!(T[]))
>     {
>         assert(a.length, "Attempting to fetch the front of an empty array of " ~
>                T.stringof);
>         size_t i = 0;
>         return decode(a, i);
>     }
>
> What that means is that if I implement an algorithm that
> accepts, as input, an InputRange of char's, it will ALWAYS try
> to decode it. This means that even:
>
> from.copy(to)
>
> will decode 'from', and then re-encode it for 'to'. And it will
> do it SILENTLY. The user won't notice, and he'll just assume
> that D performance sux. Even if he does notice, his options to
> make his code run faster are poor.
>
> If the user wants decoding, it should be explicit, as in:
>
> from.decode.copy(encode!to)
>
> The USER should decide where and when the decoding goes.
> 'decode' should be just another algorithm.
>
> (Yes, I know that std.algorithm.copy() has some specializations
> to take care of this. But these specializations would have to
> be written for EVERY algorithm, which is thoroughly
> unreasonable. Furthermore, copy()'s specializations only apply
> if BOTH source and destination are arrays. If just one is, the
> decode/encode penalty applies.)
>
> Is there any hope of fixing this?
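
The behaviour described in the quoted message can be checked directly. The sketch below assumes a Phobos version in which front on narrow strings still auto-decodes. It uses std.string.representation to reach the raw code units; that is one existing way to opt out of decoding, not something proposed in the thread.

```d
import std.range;                     // front, ElementType, walkLength
import std.string : representation;

void main()
{
    string s = "h\u00E9llo";          // U+00E9 takes two UTF-8 code units

    // front on a narrow string decodes, so the element type is dchar.
    static assert(is(typeof(s.front) == dchar));
    static assert(is(ElementType!string == dchar));
    assert(s.front == 'h');

    // The raw code units remain reachable without any decoding.
    auto bytes = s.representation;    // immutable(ubyte)[]
    static assert(is(ElementType!(typeof(bytes)) == immutable(ubyte)));

    assert(s.length == 6);            // counts code units
    assert(s.walkLength == 5);        // counts decoded code points
    assert(bytes.walkLength == 6);    // no decoding on the ubyte view
}
```

Any generic algorithm handed s sees a range of dchar, while the same algorithm handed bytes works on code units only.
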