Proposal: takeFront and takeBack

deadalnix deadalnix at gmail.com
Thu Jul 5 04:59:57 PDT 2012


Le 05/07/2012 01:06, Jonathan M Davis a écrit :
> Another option would be to create a wrapper range for strings to be used when
> they already have to be wrapped in another range. Functions which want to make
> string processing more efficient can already specialize on strings (and often
> do), so they can deal with the consumeFront issue directly if they need to
> (though their code might be a bit more concise with it in some cases). The
> problem is when a string is wrapped in another range (e.g the result of map or
> filter), you can't reasonably specialize on them anymore, and you're forced to
> deal with the inefficiency of using both front and popFront.
>
> So, if we created a wrapper range whose front caches the index of the next
> code unit for a subsequent call to popFront, and functions such as filter and
> map wrapped strings in that before wrapping them in their own range types,
> then the wrapped calls to popFront will be more efficient (at least in the case
> when front was called - the call to popFront would be slightly less efficient in
> the cases where front wasn't used). So, we could have something like this:
>
> struct StringCache(C)
>      if(is(Unqual!C == char) || is(Unqual!C == wchar))
> {
> public:
>
>      this(C[] str)
>      {
>          _str = str;
>      }
>
>      @property bool empty() { return _str.empty; }
>
>      @property dchar front()
>      {
>          import std.utf;
>          _nextIndex = 0;
>          return decode(_str, _nextIndex);
>      }
>
>      void popFront()
>      {
>          if(_nextIndex)
>          {
>              _str = _str[_nextIndex .. $];
>              _nextIndex = 0;
>          }
>          else
>              _str.popFront();
>      }
>
> private:
>
>      C[] _str;
>      size_t _nextIndex = 0;
> }
>
> auto stringCache(C)(C[] str)
>      if(isSomeChar!C)
> {
>      static if(is(Unqual!C == dchar))
>          return str;
>      else
>          return StringCache!C(str);
> }
>
> Obviously, it would need the range functions for forward range and
> bidirectional range added, but that gives the basic idea. And of course a name
> other than StringCache could be used (it's not great, but it's the best that I
> could come up with in the few minutes that I thought about it and the name
> isn't really the main issue here - the design is).
>
> And the performance in comparison to consumeFront is interesting. Without any
> optimizations turned on (not even -release), consumeFront takes about 72 - 79%
> as long as front and popFront separately, whereas StringCache takes about 107
> - 120% as long as front and popFront (so it's actually slower without any
> optimizations, unlike consumeFront). But with all optimizations on (-release -
> O -inline), consumeFront takes about 66 - 80% as long, and StringCache takes
> 69 - 75% as long. So, with optimizations, StringCache is roughly the same as
> consumeFront. The times are rough and the various runs of the same test vary a
> bit, so clearly stuff like the CPU cache and context switching are having an
> impact, making exact comparisons difficult, but I think that StringCache is fast
> enough to be comparable to consumeFront in speed, and it's a less invasive
> change to start using StringCache than it would be to use hasConsume and
> consumeFront. And actually, if stringCache were changed to operate on _any_
> range and then just wrap arrays of char[] and wchar[], then it would be even
> _less_ invasive, because it wouldn't even require special casing on the part
> of the caller. For instance, filter's return statement could go from
>
> return FilteredRange(rs);
>
> to
>
> return FilteredRange(stringCache(rs));
>
> and suddenly you get the optimization for anything using filter (though such
> usage definitely seems to indicate that a better name than stringCache should
> be found).
>
> Clearly we need to look at improving the performance of front and popFront as
> much as possible (and the StringCache implementation would benefit from those
> as well), but given the disgreements over consumeFront, it would seem that
> StringCache would be a better solution (certainly, it would directly impact
> less code), and unless front and popFront can be optimized to the point that
> StringCache adds no real measurable benefit (which I doubt will happen), I
> think that it would be worthwhile to add StringCache and have functions like
> filter and map start using it.
>
> I guess that the first thing that we need to do though is look at what can be
> done to improve the performance of front and popFront - the first thing
> probably being optimizations relating to the fact that they're always
> operating from index 0.
>
> Still, what do you folks think of this StringCache idea? Is it more acceptable
> than consumeFront, or do some of you still think that it's not worth having?
>
> - Jonathan M Davis

This is a way better idea IMO.


More information about the Digitalmars-d mailing list