RFC on range design for D2

Thu Jan 8 08:32:27 PST 2009

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s article
> Benji Smith wrote:
> > bearophile wrote:
> >> 5) The source code of the current algorithm module of D2 is already
> >> very complex to follow, it smells of over-generalization here and
> >> there. Sometimes it's better to reduce the generality of things, even
> >> if that reduces their power a little, to reduce complexity, etc. Tango
> >> code too isn't perfect, but it often looks more human. While you have
> >> created the algorithm module I too have created something similar, but
> >> based on different grounds.
> >
> > Along these same lines, while D is still young, the documentation is
> > often thin, and code examples are scarce.
> >
> > I've been doing a lot of programming lately with Tango, and despite the
> > growing documentation, I've had to refer directly to the Tango source
> > code on more occasions than I can count. Luckily, the Tango sources are
> > well-written and pretty easy to read and understand, and I've had very
> > few problems figuring out how to use the library.
> >
> > I hope the range implementation makes readability a high priority.
> Speaking of examples and readability, and to tie this with the
> discussion on array reallocation, I was curious on an array appender
> built on the output range interface. I've seen a quite complicated array
> builder in digitalmars.d. I wanted a simpler appender that should not do
> worse than the built-in ~= and that works with algorithm2 whenever data
> is written out.
> It turned out quite simple and it improved performance of a large-scale
> data preprocessing task of mine (which involved reading and parsing
> about 1M lines of integers) by 15%. I'd be curious how it fares with
> other tests that you guys may have.
> The idea is very simple: just use D's native append operation, but cache
> the capacity to avoid too many lookups (I understand that that's the
> bottleneck).
> I paste the code below, I'd be indebted if you guys grabbed it and
> tested it.
> Andrei
> struct ArrayAppender(T)
> {
>      private T* _buffer;
>      private size_t _length;
>      private size_t _capacity;
>      this(T[] array)
>      {
>          _buffer = array.ptr;
>          _length = array.length;
>          if (_buffer) _capacity = .capacity(array.ptr) / T.sizeof;
>      }
>      size_t capacity() const { return _capacity; }
>      void putNext(T item)
>      {
>          invariant desiredLength = _length + 1;
>          if (desiredLength <= _capacity)
>          {
>              // Should do in-place construction here
>              _buffer[_length] = item;
>          }
>          else
>          {
>              // Time to reallocate, do it and cache capacity
>              auto tmp = _buffer[0 .. _length];
>              tmp ~= item;
>              _buffer = tmp.ptr;
>              _capacity = .capacity(_buffer) / T.sizeof;
>          }
>          _length = desiredLength;
>      }
>      T[] release()
>      {
>          // Shrink-to-fit
>          auto result = cast(T[]) realloc(_buffer, _length * T.sizeof);
>          // Relinquish state
>          _buffer = null;
>          _length = _capacity = 0;
>          return result;
>      }
> }
> unittest
> {
>      auto app = arrayAppender(new char[0]);
>      string b = "abcdefg";
>      foreach (char c; b) app.putNext(c);
>      assert(app.release == "abcdefg");
> }

One definite problem that I've just realized is that there's no putNext(T[]).
What if you need to append another array to your ArrayAppender, not just a single
element?