More fun with autodecoding

Steven Schveighoffer schveiguy at gmail.com
Mon Aug 6 13:57:10 UTC 2018


I wanted to share a story where I actually tried to add a new type with 
autodecoding and failed.

I want to create a wrapper type that forwards an underlying range type 
but adds one feature -- tracking where you are in the original range. 
This is in a new library I'm writing for parsing.

So my first idea was I will just forward all methods from a given range 
manually -- I need to override certain ones which affect the offset into 
the original range.

However, typically parsing is done from text.

I realized that strings are ranges of dchar, but I need length and 
other things forwarded so the wrapper can be a drop-in replacement for 
strings (I treat strings and wstrings as character buffers in iopipe). 
However, Phobos will then assume length is the number of dchar elements, 
and assume the type has indexing, etc.! Here is a case where I can't 
repeat Phobos's auto-decoding mistake for my own type! I never thought 
I'd have that problem...
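
To make the mismatch concrete, here is a minimal sketch of how Phobos's range traits see a plain string. It only uses traits from std.range.primitives; the sample string is an arbitrary illustration.

```d
import std.range.primitives : ElementType, hasLength, isRandomAccessRange,
    walkLength;

// As a range, a string is a forward range of dchar, not an array of char.
static assert(is(ElementType!string == dchar));
static assert(!hasLength!string);
static assert(!isRandomAccessRange!string);

unittest
{
    string s = "h\u00E9llo";     // U+00E9 takes two UTF-8 code units
    assert(s.length == 6);       // array length counts code units
    assert(s.walkLength == 5);   // range iteration counts code points
}
```

A wrapper that exposes length and indexing over the same data would be treated as a random-access range of whatever its front returns, which is why it cannot simply copy the string behavior.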

So I thought, maybe I'll just alias this the underlying range and only 
override the parts that are needed. I end up with a nice tiny 
definition, and things are looking pretty good:

     static struct Result
     {
         private size_t pos;
         B _buffer;
         alias _buffer this;

         // implement the slice operations
         size_t[2] opSlice(size_t dim)(int start, int end) if (dim == 0)
         in
         { assert(start >= 0 && end <= _buffer.length); }
         do
         {
             return [start, end];
         }

         Result opIndex(size_t[2] dims)
         {
             return Result(pos + dims[0], _buffer[dims[0] .. dims[1]]);
         }

         void popFront()
         {
             import std.traits : isNarrowString;
             static if(isNarrowString!B)
             {
                 auto prevLen = _buffer.length;
                 _buffer.popFront;
                 pos += prevLen - _buffer.length;
             }
             else
             {
                 _buffer.popFront;
                 ++pos;
             }
         }

         // the specialized buffer reference accessor.
         @property auto bufRef()
         {
             return BufRef(pos, _buffer.length);
         }
     }

Note already the sucky part in popFront.
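
As an illustration of why the narrow-string branch has to measure the length before and after, here is a small sketch. The helper name popAndCount is hypothetical and not part of the library.

```d
import std.range.primitives : popFront;

// Hypothetical helper: pops one range element and reports how many
// code units that removed from the underlying array.
size_t popAndCount(ref string s)
{
    immutable before = s.length;
    s.popFront();                // autodecoding removes a whole code point
    return before - s.length;
}

unittest
{
    string s = "\u00E9a";        // U+00E9 (2 code units), then 'a' (1)
    assert(popAndCount(s) == 2);
    assert(popAndCount(s) == 1);
    assert(s.length == 0);
}
```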

But then I got a surprise when I went to use it:

     import std.algorithm : splitter;
     auto buf = "hi there this is a sentence";
     auto split1 = buf.bwin.splitter; // specialized split range
     auto split2 = buf.splitter; // normal split range
     while(!split1.empty)
     {
         assert(split1.front == split2.front);
         assert(split1.front.bufRef.concrete(buf) == split2.front); // FAILS!
         split1.popFront;
         split2.popFront;
     }

What happened? It turns out that splitter looks for length and indexing 
*OR* for a narrow string. To keep its performance, splitter tries to 
sidestep the autodecoding that Phobos forces on char arrays. 
With this taken into account, I think my type does not pass any of the 
constraints for any of the overloads (not 100% sure on that), so it 
devolves to just using the alias this'd element directly, completely 
circumventing the point of my wrapper. The error I get is "no member 
`bufRef` for type `string`".
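
The effect can be reproduced in miniature. This is only a sketch under assumptions: Tracked is a hypothetical stand-in for the wrapper above, and firstWord is a hypothetical function with an array-shaped parameter, standing in for whichever overload ends up matching. It is not the actual splitter code.

```d
import std.traits : isSomeChar;

// Hypothetical stand-in for the wrapper: a string plus a position.
struct Tracked
{
    string _buffer;
    size_t pos;
    alias _buffer this;
}

// Hypothetical algorithm that only accepts character arrays.
C[] firstWord(C)(C[] s) if (isSomeChar!C)
{
    size_t n = 0;
    while (n < s.length && s[n] != ' ')
        ++n;
    return s[0 .. n];
}

unittest
{
    auto w = firstWord(Tracked("hi there", 0));
    // The argument was converted through alias this to match C[],
    // so the result is a plain string with no position attached.
    static assert(is(typeof(w) == string));
    assert(w == "hi");
}
```

Once the conversion happens at the call site, nothing downstream ever sees the wrapper type, which matches the "no member `bufRef`" error.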

My next attempt will be to use byCodeUnit when I detect a narrow string, 
which hopefully will work OK. But I'm not sure if the performance is 
going to be the same, since now it will likely FORCE autodecoding on the 
algorithms that have specialized versions to AVOID autodecoding (I think).
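
A rough sketch of what the byCodeUnit idea could look like follows. The helper name asCodeUnits is hypothetical, and this says nothing about which algorithm overloads would be picked for the result.

```d
import std.range.primitives : hasLength, isRandomAccessRange;
import std.traits : isNarrowString;
import std.utf : byCodeUnit;

// Hypothetical helper: expose narrow strings as a range of code units
// and pass every other buffer type through unchanged.
auto asCodeUnits(B)(B buf)
{
    static if (isNarrowString!B)
        return buf.byCodeUnit;
    else
        return buf;
}

unittest
{
    auto r = asCodeUnits("h\u00E9llo");
    static assert(hasLength!(typeof(r)));
    static assert(isRandomAccessRange!(typeof(r)));
    assert(r.length == 6);       // code units, no decoding
    r.popFront();
    r.popFront();                // removes only half of U+00E9
    assert(r.length == 4);
    assert(asCodeUnits([1, 2, 3]).length == 3);
}
```

With this shape, popFront always advances by exactly one code unit, so the wrapper's position can be tracked with a plain increment.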

I'm very tempted to start writing my own parsing utilities and avoid 
using Phobos algorithms...

-Steve

