Creeping Bloat in Phobos

Walter Bright via Digitalmars-d digitalmars-d at puremagic.com
Sun Sep 28 16:21:14 PDT 2014


On 9/28/2014 1:39 PM, H. S. Teoh via Digitalmars-d wrote:
>> It can work just fine, and I wrote it. The problem is convincing
>> someone to pull it :-( as the PR was closed and reopened with
>> autodecoding put back in.
>
> The problem with pulling such PRs is that they introduce a dichotomy
> into Phobos. Some functions autodecode, some don't, and from a user's
> POV, it's completely arbitrary and random. Which leads to bugs because
> people can't possibly remember exactly which functions autodecode and
> which don't.

That's ALREADY the case, as I explained to bearophile.
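
A minimal sketch of the inconsistency that already exists for a built-in string, assuming a Phobos where `std.range` exposes `front` and `walkLength`. The helper names (`codeUnits`, `codePoints`, `foreachCount`) are hypothetical and only serve the illustration:

```d
import std.range : front, walkLength;

// Built-in array property: counts UTF-8 code units, no decoding.
size_t codeUnits(string s) { return s.length; }

// Range primitives decode narrow strings, so this counts code points.
size_t codePoints(string s) { return s.walkLength; }

// A plain foreach over a string infers char, so it does not decode.
size_t foreachCount(string s)
{
    size_t n;
    foreach (c; s) ++n;
    return n;
}

// The range primitive .front yields dchar, while indexing yields a char.
static assert(is(typeof("".front) == dchar));
static assert(is(typeof(""[0]) == immutable(char)));
```

For "h\u00E9llo", where the second character takes two code units, `codeUnits` and `foreachCount` both return 6 while `codePoints` returns 5.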

The solution is not to have the ranges autodecode, but to have the ALGORITHMS 
decide to autodecode (if they need it) or not (if they don't).


>> As I've explained many times, very few string algorithms actually need
>> decoding at all. 'find', for example, does not. Trying to make a
>> separate universe out of autodecoding algorithms is missing the point.
> [...]
>
> Maybe what we need to do, is to change the implementation of
> std.algorithm so that it internally uses byCodeUnit for narrow strings
> where appropriate. We're already special-casing Phobos code for narrow
> strings anyway, so it wouldn't make things worse by making those special
> cases not autodecode.

Those special cases wind up going everywhere and impacting everyone who attempts 
to write generic algorithms.


> This doesn't quite solve the issue of composing ranges, since a
> composed range that returns dchar in .front, when composed with another
> range, will have autodecoding built into it. For those cases, perhaps one way to
> hack around the present situation is to use Phobos-private enums in the
> wrapper ranges (e.g., enum isNarrowStringUnderneath=true; in struct
> Filter or something), that ranges downstream can test for, and do the
> appropriate bypasses.

More complexity :-( for what should be simple tasks.


> (BTW, before you pick on specific algorithms you might want to actually
> look at the code for things like find(), because I remember there were a
> couple o' PRs where find() of narrow strings will use (presumably) fast
> functions like strstr or strchr, bypassing a foreach loop over an
> autodecoding .front.)

Oh, I know that many algorithms have such specializations. Doesn't it strike you 
as sucky to have to special case a whole basket of algorithms when the 
InputRange does not behave in a reliable manner?

It's very simple for an algorithm to decode if it needs to: it just adds a
.byDchar adapter to its input range. Done. No special casing needed. The lines
of code written drop by half. And it works with arrays of chars, arrays of
dchars, and input ranges of either.
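
A minimal sketch of that approach, assuming a Phobos that provides `std.utf.byCodeUnit` and `std.utf.byDchar`. The functions `countNonAscii` and `countSpaces` are hypothetical examples, not Phobos functions; the point is that each algorithm states for itself whether it needs code points:

```d
import std.range : isInputRange;
import std.utf : byCodeUnit, byDchar;

// Needs code points, so the algorithm opts in to decoding via byDchar.
size_t countNonAscii(R)(R r)
    if (isInputRange!R)
{
    size_t n;
    foreach (c; r.byDchar)      // c is a dchar for any character range
        if (c > 0x7F) ++n;
    return n;
}

// Needs no decoding: an ASCII code unit never appears inside a
// multi-unit UTF-8 or UTF-16 sequence, so scanning code units is enough.
size_t countSpaces(R)(R r)
    if (isInputRange!R)
{
    size_t n;
    foreach (c; r.byCodeUnit)   // no autodecoding for narrow strings
        if (c == ' ') ++n;
    return n;
}
```

The same two functions accept a `string`, a `dstring`, or a range that has already been wrapped with `byCodeUnit`, without a separate narrow-string special case in either one.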

---

The stalling of setExt() has basically halted my attempts to adjust Phobos so 
that one can write nothrow and @nogc algorithms that work on strings.

