The Case Against Autodecode

H. S. Teoh via Digitalmars-d digitalmars-d at puremagic.com
Thu May 12 14:14:25 PDT 2016


On Thu, May 12, 2016 at 08:24:23PM +0000, Vladimir Panteleev via Digitalmars-d wrote:
> On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote:
[...]
> >1. Ranges of characters do not autodecode, but arrays of characters
> >do.  This is a glaring inconsistency.
> >
> >2. Every time one wants an algorithm to work with both strings and
> >ranges, you wind up special casing the strings to defeat the
> >autodecoding, or to decode the ranges. Having to constantly special
> >case it makes for more special cases when plugging together
> >components. These issues often escape detection when unittesting
> >because it is convenient to unittest only with arrays.

Example of string special-casing leading to bugs:

	https://issues.dlang.org/show_bug.cgi?id=15972

This particular issue highlights the problem quite well: one would hardly
expect '#'.repeat(i) to return anything but a range of char. After all,
how could a single char need to be "auto-decoded" to a dchar?
Unfortunately, due to Phobos algorithms assuming autodecoding, the
resulting range of char is not recognized as "string-like" data by
.joiner, thus causing a compile error.
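
To make the mismatch concrete, here is a minimal sketch of how the same
char data ends up with a different element type depending on whether it
arrives as an array or as some other range. It assumes a Phobos release
in which autodecoding is still active; the sample string is just an
illustrative value, not something taken from the bug report.

```d
import std.range : ElementType, repeat, walkLength;
import std.traits : Unqual;
import std.utf : byCodeUnit;

void main()
{
    // An array of char autodecodes: Phobos presents it as a range of dchar.
    static assert(is(ElementType!string == dchar));

    // A non-array range of char keeps char as its element type.
    static assert(is(Unqual!(ElementType!(typeof('#'.repeat(3)))) == char));

    // byCodeUnit wraps the array so that it iterates by code unit instead.
    static assert(is(ElementType!(typeof("abc".byCodeUnit)) == immutable(char)));

    // Same data, different lengths depending on which view is used.
    string s = "\u00e9";                   // sample value: two UTF-8 code units
    assert(s.length == 2);                 // array length counts code units
    assert(s.walkLength == 1);             // range traversal autodecodes
    assert(s.byCodeUnit.walkLength == 2);  // code units again
}
```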

The workaround (as described in the bug comments) also illustrates the
inconsistency in handling ranges of char vs. ranges of dchar: writing
.joiner("\n".byCodeUnit) will actually fix the problem, basically by
explicitly disabling autodecoding.
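
For reference, a sketch of that workaround. The iota/map pipeline and
the expected text are my own illustration, loosely modeled on the bug
report rather than copied from it, so treat the details as assumptions.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : joiner, map;
import std.range : iota, repeat;
import std.utf : byCodeUnit;

void main()
{
    // Each inner range is a range of char, not of dchar.
    auto rows = iota(1, 4).map!(i => '#'.repeat(i));

    // A plain "\n" separator is seen as a range of dchar, which is what
    // the bug report says .joiner rejects here. byCodeUnit keeps the
    // separator as a range of char, matching the inner ranges.
    auto text = rows.joiner("\n".byCodeUnit);

    assert(text.equal("#\n##\n###"));
}
```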

We can, of course, fix .joiner to recognize this case and handle it
correctly, but the fact that using .byCodeUnit works perfectly proves
that autodecoding is not necessary here. Which begs the question, why
have autodecoding at all, and then require .byCodeUnit to work around
issues it causes?


T

-- 
It is widely believed that reinventing the wheel is a waste of time; but I disagree: without wheel reinventers, we would still be stuck with wooden horse-cart wheels.

