The Case Against Autodecode

Timon Gehr via Digitalmars-d <digitalmars-d at puremagic.com>
Mon May 30 13:30:14 PDT 2016


On 30.05.2016 21:28, Andrei Alexandrescu wrote:
> On 05/30/2016 03:04 PM, Timon Gehr wrote:
>> On 30.05.2016 18:01, Andrei Alexandrescu wrote:
>>> On 05/28/2016 03:04 PM, Walter Bright wrote:
>>>> On 5/28/2016 5:04 AM, Andrei Alexandrescu wrote:
>>>>> So it harkens back to the original mistake: strings should NOT be
>>>>> arrays with
>>>>> the respective primitives.
>>>>
>>>> An array of code units provides consistency, predictability,
>>>> flexibility, and performance. It's a solid base upon which the
>>>> programmer can build what he needs as required.
>>>
>>> Nope. Not buying it.
>>
>> I'm buying it. IMO alias string = immutable(char)[] is the most useful
>> choice, and auto-decoding ideally wouldn't exist.
>
> Wouldn't D then be seen (and rightfully so) as largely not supporting
> Unicode, seeing as its many, many core generic algorithms seem to
> randomly work or not on arrays of characters?

In D, enum does not mean enumeration, const does not mean constant, pure 
is not pure, lazy is not lazy, and char does not mean character.

> Wouldn't ranges - the most
> important artifact of D's stdlib - default, for strings, to the least
> meaningful approach (dumb code units)?

I don't see how that's the least meaningful approach. It's the data that 
you actually have sitting in memory. It's the data that you can slice 
and index and get a length for in constant time.
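
For example (a minimal sketch; "résumé" is 6 code points stored in 8 
UTF-8 code units):

    import std.range : front, walkLength;
    import std.stdio : writeln;

    void main()
    {
        string s = "résumé";    // 6 code points, 8 UTF-8 code units
        writeln(s.length);      // 8: code units, O(1)
        writeln(s[0]);          // 'r': indexing yields a code unit, O(1)
        writeln(s[0 .. 1]);     // "r": slicing is O(1), on code unit offsets
        writeln(s.front);       // 'r' as a dchar: range primitives autodecode
        writeln(s.walkLength);  // 6: counting decoded code points is O(n)
    }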

> Would a smattering of
> Unicode primitives in std.utf and friends entitle us to claim D had dyed
> Unicode in its wool? (None of these questions are rhetorical.)
>...

We should support Unicode by having all the required functionality and 
properly documenting the data formats used. What is the goal here? I.e., 
what does a language that has "Unicode dyed in its wool" have that other 
languages do not? Why isn't it enough to provide data types for 
UTF-8/16/32 and Unicode algorithms operating on them?
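
For instance, the primitives already in std.utf and std.uni let the 
caller pick the abstraction level explicitly (a sketch; byCodeUnit and 
byDchar live in std.utf, byGrapheme in std.uni):

    import std.range : walkLength;
    import std.stdio : writeln;
    import std.uni : byGrapheme;
    import std.utf : byCodeUnit, byDchar;

    void main()
    {
        // "noël", with 'ë' written as 'e' plus a combining diaeresis
        string s = "noe\u0308l";
        writeln(s.byCodeUnit.walkLength);  // 6: UTF-8 code units
        writeln(s.byDchar.walkLength);     // 5: code points
        writeln(s.byGrapheme.walkLength);  // 4: user-perceived characters
    }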

> I.e., wouldn't we be in a worse place than now? (This is rhetorical.) The
> best argument for autodecoding is to contemplate where we'd be without
> it: the ghetto of Unicode string handling.
> ...

Those questions seem to be mostly marketing concerns. I'm more concerned 
with whether I find it convenient to use. Autodecoding does not improve 
Unicode support.
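
A minimal sketch of why: decoding to code points by default still gives 
wrong answers whenever code points and user-perceived characters 
disagree:

    import std.algorithm : canFind;
    import std.stdio : writeln;

    void main()
    {
        // 'é' spelled as 'e' plus a combining acute accent:
        // two code points, one user-perceived character
        string s = "cafe\u0301";
        writeln(s.canFind('é'));  // false: autodecoding compares code
                                  // points, and the precomposed U+00E9
                                  // never occurs in s
    }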

> I'm not going to debate this further (though I'll look for meaningful
> answers to the questions above). But this thread has been informative in
> that it did little to change my conviction that autodecoding is a good
> thing for D, all things considered (i.e., given the wrong decision not to
> encapsulate string as a separate type distinct from a bare array of code
> units). I'd be lying if I said it did nothing. It did, but only a little.
>
> Funny thing is that's not even what's important. What's important is
> that autodecoding is here to stay - there's no realistic way to
> eliminate it from D. So the focus should be making autodecoding the best
> it could ever be.
>
>
> Andrei
>

Sure, I didn't mean to engage in a debate (it seems there is no decision 
to be made here that might affect me in the future).

