Is autodecoding being phased out?

Tue Feb 21 07:24:18 PST 2017

On Tuesday, February 21, 2017 14:46:10 Dukc via Digitalmars-d-learn wrote:
> I (finally) managed to build the development build of dmd, with
> libraries. When testing if it compiles a Hello World program (it
> does, no problem) I got these messages:
>
> C:\D\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(2716,24):
> Deprecation: function std.utf.toUTF8 is deprecated - To be
> removed November 2017. Please use std.utf.encode instead.
> C:\D\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(2716,24):
> Deprecation: function std.utf.toUTF8 is deprecated - To be
> removed November 2017. Please use std.utf.encode instead.
> C:\D\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(2727,40):
> Deprecation: function std.utf.toUTF8 is deprecated - To be
> removed November 2017. Please use std.utf.encode instead.
>
> If I output a dstring instead, those messages vanish. Does that
> mean we're getting rid of autodecoding?
>
> If that's the case, I have nothing against that. In fact it is nice
> to have that deprecation to catch bugs. I just thought, due to an
> earlier forum discussion, that it's not going to happen because
> it could break too much code. That's why I'm asking...

Well, hello world shouldn't be printing deprecation messages, so something
needs to be fixed. But the only version of std.utf.toUTF8 that's being
deprecated is the version that takes a static array, because it does the
same thing as std.utf.encode. So, aside from the fact that something in
Phobos apparently needs to be updated to not use that overload of toUTF8, it
probably doesn't affect you.
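
For illustration, here is a minimal sketch of what the deprecation message is
pointing at: encoding a single code point into a fixed-size buffer with
std.utf.encode, next to the whole-string toUTF8 conversion, which is a
different overload. The helper name encodeOne is made up for this example and
is not part of Phobos.

```d
import std.utf : encode, toUTF8;

// Hypothetical helper (not in Phobos): UTF-8 encode one code point.
string encodeOne(dchar c)
{
    char[4] buf;                     // UTF-8 needs at most 4 code units
    immutable len = encode(buf, c);  // number of code units written
    return buf[0 .. len].idup;       // copy out of the stack buffer
}

void main()
{
    assert(encodeOne('a') == "a");
    assert(encodeOne('\u00E9').length == 2);  // U+00E9 takes two code units

    // Converting an entire string is a separate overload of toUTF8.
    dstring d = "h\u00E9llo"d;
    string s = toUTF8(d);
    assert(s.length == 6);           // 5 code points, one of them 2 bytes
}
```

So code that calls toUTF8 on a string, wstring, or dstring should not need to
change; only calls that pass a char[4] buffer would move to encode.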

Certainly, as it stands, autodecoding is not going to be phased out, if
nothing else because we don't have a clean way to do it. Phobos is slowly
being improved so that it works with general character ranges, and stuff
like byCodeUnit has been added, so less in Phobos relies on autodecoding,
and we have better ways to avoid it. But actually removing it, such that
str.front and str.popFront don't autodecode, would break code, and no one
has come up with a way to make the necessary changes without breaking code.
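
To make the difference concrete, here is a small sketch of what autodecoding
means for a string and how byCodeUnit opts out of it. The helper names
countDecoded and countCodeUnits are made up for this example; everything else
is Phobos as I understand it.

```d
import std.range : ElementType, walkLength;
import std.traits : Unqual;
import std.utf : byCodeUnit;

// Hypothetical helpers, named only for this example.
size_t countDecoded(string s)   { return s.walkLength; }            // range API, autodecodes
size_t countCodeUnits(string s) { return s.byCodeUnit.walkLength; } // no decoding

void main()
{
    string s = "h\u00E9llo";

    // As an array, a string is a sequence of UTF-8 code units.
    assert(s.length == 6);

    // As a range, its element type is dchar, because front decodes.
    static assert(is(ElementType!string == dchar));
    assert(countDecoded(s) == 5);

    // byCodeUnit wraps the string in a range of char instead.
    static assert(is(Unqual!(ElementType!(typeof(s.byCodeUnit))) == char));
    assert(countCodeUnits(s) == 6);
}
```

Changing front on char and wchar arrays to behave like the byCodeUnit version
would silently change the results of code like countDecoded, which is the
kind of breakage that makes removal so hard.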

Andrei wants to add RCString (or whatever it's going to be called), which
would be a reference-counted string with small string optimizations, and to
push for it to be used as the typical string type in code. It wouldn't do
autodecoding. So, maybe at some point, a lot of the strings being used in D
code won't autodecode, but as it stands, it's looking like we're permanently
screwed with regards to arrays of char and wchar.

Maybe once enough of Phobos has been fixed to work with arbitrary ranges of
characters, we can find a way to force the transition, but I doubt it.

- Jonathan M Davis