Creeping Bloat in Phobos
Uranuz via Digitalmars-d
digitalmars-d at puremagic.com
Sun Sep 28 05:06:16 PDT 2014
On Sunday, 28 September 2014 at 00:13:59 UTC, Andrei Alexandrescu
wrote:
> On 9/27/14, 3:40 PM, H. S. Teoh via Digitalmars-d wrote:
>> If we can get Andrei on board, I'm all for killing off
>> autodecoding.
>
> That's rather vague; it's unclear what would replace it. --
> Andrei
I believe that removing autodecoding will make things even
worse. As far as I understand, if we remove it from the front()
function that operates on narrow strings, then front() will
return just a single byte of a character. I believe that
processing a narrow string by `user perceived chars` (graphemes)
is the more common use case. Operating on the single bytes of a
multibyte character is an uncommon task, and you can do that via
direct indexing of the char[] array. I believe that the number
of bytes in a *user perceived char* is an internal detail of the
UTF-8 encoding and should not have to be considered in common
tasks such as parsing, searching, replacing text, etc. If you
need the byte representation of a string, you should cast it to
ubyte[] and work with it using the same range functions, without
autodecoding.
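To make the distinction above concrete, here is a minimal sketch of
the three views of the same narrow string. It assumes a Phobos
version that ships std.utf.byCodeUnit; the helper names
(countCodeUnits, countCodePoints, countGraphemes) are hypothetical
and exist only for this illustration. Note that the autodecoding
front() yields a code point (dchar), not a grapheme; graphemes have
to be requested explicitly via std.uni.byGrapheme.

```d
import std.range : front, walkLength;
import std.string : representation;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

// Hypothetical helpers, named for this illustration only.
size_t countCodeUnits(string s)  { return s.byCodeUnit.walkLength; }
size_t countCodePoints(string s) { return s.walkLength; } // autodecoded
size_t countGraphemes(string s)  { return s.byGrapheme.walkLength; }

void main()
{
    // 'e' followed by U+0301 (combining acute accent): one
    // user-perceived character, two code points, three UTF-8 bytes.
    string s = "e\u0301";

    // Autodecoding: front() on a narrow string decodes a code point.
    static assert(is(typeof(s.front) == dchar));
    assert(s.front == 'e');

    // Direct indexing and .length operate on UTF-8 code units.
    static assert(is(typeof(s[0]) == immutable(char)));
    assert(s.length == 3);

    assert(countCodeUnits(s) == 3);
    assert(countCodePoints(s) == 2);
    assert(countGraphemes(s) == 1);

    // Raw byte view without a manual cast.
    immutable(ubyte)[] bytes = s.representation;
    assert(bytes.length == 3);
}
```

std.string.representation gives the same ubyte view as the cast
mentioned above, and range functions applied to it do no decoding.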
The main problem I see is that a programmer inexperienced in D
can be confused about whether he is operating on bytes or on
graphemes. This could happen especially when he migrates from
C# or Python, where a string is not considered an array of its
bytes. A *char* in D is not a character; it is a part of a
character, not the entire character. That is the main
inconsistency.
A possible solution is to include a class or struct
implementation of string that hides the internal implementation
of narrow strings from those users who don't need to operate on
single bytes of UTF-8 characters. I believe it's the best way to
kill all the rabbits)) We could provide this class String with a
method returning ubyte[] (the better way) or char[] that exposes
the internal implementation for those who need it.
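A rough sketch of what such a wrapper might look like, purely as
an assumption about the idea above: the String type and its
members (graphemes, length, bytes) are hypothetical and are not
part of Phobos. It stores UTF-8 privately, iterates by
user-perceived characters by default, and exposes the raw bytes
only through an explicit method.

```d
import std.range : walkLength;
import std.string : representation;
import std.uni : byGrapheme;

// Hypothetical type for illustration; not an existing Phobos API.
struct String
{
    private string data; // UTF-8 storage, hidden from ordinary users

    this(string s) { data = s; }

    // Default view: a range of user-perceived characters.
    auto graphemes() { return data.byGrapheme; }

    // Number of user-perceived characters; O(n), since it must scan.
    @property size_t length() { return data.byGrapheme.walkLength; }

    // Explicit escape hatch for those who need the encoding.
    immutable(ubyte)[] bytes() { return data.representation; }
}

void main()
{
    // "noel" with a combining diaeresis (U+0308) after the 'e'.
    auto s = String("noe\u0308l");

    assert(s.length == 4);           // n, o, e + U+0308, l
    assert(s.bytes.length == 6);     // U+0308 takes two UTF-8 bytes
    assert(s.graphemes.walkLength == 4);
}
```

One trade-off of this design is that length and indexing by
user-perceived character become linear-time operations, which is
why the sketch does not offer opIndex.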
A question: can you list some languages that represent UTF-8
narrow strings as arrays of single bytes?