[Dlang-study] [rcstring] Defining rcstring

Михаил Страшун public at dicebot.lv
Sat Feb 6 11:20:43 PST 2016


On 02/03/2016 12:52 PM, Jonathan M Davis wrote:
> On Wednesday, February 03, 2016 06:46:53 Михаил Страшун wrote:
>> Element type of `byCodeUnit` should be `ubyte` in my opinion so that it
>> becomes clear each separate element is not a valid char on its own.
> By definition, char is a UTF-8 code unit, wchar is a UTF-16 code unit, and
> dchar is a UTF-32 code unit, and so code is supposed to be able to assume
> that. So, I don't see why it would make sense to use ubyte for a code unit.
> We already have types which are explicitly for code units.
>
> Now, by that same token, having the I/O stuff use ubyte rather than char (as
> you suggested elsewhere in your post) does make a lot of sense precisely
> because there's no guarantee that what's read in is actually in UTF-8, and
> any code where it's not sure really should be using ubyte, ushort, or uint
> instead of char, wchar, or dchar. Having the I/O functions assume UTF-8 was
> definitely a mistake IMHO, much as it usually works. But the strings
> themselves are supposed to be UTF-8, UTF-16, or UTF-32. So, IMHO, RCString
> should be operating on chars, wchars, or dchars and not ubytes, ushorts, or
> uints.
>
> - Jonathan M Davis

You are absolutely correct from the point of view of the original language
definition.

However, my main concern about a `char` element type for such a range is how
it will interact with the old string/char[] types (which won't be changed).
It will cause the same non-obvious corner case in generic code:

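import std.array : array;
import std.range.primitives : ElementType, isInputRange;

void foo (R) (R range)
    if (isInputRange!R)
{
    // eagerly copy the lazy range into an array
    auto eager_sequence = range.array;
    // will fail if range is code unit range:
    static assert (is(ElementType!R ==
        ElementType!(typeof(eager_sequence))));
}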
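
To make the corner case concrete, here is a minimal sketch using the existing
`std.utf.byCodeUnit` as a stand-in for the proposed code unit range (that
substitution is my assumption; an eventual RCString range may behave
differently). It only shows that the element type changes once the lazy range
is collected into an array:

```d
import std.array : array;
import std.range.primitives : ElementType;
import std.traits : Unqual;
import std.utf : byCodeUnit;

void main()
{
    string s = "héllo";

    // Lazy range over UTF-8 code units (stand-in for the proposed range).
    auto units = s.byCodeUnit;
    // Eager copy: a plain array of chars, i.e. an old-style string type.
    auto eager = units.array;

    // The lazy range yields code units (char, ignoring qualifiers)...
    static assert(is(Unqual!(ElementType!(typeof(units))) == char));
    // ...but the eager copy is auto-decoded, so its element type is dchar.
    static assert(is(ElementType!(typeof(eager)) == dchar));
    // Hence the two element types disagree, which is what breaks foo above.
    static assert(!is(ElementType!(typeof(units)) ==
        ElementType!(typeof(eager))));
}
```

With a `ubyte` element type the eager copy would be a `ubyte[]`, which is not
auto-decoded, so the element types would agree.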

I think not having to add special cases for new strings in Phobos should
be a major design goal.

