string is rarely useful as a function argument

Brad Anderson eco at gnuk.net
Fri Dec 30 23:59:49 PST 2011


On Sat, Dec 31, 2011 at 12:09 AM, Andrei Alexandrescu <
SeeWebsiteForEmail at erdani.org> wrote:

> On 12/30/11 10:09 PM, Walter Bright wrote:
>
>> On 12/30/2011 7:30 PM, Jonathan M Davis wrote:
>>
>>> Yes, diligent programmers will generally find such problems, but with the
>>> current scheme, it's _so_ easy to use length when you shouldn't, that it's
>>> pretty much a guarantee that it's going to happen.
>>>
>>
>> I'm not so sure about that. Timon Gehr's X macro tried to handle UTF-8
>> correctly, but it turned out that the naive version that used [i] and
>> .length worked correctly. This is typical, not exceptional.
>>
>
> The lower frequency of bugs makes them that much more difficult to spot.
> This is essentially similar to the UTF-16/UCS-2 morass: the vast majority
> of the time the programmer may consider UTF-16 an encoding with one code
> unit per code point (which is what UCS-2 is). The existence of surrogates
> didn't make much of a difference because, again, very often the wrong
> assumption just worked. Well, that didn't go over all that well.
>
> We need .raw and we must abolish .length and [] for narrow strings.
>
>
> Andrei
>


I don't know that Phobos would be an appropriate place for it, but offering
some easy-to-access string data containing extensive and advanced Unicode,
which users could easily add to their programs' unit tests, may help people
ensure proper Unicode usage. Unicode seems to be one of those things where
you either know it really well or you know just enough to get yourself in
trouble, so having test data written by Unicode experts could be very useful
for the rest of us mortals.

I googled around a bit. This Stack Overflow question came up:
http://stackoverflow.com/questions/6136800/unicode-test-strings-for-unit-tests
It recommends these:
 - UTF-8 stress test:
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
 - Quick Brown Fox in a variety of languages:
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/quickbrown.txt

I didn't see too much beyond those two.
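
As a rough illustration of the kind of test data described above, here is a
minimal sketch. It is written in Python rather than D only to keep it short,
and the names SAMPLES, unit_counts and splits_code_point are hypothetical,
not anything from Phobos or from the linked files. It counts UTF-8 code
units, UTF-16 code units and code points for a few awkward strings, and
checks whether a naive cut at a code-unit index lands inside a code point.

```python
# Hypothetical sample data: strings chosen to break the assumption
# that one code unit equals one character.
SAMPLES = {
    "ascii": "hello",
    "latin": "h\u00e9llo",        # U+00E9 takes 2 UTF-8 code units
    "cjk": "\u65e5\u672c\u8a9e",  # each code point takes 3 UTF-8 code units
    "astral": "\U0001d11e",       # U+1D11E: 4 UTF-8 units, a UTF-16 surrogate pair
    "combining": "e\u0301",       # 'e' + combining acute: 2 code points
}


def unit_counts(s):
    """Return (utf8_units, utf16_units, code_points) for s."""
    utf8_units = len(s.encode("utf-8"))
    utf16_units = len(s.encode("utf-16-le")) // 2  # 2 bytes per UTF-16 unit
    return utf8_units, utf16_units, len(s)


def splits_code_point(s, n):
    """True if cutting the UTF-8 form of s after n code units
    leaves a truncated, undecodable sequence."""
    prefix = s.encode("utf-8")[:n]
    try:
        prefix.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


if __name__ == "__main__":
    for name, text in SAMPLES.items():
        print(name, unit_counts(text))
```

For the ASCII sample all three counts agree, which is why code that indexes
by code unit so often appears to work; the other samples are where the counts
diverge and a code-unit slice can cut a character in half.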

Regards,
Brad A.