Ascii matters

bearophile bearophileHUGS at lycos.com
Wed Aug 22 15:11:18 PDT 2012


I need to manage Unicode text, but in many cases I have a lot of 
7-bit or 8-bit ASCII text to process. This has led to the 
discussion below; for some time now, thanks to Jonathan Davis, we 
have had an efficient translate() again:

http://d.puremagic.com/issues/show_bug.cgi?id=7515


The s2 array generated by this code is a dchar[] (if array() 
becomes pure, you will probably be able to declare s2 as a dstring):

import std.algorithm, std.array;

string s = "test string"; // UTF-8, but also 7-bit ASCII
dchar[] s2 = map!(x => x)(s).array(); // Uses the identity function
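
The result type can be checked at compile time. This is only a small sketch, assuming a Phobos where narrow strings are decoded to dchar when used as ranges (the behaviour described above); the main() wrapper is just there to make it compile on its own:

```d
import std.algorithm : map;
import std.array : array;
import std.range : ElementType;

void main()
{
    string s = "test string";

    // As a range, a string yields decoded code points, not code units.
    static assert(is(ElementType!string == dchar));

    // So mapping over it and collecting the result gives a dchar[].
    auto s2 = map!(x => x)(s).array();
    static assert(is(typeof(s2) == dchar[]));
    assert(s2.length == s.length); // equal lengths only because s is pure ASCII
}
```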

To produce a char[] (or string, using assumeUnique), you are free 
to use a cast:

auto s3 = map!(x => cast(char)x)(s).array();
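
A short sketch of what that cast does, on ASCII input and on a code point that does not fit in a char. The variable names are only for illustration:

```d
import std.algorithm : map;
import std.array : array;

void main()
{
    string ascii = "test string";

    // On 7-bit ASCII input the cast is lossless and the result is a char[].
    auto s3 = map!(x => cast(char) x)(ascii).array();
    static assert(is(typeof(s3) == char[]));
    assert(s3 == ascii);

    // U+20AC (euro sign) does not fit in a char: the cast silently
    // keeps only the low 8 bits, with no compile-time or run-time error.
    dchar euro = '\u20AC';
    assert(cast(char) euro == 0xAC);
}
```

Nothing warns about the second case, so the output can contain bytes that are not valid UTF-8.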

But D casts are unsafe, and one thing I'm learning from Haskell 
is how important it is to give types to your code to prevent bugs. 
So maybe an AsciiString wrapper range (a subtype of string) can 
be invented for Phobos. Its constructor verifies that the input is 
7-bit ASCII, and its "front" method yields chars, so map.array() 
gives a char[]:

astring a1 = "test string"; // enforced 7-bit ASCII
char[] s4 = map!(x => x)(a1).array();

This makes some algorithms working on ASCII text cleaner and 
safer, avoiding the need for casts.
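
To make the idea concrete, here is a minimal sketch of such a wrapper. Everything in it is an assumption for illustration: the AsciiString name, the run-time check with enforce, and the plain struct (it is not a real subtype of string and it is not part of Phobos). It only shows that a range whose front returns char makes map.array() produce a char[] without a cast:

```d
import std.algorithm : map;
import std.array : array;
import std.exception : enforce;

// Hypothetical wrapper, not an existing Phobos type.
struct AsciiString
{
    private string data;

    this(string s)
    {
        // Iterate by code unit and reject anything outside 7-bit ASCII.
        foreach (char c; s)
            enforce(c < 0x80, "AsciiString: input is not 7-bit ASCII");
        data = s;
    }

    // Input range primitives that yield char instead of dchar.
    @property bool empty() const { return data.length == 0; }
    @property char front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

void main()
{
    auto a1 = AsciiString("test string"); // checked once, at construction
    char[] s4 = map!(x => x)(a1).array(); // no cast needed
    assert(s4 == "test string");
}
```

The check is paid once in the constructor; after that, algorithms can rely on the type instead of repeating casts.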

Would something like this be feasible, and welcome, in 
Phobos?

Bye,
bearophile

