string types: const(char)[] and cstring

Chris Miller chris at dprogramming.com
Sat May 26 02:14:35 PDT 2007


On Sat, 26 May 2007 04:35:34 -0400, Anders F Björklund <afb at algonet.se>  
wrote:

> Walter Bright wrote:
>
>> Why cstring? Because 'string' appears as both a module name and a  
>> common variable name. cstring also implies wstring for wchar strings,  
>> and dstring for dchars.
>
> I think cstring is a horrible name. "string" is much better, and in use.
> (else wouldn't those be wcstring and dcstring or cwstring and cdstring?)
>
> That it is made up of constant characters, and that those aren't really
> characters but instead UTF-8 code units is something that can be hidden.
>
> alias const(char)[] string;
>
> But "cstring" both sounds awkward, and also leads the mind to C strings.
> Even if those (char*) would probably be "stringz" in the usual D lingo.
>
> If any name conflict with previously existing "string" must be avoided,
> then "str" is probably a better name... (character->char, integer->int)
>
> As was discussed earlier.
>
> --anders

I agree, except I don't care much for "str". I'd prefer it named string.  
If it's an alias in object.d and not a keyword, it shouldn't be too bad.

Actually, while strings are being changed anyway, why not bring in something  
similar to my dstring module, where slicing and indexing never result in  
an invalid UTF sequence? See http://www.dprogramming.com/dstring.php. The  
code may not be ideal, but it's the concept I'm referring to.
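
To make the concept concrete, here is a minimal sketch of a slice that never cuts a UTF-8 sequence in half. It is not the dstring module's code: the helper names isContinuation, alignDown and safeSlice are made up for illustration, and the sketch assumes the input is already valid UTF-8 and that both indices are within bounds.

```d
// Hypothetical helpers for illustration; not taken from the dstring module.

/// True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx).
bool isContinuation(char b)
{
    return (b & 0xC0) == 0x80;
}

/// Move i back to the first byte of the code point that contains it.
/// An index equal to s.length is already a boundary and is returned as-is.
size_t alignDown(const(char)[] s, size_t i)
{
    while (i > 0 && i < s.length && isContinuation(s[i]))
        --i;
    return i;
}

/// Slice s[lo .. hi] after snapping both ends to code point boundaries.
const(char)[] safeSlice(const(char)[] s, size_t lo, size_t hi)
{
    lo = alignDown(s, lo);
    hi = alignDown(s, hi);
    return s[lo .. hi];
}

void main()
{
    const(char)[] s = "Björklund";      // 'ö' occupies bytes 2 and 3
    assert(s[0 .. 3] != "Bj");          // the raw slice ends inside 'ö'
    assert(safeSlice(s, 0, 3) == "Bj"); // the snapped slice does not
    assert(safeSlice(s, 3, s.length) == "örklund");
}
```

Rounding both ends down is only one possible policy; a real string type might instead index by code point, or round the upper bound up.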

While on strings, I'll mention another problem I have with D's string  
handling: the "invalid utf8 sequence" exception (or, if you prefer,  
"4invalid utf8 sequence"). Other Unicode implementations I've used do not  
throw such an exception, but interpret the bad parts as replacement  
characters (U+FFFD). I believe I've also heard that the Unicode standard  
recommends being forgiving in this respect.
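
As a rough illustration of the two behaviours, the sketch below contrasts strict validation with lenient decoding. It assumes a present-day Phobos in which std.utf provides validate, byDchar and replacementDchar; that is an assumption about the library version, not a description of what was available when this was written.

```d
import std.algorithm.searching : canFind;
import std.exception : assertThrown;
import std.utf : byDchar, replacementDchar, UTFException, validate;

void main()
{
    // 'a', a stray 0xFF byte (never valid in UTF-8), then 'b'.
    immutable(ubyte)[] raw = [0x61, 0xFF, 0x62];
    auto s = cast(const(char)[]) raw;

    // Strict: validation rejects the input with an exception.
    assertThrown!UTFException(validate(s));

    // Lenient: decoding continues and marks the bad byte with U+FFFD.
    dchar[] decoded;
    foreach (dchar c; s.byDchar)
        decoded ~= c;

    assert(decoded[0] == 'a');
    assert(decoded.canFind(replacementDchar));
    assert(decoded[$ - 1] == 'b');
}
```

The lenient path keeps the valid text on either side of the damage, which is the forgiving behaviour argued for above.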

- Chris


