string types: const(char)[] and cstring

Marcin Kuszczak aarti at interia.pl
Sat May 26 05:26:50 PDT 2007


Chris Miller wrote:

> Actually, while we're at a change for strings, why not bring in something
> similar to my dstring module, where slicing and indexing never result in
> an invalid UTF sequence? http://www.dprogramming.com/dstring.php - the
> code may not be ideal, but it's the concept I'm referring to.

Yup. That's my opinion also...

For me the advantages of such a string are quite obvious:
1. Easy slicing and indexing of UTF-8 sequences (without corrupting the
sequence, as mentioned above)
2. Common denominator for char[], wchar[] and dchar[]
3. For classes which don't need speed it simplifies the API (only one version
of each function instead of 3)
4. With some additional support from the language (cast operators to different
types and opImplicitCast) it can be fully interchangeable with every method
taking char[], wchar[] or dchar[].
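
To make point 1 concrete, here is a minimal sketch of the problem and of one
possible fix. It assumes a current D compiler with Phobos (std.utf); the
helper names isValidUtf8 and sliceByCodePoint are made up for illustration
and are not taken from Chris's dstring module.

```d
import std.utf : validate, toUTFindex, UTFException;

// Returns true if s is a well-formed UTF-8 sequence.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

// Hypothetical helper: slice by code point index instead of by code unit,
// so the result can never start or end in the middle of a character.
string sliceByCodePoint(string s, size_t begin, size_t end)
{
    return s[toUTFindex(s, begin) .. toUTFindex(s, end)];
}

void main()
{
    string s = "żółw";                 // 4 code points, 7 UTF-8 code units
    assert(s.length == 7);

    // A plain slice counts code units and cuts 'ż' in half.
    assert(!isValidUtf8(s[0 .. 1]));

    // The code point based slice stays valid.
    assert(sliceByCodePoint(s, 0, 2) == "żó");
    assert(isValidUtf8(sliceByCodePoint(s, 1, 3)));
}
```

Note that toUTFindex has to walk the string from the start, so this kind of
slicing is O(n) rather than O(1). That is the speed trade-off behind point 3.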

Having another 3 names for string is not very appealing to me. We would
have 9 official versions of string available in D:
char[], wchar[], dchar[], string, cwstring, cdstring, Tango String!(char),
Tango String!(wchar), Tango String!(dchar)

To write a nice, fully functional library you have to write 3 versions of
every function that takes a string, one per string type (I know, templates
make it a little bit easier). I don't think I am wrong in saying that in
reality people just write one version for char[], because it is
convenient (see: SWT ported from Java). As a result, wchar and dchar are
treated as second-class citizens in D. Additionally, when people design
their programs for char[], they mostly don't think about the issues with
slicing a char[] UTF-8 sequence (warning! assumption!), so the default way of
writing programs is *NOT SAFE*. When you write code and don't care about
bare-metal speed, it is just tedious to do this additional work...
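
For reference, this is roughly what the template route looks like: one
function covering char[], wchar[] and dchar[] arguments. It is only a
sketch, assuming a current D compiler with Phobos (std.traits);
countCodePoints is a made-up name used for illustration.

```d
import std.traits : isSomeString;

// Hypothetical example: one template instead of three overloads.
// foreach with a dchar loop variable decodes UTF-8, UTF-16 and UTF-32
// alike, so the body never has to care about the encoding.
size_t countCodePoints(S)(S s)
    if (isSomeString!S)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    assert(countCodePoints("żółw")  == 4);   // UTF-8,  7 code units
    assert(countCodePoints("żółw"w) == 4);   // UTF-16, 4 code units
    assert(countCodePoints("żółw"d) == 4);   // UTF-32, 4 code units
}
```

This works, but every function in the library has to be written this way,
and callers still end up juggling three array types.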

Having one string type which hides the differences between char[], wchar[]
and dchar[] would solve the problem nicely. Adding constness would also be
easy. And you use only one reserved keyword - string - for everything.

I would be happy to hear some other opinions from people on the NG. Maybe I
am wrong with the above arguments, so perhaps someone can give
counterarguments... I think it is a very important issue, as it seems that
most developers around the world are not native English speakers...

PS. See also the thread on the DWT NG.

-- 
Regards
Marcin Kuszczak (Aarti_pl)
-------------------------------------
Ask me why I believe in Jesus - http://zapytaj.dlajezusa.pl (en/pl)
Doost (port of few Boost libraries) - http://www.dsource.org/projects/doost/
-------------------------------------



