Why I chose D over Ada and Eiffel

monarch_dodra monarchdodra at gmail.com
Tue Aug 20 12:26:15 PDT 2013


On Tuesday, 20 August 2013 at 16:40:21 UTC, Ramon wrote:
> Yes and no.
> While UTF-8 almost always is the most memory efficient 
> representation of anything beyond ASCII it does have a property 
> that can be troublesome at times, the difference between length 
> and size of a string, i.e. the number of "characters" vs. the 
> number of bytes used.

If you are truly using UTF-16 (which is what D's wstring holds), 
then no. UTF-16 is *also* a variable-width encoding. If you need 
random access, you should use UTF-32 (dstring). *THAT* uses a lot 
of memory, and should only be used as an "operating" format, 
before storing back to UTF-8/16.
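
To make the counts concrete, here is a minimal C++ sketch (the 
encoding arithmetic is the same in any language, D included). 
The sample strings and the helper count_code_points are 
illustrative names, not part of any library, and the helper 
assumes well-formed UTF-16. The text is 'a' followed by U+1F600, 
a code point outside the Basic Multilingual Plane.

```cpp
#include <cstddef>
#include <string>

// 'a' followed by U+1F600, spelled out in each encoding.
// The UTF-8 bytes are written explicitly so the sample does not
// depend on the compiler's source or execution character set.
const std::string    utf8_sample  = "a\xF0\x9F\x98\x80"; // 1 + 4 code units
const std::u16string utf16_sample = u"a\U0001F600";      // 1 + 2 code units
const std::u32string utf32_sample = U"a\U0001F600";      // 1 + 1 code units

// Hypothetical helper: counts code points in UTF-16 by counting every
// code unit that is not a low (trailing) surrogate, 0xDC00..0xDFFF.
// Assumes the input is well-formed UTF-16.
std::size_t count_code_points(const std::u16string& s) {
    std::size_t n = 0;
    for (char16_t unit : s) {
        if (unit < 0xDC00 || unit > 0xDFFF) {
            ++n;
        }
    }
    return n;
}
```

Only in the UTF-32 string does the index of a code unit equal 
the index of a code point. In the UTF-16 string, size() reports 
3 while count_code_points reports 2, so indexing by code unit is 
not random access by code point.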

"non-variable" UTF-16 is called UCS-2 (I think). In any case, 
it's not what D uses.

UCS-2 being a subset of UTF-16, you can always use wstrings, and 
"assume" it is UCS-2, but:
* Most algorithms are UTF-16 aware, so *will* decode and walk 
your UCS-2 stream the slow way.
* Nothing will prevent you from accidentally inserting codepoints 
from outside the range UCS-2 can represent (the Basic 
Multilingual Plane).
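
The second bullet can be sketched in C++ as a check you would 
have to run yourself, because the string type will not enforce 
it. is_ucs2 is a hypothetical helper, not a library function; it 
treats a UTF-16 string as UCS-2 compatible when it contains no 
surrogate code units at all.

```cpp
#include <string>

// Hypothetical helper: true when every code unit stands for a whole
// code point, i.e. the string contains no surrogates (0xD800..0xDFFF)
// and can safely be treated as fixed-width UCS-2.
bool is_ucs2(const std::u16string& s) {
    for (char16_t unit : s) {
        if (unit >= 0xD800 && unit <= 0xDFFF) {
            return false;
        }
    }
    return true;
}

// Appends U+1F600, which lies outside the Basic Multilingual Plane.
// The append succeeds silently and adds two code units (a surrogate pair).
std::u16string append_non_bmp(std::u16string s) {
    s += u"\U0001F600";
    return s;
}
```

After append_non_bmp, the "one code unit per character" 
assumption no longer holds, and nothing in the type told you so.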

I don't recommend doing that.
Instead, you can find in std.encoding the UCSChar and UCSString 
data types. I haven't used these much, but it's what you should 
use if you are planning to store your strings in a random access 
wide representation.

But we digress from the original point. I'm glad you are enjoying 
your time with D :)

One of the things I love about D is how the *language* makes 
stupid constructs outright illegal (for example "for( ... );" 
notice that semicolon? yeah...)
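
For anyone who has not been bitten by it, a minimal C++ sketch 
of that construct follows. The function names are made up for 
illustration. C++ accepts the first function (some compilers 
warn about the empty body), whereas D rejects a bare ";" as a 
loop body and asks for "{ }" when an empty body is really 
intended.

```cpp
// Returns how many times the block after the loop header runs.
int runs_with_stray_semicolon(int n) {
    int runs = 0;
    int i;
    for (i = 0; i < n; ++i);  // the stray ';' is the entire loop body
    {
        ++runs;               // plain block: runs once, after the loop ends
    }
    return runs;
}

// The loop as it was meant to be written.
int runs_as_intended(int n) {
    int runs = 0;
    for (int i = 0; i < n; ++i) {
        ++runs;
    }
    return runs;
}
```

Calling both with n = 5 gives 1 for the first and 5 for the 
second, and the difference in the source is a single character.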

I work full-time using C++, and about once a week, I track down a 
bug, and when I find it, it often turns out to be something stupid 
that D would not have allowed.

