Why I chose D over Ada and Eiffel
monarch_dodra
monarchdodra at gmail.com
Tue Aug 20 12:26:15 PDT 2013
On Tuesday, 20 August 2013 at 16:40:21 UTC, Ramon wrote:
> Yes and no.
> While UTF-8 almost always is the most memory efficient
> representation of anything beyond ASCII it does have a property
> that can be troublesome a times, the difference between length
> and size of a string, i.e. the number of "characters" vs. the
> number of bytes used.
If you truly are using UTF-16 (which is what D's wstring uses),
then no: UTF-16 is *also* a variable-width encoding. If you need
random access, you should use UTF-32 (dstring). *THAT* uses a lot
of memory, and should only be used as an "operating" format,
before storing back to UTF-8/16.
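A quick sketch of the length-vs-size difference across the three string types. The helper name codePointCount is made up for illustration, and the counts assume Phobos decodes narrow strings when walking them as ranges:

```d
import std.range : walkLength;

// Hypothetical helper: counts code points by decoding the string.
size_t codePointCount(S)(S s)
{
    return s.walkLength;
}

void main()
{
    // U+1D11E (musical G clef) lies outside the Basic Multilingual Plane.
    string  s8  = "a\U0001D11E";
    wstring s16 = "a\U0001D11E"w;
    dstring s32 = "a\U0001D11E"d;

    // .length counts code units, not code points.
    assert(s8.length  == 5); // 1 byte + 4 bytes in UTF-8
    assert(s16.length == 3); // 1 unit + a surrogate pair in UTF-16
    assert(s32.length == 2); // one unit per code point in UTF-32

    // Decoding gives the same answer for all three encodings.
    assert(codePointCount(s8)  == 2);
    assert(codePointCount(s16) == 2);
    assert(codePointCount(s32) == 2);

    // Only the dstring can be indexed directly by code point.
    assert(s32[1] == '\U0001D11E');
}
```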
"non-variable" UTF-16 is called UCS-2 (I think). In any case,
it's not what D uses.
UCS-2 being a subset of UTF-16, you can always use wstrings, and
"assume" it is UCS-2, but:
* Most algorithms are UTF-16 aware, so *will* decode and walk
your UCS-2 stream the slow way.
* Nothing will prevent you from accidentally inserting code points
from outside the plane UCS-2 can represent.
I don't recommend doing that.
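To illustrate the second point, here is a rough sketch. The fitsUcs2 helper is hypothetical (it is not part of Phobos); it just scans for surrogate code units, which is the only way to notice that a wstring has left UCS-2 territory:

```d
// Hypothetical helper, not a Phobos function: true if the wstring
// contains no surrogate code units, i.e. it is also valid UCS-2.
bool fitsUcs2(const(wchar)[] s)
{
    // foreach over wchar walks raw code units without decoding.
    foreach (wchar c; s)
        if (c >= 0xD800 && c <= 0xDFFF)
            return false;
    return true;
}

void main()
{
    wstring w = "abc"w;
    assert(fitsUcs2(w));

    // Appending a dchar is accepted without complaint; the
    // code point is encoded as a surrogate pair.
    w ~= '\U0001D11E';
    assert(w.length == 5);  // 3 + 2 code units
    assert(!fitsUcs2(w));   // the "assumed UCS-2" string no longer is
}
```

The check is O(n), so it is something you would run at the boundaries (input, deserialization) rather than after every append.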
Instead, you can find in std.encoding the UCSChar and UCSString
data types. I haven't used these much, but it's what you should
use if you are planning to store your strings in a random access
wide representation.
But we digress from the original point. I'm glad you are enjoying
your time with D :)
One of the things I love about D is how the *language* makes
stupid constructs outright illegal (for example "for( ... );"
notice that semi-colon? yeah...)
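For anyone who has not been bitten by it, a small sketch of the case I mean (sumTo is just an illustrative name). The commented-out lines are the classic C++ slip; the D compiler refuses a bare ';' as a loop body and asks for '{ }' if you really want an empty one:

```d
import std.stdio : writeln;

int sumTo(int n)
{
    int total = 0;

    // The stray semicolon below would make the loop body empty.
    // C++ compiles it silently; D rejects it at compile time.
    //
    // for (int i = 1; i <= n; i++);
    //     total += i;

    for (int i = 1; i <= n; i++)
        total += i;

    return total;
}

void main()
{
    writeln(sumTo(4)); // prints 10
}
```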
I work full-time using C++, and about once a week, I track down a
bug, and when I find it, it often turns out to be something stupid
that D would not have allowed.