The length of strings vs. # of chars vs. sizeof

Charles Hixson charleshixsn at earthlink.net
Sun Nov 1 11:36:31 PST 2009


I've read and re-read the documentation, but I can't decide whether a 
UTF-8 character that takes multiple bytes to express counts as one or 
multiple values in length and sizeof.  Sizeof seems to presume that all 
entries are the same length, but otherwise it seems to be the property I 
need.  (I suppose that I could just test with a string that I know 
contains multi-byte chars, but it sure would be better if I could find 
out from the documentation.)  I'm pretty certain that it just counts as one 
character for indexing, so length would almost need to also count the 
number of characters rather than bytes.
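
The experiment mentioned above could be sketched roughly as follows.  This 
is only a minimal sketch, assuming a current D compiler with Phobos; the 
helper names codeUnits and codePoints are made up for illustration and are 
not part of any library.

```d
import std.range : walkLength;
import std.stdio : writeln;

// Hypothetical helper: number of UTF-8 code units (bytes) in the string.
size_t codeUnits(string s) { return s.length; }

// Hypothetical helper: number of code points, found by decoding the UTF-8.
size_t codePoints(string s) { return s.walkLength; }

void main()
{
    // U+00E9 (e with acute accent) takes two bytes in UTF-8.
    string s = "h\u00E9llo";

    writeln("length:      ", s.length);              // UTF-8 code units
    writeln("sizeof:      ", s.sizeof);              // size of the slice itself (pointer + length)
    writeln("bytes:       ", s.length * char.sizeof); // storage used by the contents
    writeln("code points: ", s.walkLength);          // decoded characters
    writeln("s[1]:        ", cast(ubyte) s[1]);      // indexing yields a single code unit
}
```

Comparing the printed values for a pure-ASCII string and for a string with 
multi-byte characters should show which of these properties follows the 
byte count and which follows the character count.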

Sizeof *should* be the correct property, and I've been assuming that it 
is, but I'm a bit afraid that I'll run across some unexpected character 
and it won't act the way I think it should.  And the documentation reads 
ambiguously.

Does anyone just *know* the answer?  (And if so, could they make the 
documentation explicit?)

