length of string result not as expected

Jeremy DeHaan dehaan.jeremiah at gmail.com
Tue Aug 13 20:16:04 PDT 2013


On Wednesday, 14 August 2013 at 02:53:43 UTC, jicman wrote:
>
> Greetings.
>
> import std.stdio;
>
> void main()
> {
>   char[] str = "不良反應事件和產品客訴報告"; // 13 chinese characters...
>   writefln(str.length);
> }
>
> this program returns 39.  I expected to return 13.  How do I 
> know the exact length of the characters that I have in a char[] 
> variable?  Thanks.
>
> josé

What version of DMD are you using? This code doesn't even compile 
for me. It gives me errors about not being able to convert type 
string to char[], like it should since a string literal is 
immutable data. To test the code I changed char[] to string. I 
also got an error for "writefln(str.length);" since writefln 
expects a format string as its first argument, so I just changed 
that to "writeln(str.length);"

Anyway, from what I understand, the reason you get 39 is that 
each of those characters needs more than a single 8-bit code 
unit. D's chars are UTF-8 code units, so it takes more than one 
char to store one of the Chinese characters (three each, in this 
case). str.length gives you the number of chars the string 
contains, not the number of characters. You have 13 characters 
in your string, but you need 39 chars to store the data that 
represents them.
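
As a rough illustration of the difference, here is a minimal sketch 
that prints both counts for the string from the original post. It 
assumes Phobos' std.range.walkLength, which walks a string by 
decoded code point rather than by char; this is just one way to 
count, not necessarily the best one for every case.

```d
import std.stdio;
import std.range : walkLength;

void main()
{
    string str = "不良反應事件和產品客訴報告"; // 13 Chinese characters

    // .length counts UTF-8 code units (chars), three per character here.
    writeln(str.length);     // prints 39

    // walkLength iterates the string as decoded code points (dchars).
    writeln(str.walkLength); // prints 13
}
```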

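Alternatively, you can use a different encoding to see the 
actual number of characters in your string, e.g. wstring or 
dstring. A dstring (UTF-32) stores one dchar per code point, so 
its length is 13 here. A wstring (UTF-16) also gives 13 for this 
text, but characters outside the Basic Multilingual Plane take 
two wchars each, so it is not reliable in general. I usually use 
dstrings when working with Unicode personally.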
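
A small sketch of the conversion, assuming std.conv.to is used to 
transcode the original UTF-8 string (the variable names are only 
for illustration):

```d
import std.conv : to;
import std.stdio;

void main()
{
    string  s = "不良反應事件和產品客訴報告"; // UTF-8
    wstring w = s.to!wstring;                 // transcoded to UTF-16
    dstring d = s.to!dstring;                 // transcoded to UTF-32

    writeln(s.length); // prints 39 (chars)
    writeln(w.length); // prints 13 (wchars; every character here is in the BMP)
    writeln(d.length); // prints 13 (dchars, one per code point)
}
```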

