length of string result not as expected

jicman cabrera at wrc.xerox.com
Tue Aug 13 20:30:57 PDT 2013


On Wednesday, 14 August 2013 at 03:16:08 UTC, Jeremy DeHaan wrote:
> On Wednesday, 14 August 2013 at 02:53:43 UTC, jicman wrote:
>>
>> Greetings.
>>
>> import std.stdio;
>>
>> void main()
>> {
>>  char[] str = "不良反應事件和產品客訴報告"; // 13 chinese characters...
>>  writefln(str.length);
>> }
>>
>> This program prints 39, but I expected 13.  How do I find 
>> the number of characters stored in a char[] variable?  
>> Thanks.
>>
>> josé
>
> What version of DMD are you using? This code doesn't even 
> compile for me. It gives me errors about not being able to 
> convert type string to char[], like it should since a string 
> literal is immutable data. To test the code I changed char[] to 
> string. I also got an error for "writefln(str.length);" so I 
> just changed that to "writeln(str.length);"
>
> Anyways, from what I understand, the reason you get this is 
> because each of those characters needs more than a single 
> 8-bit code unit. D's chars are UTF-8 code units, so it takes 
> more than one char to store the data needed to represent one 
> of the Chinese characters (three each, in this case). 
> str.length gives you the number of chars the string contains, 
> not the number of characters. You have 13 characters in your 
> string, but you need 39 chars to store them.
>
> Alternatively, you can use a different encoding to see the 
> actual number of characters in your string, e.g. wstring 
> (UTF-16) or dstring (UTF-32). Personally, I usually use 
> dstrings when working with Unicode.
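
To make the quoted explanation concrete, here is a minimal sketch, assuming a D2 compiler (as the reply above does, with string and writeln). It stores the same literal as UTF-8, UTF-16, and UTF-32 and prints each .length. The variable names s8, s16, and s32 are made up for illustration.

```d
import std.stdio;

void main()
{
    // The same 13 Chinese characters in three encodings.
    string  s8  = "不良反應事件和產品客訴報告";   // UTF-8 code units (char)
    wstring s16 = "不良反應事件和產品客訴報告"w;  // UTF-16 code units (wchar)
    dstring s32 = "不良反應事件和產品客訴報告"d;  // UTF-32 code units (dchar)

    writeln(s8.length);   // 39: three chars per character
    writeln(s16.length);  // 13: one wchar per character here
    writeln(s32.length);  // 13: one dchar per code point
}
```

The wstring count matches only because none of these characters needs a UTF-16 surrogate pair. dstring is the one whose length always equals the number of code points.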

This is D1. I forgot to mention that.  I am still in the old ages. 
:-)  Thanks for the insight.  I figured that much, but I now need 
to go and figure out what to do with Western character sets as 
well as Asian, Hebrew, etc.  Thanks.
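
For mixed Western, Hebrew, and Asian text held in a char[], one option that does not require changing the storage type is to let foreach decode the UTF-8 into dchar values and count them. The sketch below is written for D2; countCodePoints is a hypothetical helper name, not a library function. Under D1 the same loop should work with a char[] parameter and writefln("%d", ...) in place of writeln, but treat that as an untested assumption.

```d
import std.stdio;

// Hypothetical helper: counts code points by letting foreach
// decode the UTF-8 code units into dchar values.
size_t countCodePoints(const(char)[] s)
{
    size_t n = 0;
    foreach (dchar c; s)  // one iteration per decoded code point
        ++n;
    return n;
}

void main()
{
    writeln(countCodePoints("hello"));                      // 5
    writeln(countCodePoints("שלום"));                       // 4
    writeln(countCodePoints("不良反應事件和產品客訴報告"));  // 13
}
```

This counts code points, which is not always what a reader sees as one character: a letter followed by a combining mark is two code points.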

