odd behavior of split() function

Bedros 2bedros at gmail.com
Fri Jun 7 00:53:07 PDT 2013


First of all, many thanks for the quick reply.

I'm learning D, and it was purely out of habit that I unconsciously
used printf instead of writef.

Thanks again.

-Bedros

On Friday, 7 June 2013 at 07:29:48 UTC, Jonathan M Davis wrote:
> On Friday, June 07, 2013 09:18:57 Bedros wrote:
>> I would like to split "A+B+C+D" into "A", "B", "C", "D"
>> 
>> but when using split() I get
>> 
>> "A+B+C+D", "B+C+D", "C+D", "D"
>> 
>> 
>> the code is below
>> 
>> 
>> import std.stdio;
>> import std.string;
>> import std.array;
>> 
>> int main()
>> {
>>       string [] str_list;
>>       string test_str = "A+B+C+D";
>>       str_list = test_str.split("+");
>>       foreach(item; str_list)
>>               printf("%s\n", cast(char*)item);
>> 
>>       return 0;
>> }
>
> That would be because of your misuse of printf. If you used
>
> foreach(item; str_list)
>     writeln(item);
>
> you would have been fine. D string literals do happen to have a null
> character one past their end so that you can pass them directly to C
> functions, but D strings in general are _not_ null terminated, and
> printf expects strings to be null terminated. If you want to convert a
> D string to a null terminated string, you need to use
> std.string.toStringz, not a cast. You should pretty much never cast a
> D string to char* or const char* or any variant thereof. So, you could
> have done
>
> printf("%s\n", toStringz(item));
>
> but I don't know why you'd want to use printf rather than writeln or
> writefln - both of which (unlike printf) are typesafe and understand D
> types.
>
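For reference, a minimal sketch of the options described above, assuming a current D compiler with Phobos. The helper name splitOnPlus is hypothetical and exists only for this illustration; the third printf call (the "%.*s" form with an explicit length) is not from the message above, just a common C idiom for printing a buffer that is not null terminated.

```d
import core.stdc.stdio : printf;
import std.array : split;
import std.stdio : writeln;
import std.string : toStringz;

// Hypothetical helper for this sketch: split a string on '+'.
string[] splitOnPlus(string s)
{
    return s.split("+");
}

void main()
{
    foreach (item; splitOnPlus("A+B+C+D"))
    {
        // Preferred: writeln is typesafe and respects the slice length.
        writeln(item);

        // If a C function really is needed, get a null-terminated pointer
        // via toStringz instead of casting.
        printf("%s\n", toStringz(item));

        // Alternative (an addition, not from the thread): hand printf the
        // length explicitly so it never reads past the slice.
        printf("%.*s\n", cast(int) item.length, item.ptr);
    }
}
```

Each of the three calls should print only the single field, since none of them relies on a null character happening to follow the slice in memory.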
> You got
>
> "A+B+C+D", "B+C+D", "C+D", "D"
>
> because the original string (being a string literal) had a null
> character one past its end, and each of the strings returned by split
> was a slice of the original string, and printf blithely ignored the
> actual boundaries of the slice looking for the next null character
> that it happened to find in memory, which - because they were all
> slices of the same string literal - happened to be the end of the
> original string literal. And the strings printed differed, because
> each slice started in a different portion of the underlying array.
>
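The slicing explanation above can be checked with a small sketch. It assumes, as the message says, that split on a string returns slices of the original rather than copies; the helper name offsetIn is hypothetical.

```d
import std.array : split;
import std.stdio : writefln;

// Hypothetical helper for this sketch: byte offset of `part` inside
// `whole`, valid only when `part` is a slice of `whole`.
ptrdiff_t offsetIn(string whole, string part)
{
    return part.ptr - whole.ptr;
}

void main()
{
    string testStr = "A+B+C+D";

    foreach (item; testStr.split("+"))
    {
        // Each slice has its own start and length but shares the
        // original string's memory.
        writefln("%s: offset %s, length %s",
                 item, offsetIn(testStr, item), item.length);
    }
}
```

If the slices do share the literal's memory, the offsets come out as 0, 2, 4 and 6, each with length 1. A C function that ignores the length and scans forward from those offsets for a null character would therefore see "A+B+C+D", "B+C+D", "C+D" and "D", which matches the output in the original question.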
> - Jonathan M Davis


