odd behavior of split() function

Benjamin Thaut code at benjamin-thaut.de
Fri Jun 7 03:09:37 PDT 2013


On 07.06.2013 09:53, Bedros wrote:
> first of all, many thanks for the quick reply.
>
> I'm learning D and it's just because of the habit I unconsciously used
> printf instead of writef
>
> thanks again.
>
> -Bedros
>
> On Friday, 7 June 2013 at 07:29:48 UTC, Jonathan M Davis wrote:
>> On Friday, June 07, 2013 09:18:57 Bedros wrote:
>>> I would like to split "A+B+C+D" into "A", "B", "C", "D"
>>>
>>> but when using split() I get
>>>
>>> "A+B+C+D", "B+C+D", "C+D", "D"
>>>
>>>
>>> the code is below
>>>
>>>
>>> import std.stdio;
>>> import std.string;
>>> import std.array;
>>>
>>> int main()
>>> {
>>>       string [] str_list;
>>>       string test_str = "A+B+C+D";
>>>       str_list = test_str.split("+");
>>>       foreach(item; str_list)
>>>               printf("%s\n", cast(char*)item);
>>>
>>>       return 0;
>>> }
>>
>> That would be because of your misuse of printf. If you used
>>
>> foreach(item; str_list)
>>     writeln(item);
>>
>> you would have been fine. D string literals do happen to have a null
>> character one past their end so that you can pass them directly to C
>> functions, but D strings in general are _not_ null terminated, and printf
>> expects strings to be null terminated. If you want to convert a D string
>> to a null terminated string, you need to use std.string.toStringz, not a
>> cast. You should pretty much never cast a D string to char* or const char*
>> or any variant thereof. So, you could have done
>>
>> printf("%s\n", toStringz(item));
>>
>> but I don't know why you'd want to use printf rather than writeln or
>> writefln - both of which (unlike printf) are typesafe and understand D
>> types.
>>
>> You got
>>
>> "A+B+C+D", "B+C+D", "C+D", "D"
>>
>> because the original string (being a string literal) had a null character
>> one past its end, and each of the strings returned by split was a slice of
>> the original string, and printf blithely ignored the actual boundaries of
>> the slice looking for the next null character that it happened to find in
>> memory, which - because they were all slices of the same string literal -
>> happened to be the end of the original string literal. And the strings
>> printed differed, because each slice started in a different portion of the
>> underlying array.
>>
>> - Jonathan M Davis
>

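The slicing Jonathan describes can be observed directly. Here is a minimal sketch (the variable names are mine, and it assumes, as he says, that split returns slices of the original string) that prints where each piece starts inside the original:

```d
import std.array : split;
import std.stdio : writeln;

void main()
{
    string testStr = "A+B+C+D";
    string[] parts = testStr.split("+");

    foreach (item; parts)
    {
        // Distance in bytes from the start of the original string.
        // If the pieces are slices of testStr, this should come out
        // as 0, 2, 4, 6 for "A", "B", "C", "D".
        ptrdiff_t offset = item.ptr - testStr.ptr;
        writeln(item, " length=", item.length, " offset=", offset);
    }
}
```

Each piece has length 1, but nothing in memory marks where it ends, which is why a bare %s keeps reading until the literal's terminating null.
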
You can use printf if you want to; the correct usage is not so nice, though:

string str = "test";
// %.*s expects the precision as an int, so cast the (size_t) length
printf("%.*s", cast(int) str.length, str.ptr);
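
Applied to the original program, this could look roughly like the sketch below. It is only an illustration: the variable names are mine, and the cast to int assumes the pieces are short enough to fit in an int. It shows writeln next to the two printf variants mentioned in this thread (length plus pointer, and toStringz):

```d
import core.stdc.stdio : printf;
import std.array : split;
import std.stdio : writeln;
import std.string : toStringz;

void main()
{
    string testStr = "A+B+C+D";
    string[] parts = testStr.split("+");

    foreach (item; parts)
    {
        // Idiomatic D: writeln knows the length of the slice.
        writeln(item);

        // printf, variant 1: pass the length explicitly.
        // The '*' precision must be an int, hence the cast.
        printf("%.*s\n", cast(int) item.length, item.ptr);

        // printf, variant 2: let toStringz produce a
        // null-terminated string for the C function.
        printf("%s\n", toStringz(item));
    }
}
```

All three calls print only the piece itself, one per line, instead of running on to the end of the original literal.
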

Kind Regards
Benjamin Thaut

