odd behavior of split() function
Jonathan M Davis
jmdavisProg at gmx.com
Fri Jun 7 00:29:33 PDT 2013
On Friday, June 07, 2013 09:18:57 Bedros wrote:
> I would like to split "A+B+C+D" into "A", "B", "C", "D"
>
> but when using split() I get
>
> "A+B+C+D", "B+C+D", "C+D", "D"
>
>
> the code is below
>
>
> import std.stdio;
> import std.string;
> import std.array;
>
> int main()
> {
> string [] str_list;
> string test_str = "A+B+C+D";
> str_list = test_str.split("+");
> foreach(item; str_list)
> printf("%s\n", cast(char*)item);
>
> return 0;
> }
That would be because of your misuse of printf. If you used
    foreach(item; str_list)
        writeln(item);
you would have been fine. D string literals do happen to have a null character
one past their end so that you can pass them directly to C functions, but D
strings in general are _not_ null terminated, and printf expects strings to be
null terminated. If you want to convert a D string to a null terminated
string, you need to use std.string.toStringz, not a cast. You should pretty
much never cast a D string to char* or const char* or any variant thereof. So,
you could have done
    printf("%s\n", toStringz(item));
but I don't know why you'd want to use printf rather than writeln or writefln -
both of which (unlike printf) are typesafe and understand D types.
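To put the options above side by side, here is a minimal sketch of the original program with the printing fixed. It assumes a reasonably current D compiler and Phobos. The "%.*s" variant is an extra not mentioned above: it is standard C printf syntax for passing the length explicitly, and is only shown for the case where printf really has to be used.

```d
import core.stdc.stdio : printf;
import std.array : split;
import std.stdio : writeln;
import std.string : toStringz;

void main()
{
    string test_str = "A+B+C+D";

    foreach (item; test_str.split("+"))
    {
        // Type-safe: writeln knows the length of the slice.
        writeln(item);

        // toStringz yields a null-terminated string (copying if it has to),
        // so printf stops at the end of the slice.
        printf("%s\n", toStringz(item));

        // Assumption for illustration: pass the slice length explicitly
        // with "%.*s" instead of relying on a terminator at all.
        printf("%.*s\n", cast(int) item.length, item.ptr);
    }
}
```

All three calls print only the slice itself, so each of A, B, C and D shows up three times in a row.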
You got
"A+B+C+D", "B+C+D", "C+D", "D"
because the original string (being a string literal) had a null character one
past its end, and each of the strings returned by split was a slice of that
original string. printf blithely ignored the actual boundaries of each slice
and kept reading until it found the next null character in memory, which -
because they were all slices of the same string literal - happened to be the
one at the end of the original literal. The printed strings differed because
each slice started at a different position in the underlying array.
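If you want to see the slicing for yourself, the following is a rough sketch that prints where each slice starts inside the original string. sliceOffsets is a made-up helper for this illustration, not a Phobos function, and it relies on split returning slices of its input as described above.

```d
import std.array : split;
import std.stdio : writefln;

// Hypothetical helper: offset of each slice's first character
// relative to the start of the original string.
size_t[] sliceOffsets(string original, string[] parts)
{
    size_t[] offsets;
    foreach (part; parts)
        offsets ~= cast(size_t)(part.ptr - original.ptr);
    return offsets;
}

void main()
{
    string test_str = "A+B+C+D";
    auto parts = test_str.split("+");

    foreach (i, off; sliceOffsets(test_str, parts))
        writefln("%s: offset %s, length %s", parts[i], off, parts[i].length);
}
```

Each slice has length 1 but points into the same literal, so a function that ignores the length and scans for a null character runs from that offset all the way to the end of "A+B+C+D".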
- Jonathan M Davis