Why are string literals zero-terminated?
awishformore
awishformore at gmail.com
Tue Jul 20 05:59:18 PDT 2010
Following this discussion on announce, I was wondering why string
literals are zero-terminated. Or to re-formulate, why only string
literals are zero-terminated. Why that inconsistency? What's the
rationale behind it? Does anyone know?
/Max
>>>> Did you test with a string that was not in the code itself, e.g. from a
>>>> config file?
>>>> String literals are null terminated so you wouldn't have had an issue if
>>>> all your strings were literals.
>>>> UTF-8 doesn't contain the string length, so you will run into problems
>>>> eventually.
>>>>
>>>> You have to use toStringz or your own null terminator. Unless of course
>>>> you know that the function will always be taking string literals. But
>>>> even then leaving something like that up to the programmer to remember
>>>> is not exactly foolproof.
>>>>
>>>> Enjoy.
>>>> ~Rory
>>>
>>> Hey again and thanks for the hint. I tried finding something on the DM
>>> page about string literals being null terminated and while the section
>>> about string literals didn't even mention it, it was said some place
>>> else.
>>>
>>> That explains why using string literals works even though I expected
>>> it to fail. It's indeed good to know and adding std.string.toStringz
>>> is probably a good idea ;). Thanks.
>>>
>>> Greetings, Max.
>>
>> Sure, I must admit it is annoying when the same code can do different
>> things just because of where the data came from. It would be easier to
>> notice the bug if D never added a null on literals, but then there would
>> also be a lot more usages of toStringz.
>>
>> I think if you want to test it you can do:
>> auto s = "blah";
>> open(s[0..$].dup.ptr); // duplicating it should put it somewhere else
>> // just slicing will not test
>
> When thinking about it, it makes sense to have string literals null
> terminated in order to have C functions work with them. However, I
> wonder about some stuff, for instance:
>
> string s = "string";
> // is s == "string\0" now?
> char[] c = cast(char[])s;
> // is c[6] == '\0' now?
> char* p = s.ptr;
> // is *(p+6) == '\0' now?
>
> I think use of the zero terminator should be consistent. Either make
> every string (and char[] for that matter) zero terminated in the
> underlying memory for backwards compatibility with C or leave it to the
> user in all cases.
>
> /Max
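The three questions above can be checked directly. What follows is only a minimal sketch, assuming the behaviour described in the thread (a '\0' stored just past a literal's data and not counted in .length); the helper name hasTrailingZero is made up for illustration.

```d
// Hypothetical helper: reads the byte just past the `len` characters at `p`.
// Only meaningful when that byte is known to exist, as with a string literal.
bool hasTrailingZero(const(char)* p, size_t len)
{
    return p[len] == '\0';
}

void main()
{
    string s = "string";

    // The terminator is not part of the slice, so s is not "string\0".
    assert(s.length == 6);
    assert(s != "string\0");

    // Through the raw pointer, the byte after the literal's data is '\0'.
    const(char)* p = s.ptr;
    assert(*(p + 6) == '\0');
    assert(hasTrailingZero(s.ptr, s.length));

    // c has the same length as s, so index 6 is out of bounds.
    // With bounds checking enabled, c[6] would raise a range error
    // instead of reading the terminator.
    char[] c = cast(char[]) s;
    assert(c.length == 6);
}
```

So under that assumption the answers would be: no, out of bounds, and yes. The terminator sits outside the array, which is why only raw pointer access ever sees it.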
Perhaps the null terminator is there because it's there in the executable
file? A zero byte also often follows a dynamic array, simply because D
always initializes memory, and an allocation is often larger than what was
requested, so the extra bytes remain zero.