Why are string literals zero-terminated?

awishformore awishformore at gmail.com
Tue Jul 20 05:59:18 PDT 2010


Following this discussion on announce, I was wondering why string
literals are zero-terminated. Or, to rephrase: why are only string
literals zero-terminated? Why that inconsistency? What's the
rationale behind it? Does anyone know?

/Max

 >>>> Did you test with a string that was not in the code itself, e.g. from a
 >>>> config file?
 >>>> String literals are null terminated, so you wouldn't have had an issue if
 >>>> all your strings were literals.
 >>>> UTF-8 data doesn't contain the string length, so you will run into
 >>>> problems eventually.
 >>>>
 >>>> You have to use toStringz or add your own null terminator, unless of
 >>>> course you know that the function will always be taking string literals.
 >>>> But even then, leaving something like that up to the programmer to
 >>>> remember is not exactly foolproof.
 >>>>
 >>>> Enjoy.
 >>>> ~Rory
 >>>
 >>> Hey again and thanks for the hint. I tried finding something on the DM
 >>> page about string literals being null terminated and while the section
 >>> about string literals didn't even mention it, it was said some place
 >>> else.
 >>>
 >>> That explains why using string literals works even though I expected
 >>> it to fail. It's indeed good to know and adding std.string.toStringz
 >>> is probably a good idea ;). Thanks.
 >>>
 >>> Greetings, Max.
 >>
 >> Sure, I must admit it is annoying when the same code can do different
 >> things just because of where the data came from. It would be easier to
 >> notice the bug if D never added a null on literals, but then there would
 >> also be a lot more usages of toStringz.
 >>
 >> I think if you want to test it you can do:
 >> auto s = "blah";
 >> open(s[0..$].dup.ptr); // duplicating it should put the data somewhere else
 >> // merely slicing will not test this, since the slice still points at the literal
 >
 > When thinking about it, it makes sense to have string literals null
 > terminated in order to have C functions work with them. However, I
 > wonder about some stuff, for instance:
 >
 > string s = "string";
 > // is s == "string\0" now?
 > char[] c = cast(char[])s;
 > // is c[6] == '\0' now?
 > char* p = s.ptr;
 > // is *(p+6) == '\0' now?
 >
 > I think use of the zero terminator should be consistent. Either make
 > every string (and char[] for that matter) zero terminated in the
 > underlying memory for backwards compatibility with C or leave it to the
 > user in all cases.
 >
 > /Max

Perhaps the zero byte is there because it's there in the executable file?
A zero byte also often follows a dynamic array simply because D always
initializes memory, and when you get an allocation, a larger amount is
often allocated, the rest of which remains zero.
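
For what it's worth, the three questions quoted above can be checked directly. The following is only a minimal sketch, assuming a reasonably recent DMD and Phobos; the helper name cLength is made up for illustration and is not part of any library.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

/// Hypothetical helper: the length C code sees when a D string
/// is handed over via toStringz.
size_t cLength(string s)
{
    return strlen(toStringz(s));
}

void main()
{
    string s = "string";

    // The terminator is not part of the slice, so the length is 6
    // and s does not compare equal to "string\0".
    assert(s.length == 6);
    assert(s != "string\0");

    // The zero byte sits just past the end of the literal's data.
    // (Indexing c[6] on a char[] view of s would be out of bounds
    // with bounds checking on, so it is not asserted here.)
    assert(*(s.ptr + 6) == '\0');

    // A slice of the literal still points into the same memory, so C
    // code reading through .ptr runs on to the literal's terminator.
    string head = s[0 .. 3];
    assert(strlen(head.ptr) == 6);

    // toStringz guarantees a terminator right after the slice's data.
    assert(cLength(head) == 3);
}
```

So the terminator is a property of where a literal is stored, not of the string type, which is why anything that is not known to be a literal should go through toStringz before being passed to C.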

