toStringz or not toStringz

Tue Jul 12 07:59:58 PDT 2011

On Tue, 12 Jul 2011 10:50:07 -0400, Regan Heath <regan at netmail.co.nz>  
wrote:

> On Tue, 12 Jul 2011 15:18:04 +0100, Steven Schveighoffer  
> <schveiguy at yahoo.com> wrote:
>
>> On Tue, 12 Jul 2011 09:54:15 -0400, Regan Heath <regan at netmail.co.nz>  
>> wrote:
>>
>>> On Fri, 08 Jul 2011 18:59:47 +0100, Walter Bright  
>>> <newshound2 at digitalmars.com> wrote:
>>>
>>>> On 7/8/2011 4:53 AM, Regan Heath wrote:
>>>>> On Fri, 08 Jul 2011 10:49:08 +0100, Walter Bright  
>>>>> <newshound2 at digitalmars.com>
>>>>> wrote:
>>>>>
>>>>>> On 7/8/2011 2:26 AM, Regan Heath wrote:
>>>>>>> Why can't we have the
>>>>>>> compiler call it automatically whenever we pass a string, or  
>>>>>>> char[] to an extern
>>>>>>> "C" function, where the parameter is defined as char*?
>>>>>>
>>>>>> Because char* in C does not necessarily mean "zero terminated  
>>>>>> string".
>>>>>
>>>>> Sure, but in many (most?) cases it does. And in those cases where it  
>>>>> doesn't you
>>>>> could argue ubyte* or byte* should have been used in the D extern "C"
>>>>> declaration instead. Plus, in those cases, worst case scenario, D  
>>>>> passes an
>>>>> extra \0 byte to those functions which either ignore it because they  
>>>>> were also
>>>>> passed a length, or expect a fixed sized structure, or .. I don't  
>>>>> know what as I
>>>>> can't imagine another case where char* would be used without it  
>>>>> being a "zero
>>>>> terminated string", or passing/knowing the length ahead of time.
>>>>
>>>> In the worst case, you're adding an extra memory allocation and  
>>>> function call overhead (that is hidden to the user, and not  
>>>> turn-off-able). This is not acceptable when interfacing to C.
>>>
>>> This worst case only happens when:
>>> 1. The extern "C" function takes a char* and is NOT expecting a "zero  
>>> terminated string".
>>> 2. The char[], string, etc being passed is a fixed length array, or a  
>>> slice which has no available space left for the \0.
>>>
>>> So, it's rare.  I would guess less than 1% of cases for general  
>>> programming.
>>
>> What if you expect the function to write to the buffer, and the  
>> compiler just made a copy of it?  Won't that be pretty surprising?
>
> Assuming a C function in this form:
>
>    void write_to_buffer(char *buffer, int length);

No, assuming a C function of this form:

void ucase(char* str);

Essentially, a C function which takes a writable already-null-terminated  
string, and writes to it.
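
To make the surprise concrete, here is a rough C sketch of what happens if the argument is silently copied before the call. The body of ucase is only my guess at what such a function would do, and copy_with_terminator is a hypothetical stand-in for an implicit toStringz-style conversion, not anything the compiler or Phobos actually does.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumed implementation of ucase: upper-cases a NUL-terminated
   string in place. */
void ucase(char *str)
{
    for (; *str != '\0'; ++str)
        *str = (char)toupper((unsigned char)*str);
}

/* Hypothetical stand-in for an implicit conversion: copies len bytes
   into a fresh buffer and appends the terminator. */
char *copy_with_terminator(const char *data, size_t len)
{
    char *copy = malloc(len + 1);
    assert(copy != NULL);
    memcpy(copy, data, len);
    copy[len] = '\0';
    return copy;
}

/* Returns 1 if the caller's buffer was modified by the call,
   0 if the writes landed somewhere else. */
int caller_sees_change(int pass_copy)
{
    char original[] = "hello";
    char *arg = pass_copy ? copy_with_terminator(original, 5) : original;

    ucase(arg);

    int changed = strcmp(original, "HELLO") == 0;
    if (pass_copy)
        free(arg);
    return changed;
}
```

With pass_copy set to 0 the caller's buffer ends up upper-cased; with pass_copy set to 1 the function upper-cases the temporary and the caller's buffer is left untouched, with no error to indicate anything went wrong.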

> You might initially extern it as:
>
>    extern (C) void write_to_buffer(char *buffer, int length);
>
> And, you could call it one of 2 ways (legitimately):
>
>    char[] foo = new char[100];
>    write_to_buffer(foo, foo.length);
>
> or:
>
>    char[100] foo;
>    write_to_buffer(foo, foo.length);
>
> and in both cases, toStringz would do nothing as foo is zero terminated  
> already (in both cases), or am I wrong about that?

In neither case are they required to be null terminated.  The only thing  
that guarantees null termination is a string literal.  Even "abc".dup is  
not going to be guaranteed to be null terminated.  For an actual example,  
try "012345678901234".dup.  This should have a 0x0f right after the last  
character.
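
As a rough illustration in C terms, a D array is a pointer plus a length, and nothing says the byte after the last element is 0. The sketch below only models the layout I describe above (15 characters in a 16-byte block whose last byte is 0x0f); the struct and function names are made up for the example, and the real runtime layout may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a D slice: pointer plus length, with no
   terminator implied. */
struct slice {
    const char *ptr;
    size_t len;
};

/* Returns 1 if the byte just past the slice is a NUL. Only reads
   inside the block of block_size bytes that contains the slice. */
int followed_by_nul(struct slice s, const char *block, size_t block_size)
{
    assert(s.ptr >= block);
    size_t end = (size_t)(s.ptr - block) + s.len;
    return end < block_size && block[end] == '\0';
}

/* Models the .dup case: 15 characters in a 16-byte block, with 0x0f
   (assumed bookkeeping byte) in the last slot instead of a NUL. */
int dup_example_is_terminated(void)
{
    char block[16];
    memcpy(block, "012345678901234", 15);
    block[15] = 0x0f;

    struct slice s = { block, 15 };
    return followed_by_nul(s, block, sizeof block);
}

/* Models the literal case: the 16th byte is the terminator. */
int literal_example_is_terminated(void)
{
    static const char literal[] = "012345678901234";

    struct slice s = { literal, 15 };
    return followed_by_nul(s, literal, sizeof literal);
}
```

In this model the literal is followed by a NUL and the duplicated block is not, so a C function that scans for the terminator would read past the end of the second one.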

-Steve