toStringz or not toStringz
Steven Schveighoffer
schveiguy at yahoo.com
Tue Jul 12 07:59:58 PDT 2011
On Tue, 12 Jul 2011 10:50:07 -0400, Regan Heath <regan at netmail.co.nz>
wrote:
> On Tue, 12 Jul 2011 15:18:04 +0100, Steven Schveighoffer
> <schveiguy at yahoo.com> wrote:
>
>> On Tue, 12 Jul 2011 09:54:15 -0400, Regan Heath <regan at netmail.co.nz>
>> wrote:
>>
>>> On Fri, 08 Jul 2011 18:59:47 +0100, Walter Bright
>>> <newshound2 at digitalmars.com> wrote:
>>>
>>>> On 7/8/2011 4:53 AM, Regan Heath wrote:
>>>>> On Fri, 08 Jul 2011 10:49:08 +0100, Walter Bright
>>>>> <newshound2 at digitalmars.com>
>>>>> wrote:
>>>>>
>>>>>> On 7/8/2011 2:26 AM, Regan Heath wrote:
>>>>>>> Why can't we have the
>>>>>>> compiler call it automatically whenever we pass a string, or
>>>>>>> char[] to an extern
>>>>>>> "C" function, where the parameter is defined as char*?
>>>>>>
>>>>>> Because char* in C does not necessarily mean "zero terminated
>>>>>> string".
>>>>>
>>>>> Sure, but in many (most?) cases it does. And in those cases where it
>>>>> doesn't you
>>>>> could argue ubyte* or byte* should have been used in the D extern "C"
>>>>> declaration instead. Plus, in those cases, worst case scenario, D
>>>>> passes an
>>>>> extra \0 byte to those functions which either ignore it because they
>>>>> were also
>>>>> passed a length, or expect a fixed sized structure, or .. I don't
>>>>> know what as I
>>>>> can't imagine another case where char* would be used without it
>>>>> being a "zero
>>>>> terminated string", or passing/knowing the length ahead of time.
>>>>
>>>> In the worst case, you're adding an extra memory allocation and
>>>> function call overhead (that is hidden to the user, and not
>>>> turn-off-able). This is not acceptable when interfacing to C.
>>>
>>> This worst case only happens when:
>>> 1. The extern "C" function takes a char* and is NOT expecting a "zero
>>> terminated string".
>>> 2. The char[], string, etc being passed is a fixed length array, or a
>>> slice which has no available space left for the \0.
>>>
>>> So, it's rare. I would guess less than 1% of cases for general
>>> programming.
>>
>> What if the function is expected to write to the buffer, and the
>> compiler just made a copy of it? Won't that be pretty surprising?
>
> Assuming a C function in this form:
>
> void write_to_buffer(char *buffer, int length);
No, assuming a C function in this form:
void ucase(char* str);
Essentially, a C function which takes a writable already-null-terminated
string, and writes to it.
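To make the concern concrete, here is a minimal C sketch. It assumes a simple ASCII implementation of ucase (the thread only gives its declaration) and adds a hypothetical helper, ucase_via_hidden_copy, which stands in for what an implicit toStringz-style conversion would do when it has to copy the data to make room for the terminator. Neither body comes from the thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumed implementation of the ucase() declared above: upper-cases a
   NUL-terminated string in place. */
void ucase(char *str)
{
    assert(str != NULL);
    for (; *str != '\0'; ++str)
        *str = (char)toupper((unsigned char)*str);
}

/* Hypothetical stand-in for an implicit conversion that copies: duplicate
   len bytes, append '\0', call ucase() on the copy, then discard the copy.
   Returns 0 on success, -1 if the allocation fails. */
int ucase_via_hidden_copy(const char *data, size_t len)
{
    assert(data != NULL);
    char *tmp = malloc(len + 1);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, data, len);
    tmp[len] = '\0';
    ucase(tmp);   /* the writes land in tmp ... */
    free(tmp);    /* ... and are lost here */
    return 0;
}
```

Calling ucase directly on a terminated buffer changes that buffer. Going through the copying path leaves the caller's data untouched, so the call appears to succeed while doing nothing visible, which is the surprise in question.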
> You might initially extern it as:
>
> extern "C" void write_to_buffer(char *buffer, int length);
>
> And, you could call it one of 2 ways (legitimately):
>
> char[] foo = new char[100];
> write_to_buffer(foo, foo.length);
>
> or:
>
> char[100] foo;
> write_to_buffer(foo, foo.length);
>
> and in both cases, toStringz would do nothing as foo is zero terminated
> already (in both cases), or am I wrong about that?
Neither case is required to be null terminated. The only thing that
guarantees null termination is a string literal. Even "abc".dup is not
guaranteed to be null terminated. For an actual example, try
"012345678901234".dup. This should have a 0x0f right after the last
character.
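As a rough model of the two calls quoted above, seen from the C side: D default-initializes char elements to 0xFF, not 0, so a fresh 100-element buffer contains no terminator at all. The sketch below imitates that with memset. The helper names has_terminator and fill_like_d_char_init are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if a '\0' occurs within the first len bytes of buf, else 0.
   Only the len bytes the caller owns are inspected. */
int has_terminator(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\0', len) != NULL;
}

/* Fills buf with 0xFF, the value of char.init in D, to model a freshly
   declared `char[100] foo;` as the C callee would see it. */
void fill_like_d_char_init(char *buf, size_t len)
{
    assert(buf != NULL);
    memset(buf, 0xFF, len);
}
```

A buffer filled this way reports no terminator within its 100 bytes, and the same holds for a 15-byte buffer holding exactly the characters 012345678901234. Whatever byte follows such data in memory belongs to the allocator, not to the caller.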
-Steve