toStringz or not toStringz

Regan Heath regan at netmail.co.nz
Tue Jul 12 08:41:56 PDT 2011


On Tue, 12 Jul 2011 15:59:58 +0100, Steven Schveighoffer  
<schveiguy at yahoo.com> wrote:

> On Tue, 12 Jul 2011 10:50:07 -0400, Regan Heath <regan at netmail.co.nz>  
> wrote:

>>> What if the function is expecting to write to the buffer, and the
>>> compiler just made a copy of it?  Won't that be pretty surprising?
>>
>> Assuming a C function in this form:
>>
>>    void write_to_buffer(char *buffer, int length);
>
> No, assuming C function in this form:
>
> void ucase(char* str);
>
> Essentially, a C function which takes a writable already-null-terminated  
> string, and writes to it.

Ok, that's an even better example for my case.

It would be used/called like...

   char[] foo;
   .. code which populates foo with something ..
   ucase(foo);

and in D today this would corrupt memory.  Unless the programmer  
remembered to write:

   ucase(toStringz(foo));

So, +1 for a compiler-inserted call to toStringz.

I am also assuming that if this idea were implemented it would handle
things intelligently.  For example, if the underlying array is out of room
when toStringz is called and needs to be reallocated, the compiler would
update the slice/reference 'foo' in the same way as it already does for an
append which triggers a reallocation.
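
To make that assumption concrete, here is a rough sketch in C of the behaviour I have in mind. Nothing here is real D runtime code: 'struct slice', 'slice_from' and 'slice_terminate' are hypothetical names invented for illustration, and 'ucase' is only a guess at what the C function from the example does. The point is just that the terminator is added in place when there is room, and that the slice itself is updated when the block has to move, so the caller still sees whatever the C function wrote.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a D slice: pointer, length, and block capacity. */
struct slice {
    char  *ptr;
    size_t len;
    size_t cap;
};

/* Guess at the C function from the thread: upper-cases a
 * NUL-terminated string in place. */
void ucase(char *str)
{
    for (; *str != '\0'; ++str)
        *str = (char)toupper((unsigned char)*str);
}

/* Ensure s->ptr[s->len] is '\0'. If the block is already full, grow it
 * and update the slice in place, the way an append that reallocates
 * would. Returns s->ptr, or NULL if the allocation fails. */
char *slice_terminate(struct slice *s)
{
    if (s->len == s->cap) {
        char *grown = realloc(s->ptr, s->cap + 1);
        if (grown == NULL)
            return NULL;
        s->ptr = grown;
        s->cap += 1;
    }
    s->ptr[s->len] = '\0';
    return s->ptr;
}

/* Build a slice whose block is exactly full, i.e. the worst case:
 * there is no spare byte for a terminator. */
struct slice slice_from(const char *text)
{
    struct slice s;
    s.len = strlen(text);
    s.cap = s.len;
    s.ptr = malloc(s.cap ? s.cap : 1);
    if (s.ptr != NULL)
        memcpy(s.ptr, text, s.len);
    else
        s.len = s.cap = 0;
    return s;
}
```

With this model, calling slice_terminate(&foo) and passing the result to ucase modifies foo's own storage rather than a throwaway copy, which is the property the "write to the buffer" objection is about.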

>> You might initially extern it as:
>>
>>    extern "C" void write_to_buffer(char *buffer, int length);
>>
>> And, you could call it one of 2 ways (legitimately):
>>
>>    char[] foo = new char[100];
>>    write_to_buffer(foo, foo.length);
>>
>> or:
>>
>>    char[100] foo;
>>    write_to_buffer(foo, foo.length);
>>
>> and in both cases, toStringz would do nothing as foo is zero terminated  
>> already (in both cases), or am I wrong about that?
>
> In neither case are they required to be null terminated.

True, but I was outlining the worst case scenario for my suggestion, not  
describing the real C function requirements.

In this particular case the extern "C" declaration (IMO) for this style of  
function should be one of:

   extern "C" void write_to_buffer(ubyte *buffer, int length);
   extern "C" void write_to_buffer(byte *buffer, int length);
   extern "C" void write_to_buffer(void *buffer, int length);

which would all be ignored by my suggestion.

> The only thing that guarantees null termination is a string literal.

string literals /and/ calling toStringz.

> Even "abc".dup is not going to be guaranteed to be null terminated.  For  
> an actual example, try "012345678901234".dup.  This should have a 0x0f  
> right after the last character.

Why 0x0f?  Does the allocator initialise array memory to its offset from
the start of the block or something?

I have just realised that char is initialised to 0xFF.  That is a problem,
as my two examples above would be arrays full of 0xFF, not \0, meaning
toStringz would have to reallocate to append \0 to them.  Drat.  That is
yet another reason to use ubyte or byte when interfacing with C.
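
A small C sketch of why the fill value matters. The helper names are made up for illustration; the only facts assumed are that D initialises char to 0xFF and ubyte to 0. A buffer filled with 0xFF contains no terminator anywhere, so terminating it needs one extra byte, while a zero-filled buffer is already terminated at its first byte.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf[0..len) contains a '\0' byte, otherwise 0. */
int has_terminator(const void *buf, size_t len)
{
    return memchr(buf, '\0', len) != NULL;
}

/* Fill a buffer the way a default-initialised D char array would look
 * (every byte 0xFF). Hypothetical helper, for illustration only. */
void fill_like_char_init(char *buf, size_t len)
{
    memset(buf, 0xFF, len);
}

/* Fill a buffer the way a default-initialised D ubyte array would look
 * (every byte 0). Hypothetical helper, for illustration only. */
void fill_like_ubyte_init(unsigned char *buf, size_t len)
{
    memset(buf, 0, len);
}
```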

Ok, how about going the other way.  Can we have something to decorate  
extern "C" function parameters to trigger an implicit call of toStringz on  
them?


