interfacing with C: strings and byte vectors

Mike Parker via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sat Jun 11 03:26:17 PDT 2016


On Saturday, 11 June 2016 at 09:32:54 UTC, yawniek wrote:

>
> so far i defined vec_t as:
>
> struct vec_t {
>     char *base;
>     size_t len;
>
>     this(string s) { base = s.ptr; len = s.length; }
>
> nothrow @nogc inout(char)[] toString() inout @property { return 
> base[0 .. len]; }
>
>     nothrow @nogc @property const(char)[]  toSlice()
>     {
>         return cast(string)  base[0..len];
>     }
>     alias toString this;
> }
>
> but i guess there is a more elegant way?!

No, you've got the right idea with the constructor. You just need 
to change your implementation.

You have two big problems with your current constructor. First, 
only string literals in D (e.g. "foo") are guaranteed to be null 
terminated. If you construct a vec_t with a string that is not a 
literal, then you can have a problem on the C side where it is 
expected to be null terminated. Second, if the string you pass to 
the constructor was allocated on the GC heap and at some point is 
collected, then the base member of the vec_t will point to an invalid 
memory location.
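
As a rough illustration of the first problem, the sketch below uses 
strlen from core.stdc.string to stand in for any C function that 
expects a terminator. The variable names are only for illustration.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

unittest
{
    string whole = "hello world";   // literal: a terminator follows the last char
    string part  = whole[0 .. 5];   // "hello", with no terminator of its own

    // C keeps reading past the end of the slice until it reaches the
    // terminator that belongs to the original literal.
    assert(part.length == 5);
    assert(strlen(part.ptr) == 11);

    // toStringz hands back a pointer to properly terminated data.
    assert(strlen(part.toStringz) == 5);
}
```

Compile with -unittest -main to run it. With a string that does not 
come from a literal at all, there may be no terminator anywhere 
nearby, and the C side reads whatever happens to follow.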

A simple approach would be:

this(string s)
{
     import std.string : toStringz;
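     // toStringz returns immutable(char)*, so a cast is needed to
     // store the result in the mutable char* member.
     base = cast(char*) s.toStringz();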
     len = s.length;
}
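
One caveat: toStringz may itself return memory allocated on the GC 
heap, so if the C library keeps the pointer somewhere the GC does 
not scan, the lifetime problem can still bite. A possible 
workaround, sketched here on the assumption that you are willing to 
manage the memory yourself, is to copy the string into malloc'd 
memory. makeVec and freeVec are made-up names, and vec_t is trimmed 
down to its two fields.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy, strlen;

struct vec_t
{
    char* base;
    size_t len;
}

// Hypothetical helper: copies s into C-heap memory and terminates it.
vec_t makeVec(const(char)[] s) nothrow @nogc
{
    auto p = cast(char*) malloc(s.length + 1);
    if (p is null)
        return vec_t(null, 0);      // allocation failed
    if (s.length)
        memcpy(p, s.ptr, s.length);
    p[s.length] = '\0';             // the terminator C expects
    return vec_t(p, s.length);
}

// Hypothetical helper: releases what makeVec allocated.
void freeVec(ref vec_t v) nothrow @nogc
{
    free(v.base);
    v = vec_t(null, 0);
}

unittest
{
    auto v = makeVec("abc");
    assert(v.len == 3);
    assert(strlen(v.base) == 3);
    freeVec(v);
    assert(v.base is null);
}
```

The trade-off is that the GC no longer helps you: every makeVec 
needs a matching freeVec once the C side is done with the pointer.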

Next, your toString and toSlice implementations are potentially 
problematic. Aside from the fact that toString is returning a 
slice and toSlice is returning a string, the slice you create 
there will always point to the same location as base. If base 
becomes invalid, then so will the slice.

For the toString implementation, you really should be doing 
something like this:

string toString() nothrow inout {
     import std.conv : to;
     return to!string(base);
}
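
A small sketch of what that buys you, using a local buffer in place 
of the real base member (buf and copy are illustrative names). Note 
that to!string on a char* reads up to the first null terminator and 
does not look at len, so it assumes the data is terminated.

```d
import std.conv : to;

unittest
{
    char[6] buf = "hello\0";        // stand-in for memory owned elsewhere
    char* base = buf.ptr;

    string copy = to!string(base);  // copies everything before the terminator
    assert(copy == "hello");

    buf[0] = 'j';                   // the original memory changes...
    assert(copy == "hello");        // ...but the copy is unaffected
}
```

If the data might contain embedded nulls or lack a terminator, 
base[0 .. len].idup is an alternative that relies on len instead.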

I don't know what benefit you are expecting from the toSlice 
method. If you absolutely need it, you should implement it 
like so:

char[] toSlice() nothrow inout {
     return base[0 .. len].dup;
}

This will give you a character array that is modifiable without 
worrying about what happens to base. If you really want, you 
can declare the return as const(char)[].
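
To show the difference between a plain slice and a .dup'd one, here 
is a minimal sketch with a local array standing in for the memory 
behind base (storage, view and copy are illustrative names):

```d
unittest
{
    char[5] storage = "hello";
    char* base = storage.ptr;
    size_t len = storage.length;

    char[] view = base[0 .. len];       // aliases the same memory as base
    char[] copy = base[0 .. len].dup;   // independent copy on the GC heap

    storage[0] = 'j';
    assert(view == "jello");            // the view follows the original
    assert(copy == "hello");            // the copy does not
}
```

The copy costs an allocation, which is why a method that returns it 
cannot be marked @nogc.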

Rather than toSlice, which isn't something the compiler is aware 
of, it would be much more appropriate to implement opSlice along 
with opDollar. Then it's possible to do this:

auto slice = myVec[0 .. $];

Of course, if you don't want to allow arbitrary slices, then 
perhaps toSlice is better.
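
A minimal sketch of what opSlice and opDollar could look like, with 
vec_t trimmed down to its two fields. This version returns views 
into base rather than copies, which is an assumption on my part; 
append .dup (and drop @nogc) if you want the slices to be 
independent of base, as discussed for toSlice.

```d
struct vec_t
{
    char* base;
    size_t len;

    // Lets `$` inside a slice expression on a vec_t mean `len`.
    @property size_t opDollar() const nothrow @nogc { return len; }

    // myVec[] : the whole range
    inout(char)[] opSlice() inout nothrow @nogc
    {
        return base[0 .. len];
    }

    // myVec[lo .. hi] : a sub-range, checked against len
    inout(char)[] opSlice(size_t lo, size_t hi) inout nothrow @nogc
    {
        assert(lo <= hi && hi <= len, "slice out of range");
        return base[lo .. hi];
    }
}

unittest
{
    char[5] storage = "hello";
    auto myVec = vec_t(storage.ptr, storage.length);

    auto slice = myVec[0 .. $];
    assert(slice == "hello");
    assert(myVec[1 .. 3] == "el");
    assert(myVec[] == "hello");
}
```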


