string to char array?

via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sat Jun 6 11:43:06 PDT 2015


On Saturday, 6 June 2015 at 17:31:15 UTC, Kyoji Klyden wrote:
> On Saturday, 6 June 2015 at 10:12:54 UTC, Marc Schütz wrote:
>>> ...
>>
>> Almost correct :-) The part of "has nothing left, so go back" 
>> is wrong. The call to _d_arraybounds doesn't return, because 
>> it throws an Error.
>>
>>> ...
>>
>> Yes, inside the `f` function, the compiler cannot know the 
>> length of the array during compilation. To keep you from 
>> accidentally accessing invalid memory (e.g. if the array has 
>> only two elements, but you're trying to access the third), it 
>> automatically inserts a check, and calls that runtime helper 
>> function to throw an Error if the check fails. .L.str is most 
>> likely the address of the error message or filename, and 55 is 
>> its length. The 5/6/7 values are the respective line numbers. 
>> You can disable this behaviour by compiling with `dmd 
>> -boundscheck=off`.
>>
>
> Thanks for the reply!
>
> so I'm a tad unsure of what exactly is happening in this asm, 
> mainly because I'm only roughly familiar with x86 instruction 
> set.
>
> _d_arraybounds throws an error because it can't access the 
> runtime? or because as you said the compiler can't know the 
> length of the array?

_d_arraybounds() always throws an error because that's its 
purpose. It's implemented here:
https://github.com/D-Programming-Language/druntime/blob/master/src/core/exception.d#L640

My point was that _d_arraybounds never returns; instead, it throws 
that Error object.

The compiler inserts the checks for the array length whenever you 
access an array element, _except_ if it can either prove that the 
array is always long enough (e.g. if it's a fixed-size array), in 
which case it can leave the check out because it's unnecessary, 
or if it can prove that the array is never long enough, in which 
case it may already print an error during compilation.
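
To make the three cases concrete, here is a minimal sketch. The body of 
`f` is my guess at what the original function looks like (it isn't shown 
in this message), and `throwsRangeError` is a made-up helper. Catching 
an Error is normally not something you should do; it's only here to make 
the check visible. This assumes a build with bounds checks enabled (no 
`-boundscheck=off`).

```d
import core.exception : RangeError;

// Hypothetical reconstruction of `f`: the length of `a` is only known at
// run time, so the compiler inserts a bounds check before each store.
void f(int[] a)
{
    a[0] = 0;
    a[1] = 1;
    a[2] = 1;
}

// Made-up helper: reports whether calling f on `a` trips a bounds check.
bool throwsRangeError(int[] a)
{
    try
    {
        f(a);
        return false;
    }
    catch (RangeError) // demonstration only; don't catch Errors in real code
    {
        return true;
    }
}

void main()
{
    int[3] fixed;
    fixed[2] = 1;       // length and index known statically: no check needed
    // fixed[3] = 1;    // rejected during compilation: index out of bounds

    assert(!throwsRangeError(new int[3])); // long enough, all checks pass
    assert(throwsRangeError(new int[2]));  // third store fails its check
}
```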

>
> for .L.str, 55 is the length of the address..?

No, the length of the string.

It's roughly the equivalent of this pseudo-code:

extern void _d_arraybounds(void* filename_ptr, size_t filename_len, size_t line);

void f(void* a_ptr, size_t a_length) {
     if(a_length == 0)
         goto LBB0_4;
     *cast(int*) a_ptr = 0;      // line 5
     if(a_length <= 1)
         goto LBB0_5;
     *cast(int*) (a_ptr+4) = 1;  // line 6
     if(a_length <= 2)
         goto LBB0_6;
     *cast(int*) (a_ptr+8) = 1;  // line 7
     return;
LBB0_4:
     // (pretend this filename is 55 chars long)
     static string __FILE__ = "/path/to/your/source/file.d";
     _d_arraybounds(__FILE__.ptr, __FILE__.length, 5 /* line number */);
LBB0_5:
     _d_arraybounds(__FILE__.ptr, __FILE__.length, 6 /* line number */);
LBB0_6:
     _d_arraybounds(__FILE__.ptr, __FILE__.length, 7 /* line number */);
}
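
The reason a pointer and a length show up as two separate values is 
that a D string is just a slice: a length plus a pointer to the first 
character. A small sketch to illustrate this (`isPtrPlusLength` is a 
made-up name, and the path is the placeholder from the pseudo-code 
above, not your real 55-character filename):

```d
// Made-up helper: checks that a string is fully described by its
// .ptr and .length properties.
bool isPtrPlusLength(string s)
{
    // Re-slicing the raw pointer with the stored length gives the same string.
    return s.ptr[0 .. s.length] == s;
}

void main()
{
    string file = "/path/to/your/source/file.d";

    // A slice occupies exactly one length field and one pointer.
    static assert(string.sizeof == size_t.sizeof + (void*).sizeof);

    assert(file.length == 27);   // the placeholder path is 27 chars long
    assert(isPtrPlusLength(file));
}
```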

>
>
>>> Also in the mov parts, is that moving 1 into the pointer or 
>>> into the rsi register? And is rsi + 4, still in rsi, or does 
>>> it move to a different register?
>>
>> It stores the `1` into the memory pointed to by `rsi`, or 
>> `rsi+4` etc. This is what the brackets [...] mean. Because 
>> it's an array of ints, and ints are 4 bytes in size, [rsi] is 
>> the first element, [rsi+4] the second, and [rsi+8] the third. 
>> `rsi+4` is just a temporary value that is only used during the 
>> store, it's not saved into a (named) register. This is a 
>> peculiarity of the x86 processors; they allow quite complex 
>> address calculations for memory accesses.
>
> Does the address just get calculated whenever the program using 
> this asm, then? :o

Yes, but it is extremely fast. I'm pretty sure accessing memory 
at [RSI] and [RSI+4] both take exactly the same time (but can't 
find a reference now).
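
You can see the same offsets from D itself. This is only a sketch 
(`byteOffset` is a made-up helper): it shows that element `i` of an 
`int[]` lives `i * 4` bytes after the start of the array, which is 
where the `[rsi+4]` and `[rsi+8]` in the asm come from. It says nothing 
about timing.

```d
// Made-up helper: distance in bytes from the start of the array to a[i].
size_t byteOffset(int[] a, size_t i)
{
    return cast(size_t) &a[i] - cast(size_t) a.ptr;
}

void main()
{
    static assert(int.sizeof == 4);

    int[] a = new int[3];
    assert(byteOffset(a, 0) == 0);  // [rsi]
    assert(byteOffset(a, 1) == 4);  // [rsi+4]
    assert(byteOffset(a, 2) == 8);  // [rsi+8]
}
```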

