Array efficiency & rendering buffers

Derek Parnell derek at psych.ward
Wed Jun 7 16:00:53 PDT 2006


On Thu, 08 Jun 2006 08:08:20 +1000, Henrik Eneroth  
<Henrik_member at pathlink.com> wrote:



> Could I use plain arrays to achieve this functionality, or will such an
> implementation make it crawl at a snails pace compared to it's older  
> C++ cousin?
> Passing an array to a function and then using its return would involve  
> quite a
> bit of copying, no? Should I pass around pointers to arrays, or maybe  
> just plain
> pointers?

In D, variable-length arrays are always passed by reference. You cannot  
pass such an array by value, even though the syntax *looks* like that's  
what you are doing. A variable-length array has two components: the array  
data and the array descriptor (or reference). The array data is simply the  
contiguous RAM needed to hold all the elements, and it is allocated on the  
heap. The descriptor is a two-member struct-like entity of eight bytes (on  
a 32-bit system) ...

   {
      int Length;
      void *Data;
   }
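
As a rough check of the descriptor layout, here is a minimal sketch that assumes a current D2 compiler (where string literals are immutable, hence the .dup). The variable names are made up for illustration. The .length and .ptr properties expose the two members, and the descriptor size comes out as two machine words, so eight bytes on a 32-bit target and sixteen on a 64-bit one.

```d
import std.stdio;

void main()
{
    // .dup gives a mutable copy; D2 string literals are immutable.
    char[] text = "some data".dup;

    // The descriptor is a (length, pointer) pair: two machine words.
    static assert((char[]).sizeof == size_t.sizeof + (void*).sizeof);

    writefln("length = %s, ptr = %s, descriptor size = %s bytes",
             text.length, cast(void*) text.ptr, (char[]).sizeof);

    // Assignment copies only the descriptor; the data block is shared.
    char[] other = text;
    assert(other.ptr is text.ptr);
    assert(other.length == text.length);
}
```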

When you call a function using the array name as an argument, the compiler  
is actually passing the descriptor and not the data itself. Same with  
returning an array.

   char[] String = "some data in a char array";
   char[] Result = "init value";
   Result = foo(String);

In this case, the 'foo' function receives the array descriptor for  
'String' and returns an array descriptor that overwrites the current array  
descriptor for 'Result'. The original data referred to by Result is now  
unreferenced and will be automatically deallocated by the garbage  
collector at some stage.

   char[] foo(char[] arg)
   {
      for(int i = 0; i < arg.length; i++)
      {
         if (arg[i] == 'a')
              arg[i] = 'A';
      }
      return arg;
   }

As the 'foo' function receives an array descriptor, it is free to modify  
the data belonging to the array. However, because 'arg' is passed as an  
'in' parameter (the default), any changes 'foo' makes to the 'arg'  
descriptor itself are not passed back to the caller. As it stands, once  
this example 'foo' has been called, the String and Result variables both  
reference the exact same piece of RAM (data elements).

   Result[0] = 'Q';
   writefln("%s", String);

  ==> output is "Qome dAtA in A chAr ArrAy"
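
Put together as one program, the pieces above might look like the following. This is a sketch assuming a current D2 compiler, so the literal needs a .dup to become a mutable char[]; the loop is written with foreach, but it does the same thing as the 'foo' shown earlier.

```d
import std.stdio;

// Same behaviour as the in-place 'foo' above: it writes through the
// descriptor it was given and returns that same descriptor.
char[] foo(char[] arg)
{
    foreach (ref c; arg)
    {
        if (c == 'a')
            c = 'A';
    }
    return arg;
}

void main()
{
    // .dup gives a mutable copy; D2 string literals are immutable.
    char[] String = "some data in a char array".dup;
    char[] Result = foo(String);

    // Both descriptors reference the same data elements.
    assert(Result.ptr is String.ptr);

    Result[0] = 'Q';
    writefln("%s", String);   // prints "Qome dAtA in A chAr ArrAy"
}
```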

To avoid this effect, if it's undesirable, you need to implement the  
Copy-on-Write paradigm. This means that if 'foo' changes the data it  
should return a new array descriptor that now references a newly allocated  
piece of RAM.

   char[] foo(char[] arg)
   {
      bool IsModified = false;

      for(int i = 0; i < arg.length; i++)
      {
         if (arg[i] == 'a')
         {
              if (! IsModified)
              {
                // Allocate new ram, copy data and
                // force a new descriptor to be created
                arg = arg.dup;

                IsModified = true;
              }
              arg[i] = 'A';
         }
      }
      return arg;
   }

(Yes this is just an example and not very efficient, okay!)
Note that there is a simpler way to keep Result separate from String  
without CoW semantics, at the cost of always copying the data. (With the  
first version of 'foo', String itself is still changed in place.)

    Result = foo(String).dup;
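
To compare the two approaches side by side, here is a small sketch, again assuming a current D2 compiler. The names fooCow and fooInPlace are invented for this example so both versions of 'foo' can live in one file. The asserts show that the CoW version leaves the caller's data alone and only copies when it has to, while the .dup at the call site yields separate data but does not undo what the in-place version already did to the source.

```d
import std.stdio;

// Copy-on-write variant: duplicates the data just before the first change.
char[] fooCow(char[] arg)
{
    bool isModified = false;
    foreach (i; 0 .. arg.length)
    {
        if (arg[i] == 'a')
        {
            if (!isModified)
            {
                arg = arg.dup;      // rebinds only the local descriptor
                isModified = true;
            }
            arg[i] = 'A';
        }
    }
    return arg;
}

// In-place variant, as in the first example.
char[] fooInPlace(char[] arg)
{
    foreach (ref c; arg)
    {
        if (c == 'a')
            c = 'A';
    }
    return arg;
}

void main()
{
    char[] source = "some data".dup;

    char[] viaCow = fooCow(source);
    assert(source == "some data");        // caller's data untouched
    assert(viaCow == "some dAtA");
    assert(viaCow.ptr !is source.ptr);    // new data was allocated

    char[] plain = "xyz".dup;
    assert(fooCow(plain).ptr is plain.ptr);   // nothing changed, no copy

    char[] viaDup = fooInPlace(source).dup;
    assert(viaDup.ptr !is source.ptr);    // separate data
    assert(source == "some dAtA");        // but source was changed in place

    writeln(viaCow, " / ", viaDup);
}
```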

But if you want the 'foo' function to be able to change the array  
descriptor itself, such as the length of the array, and have the caller  
see that change, then you need to pass the array with the 'inout'  
qualifier. In that case the address of the array descriptor is passed and  
D will work with it indirectly.
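
A short sketch of the difference, with one caveat: in current D2 compilers this parameter qualifier is spelled 'ref', and 'inout' has since taken on a different meaning, so the example uses 'ref'. The function names truncateCopy and truncateRef are made up for illustration.

```d
import std.stdio;

// Default passing: the descriptor is copied, so shrinking it here
// only affects the local copy.
void truncateCopy(char[] arg)
{
    arg = arg[0 .. 4];
}

// By-reference passing: the caller's own descriptor is updated.
void truncateRef(ref char[] arg)
{
    arg = arg[0 .. 4];
}

void main()
{
    char[] text = "some data".dup;

    truncateCopy(text);
    assert(text.length == 9);   // caller's length is unchanged

    truncateRef(text);
    assert(text == "some");     // caller now sees the shorter array

    writeln(text);              // prints "some"
}
```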

So in summary, passing D arrays can be very quick and does not always  
involve data copying.

-- 
Derek Parnell
Melbourne, Australia


