Why are void[] contents marked as having pointers?

Vladimir Panteleev thecybershadow at gmail.com
Sun May 31 14:56:59 PDT 2009


On Mon, 01 Jun 2009 00:28:21 +0300, Walter Bright <newshound1 at digitalmars.com> wrote:

> Vladimir Panteleev wrote:
>> I just realized that by "performance" you might have meant memory
>> leaks.
>
> No, in this context I meant improving performance by not scanning the  
> void[] memory for pointers.

>> Well, sure, if you can say that my programs crashing every few
>> hours due to running out of memory is a "performance" problem. I'm
>> sorry to sound bitter, but this was the cause of much annoyance for
>> my software's users. It took writing a memory debugger for me to
>> understand that no matter how much you chase void[]s with
>> hasNoPointers, there will always be that one ~ which you overlooked.
>
> I'm curious what form of data you have that always seem to look like  
> valid pointers. There are a couple other options you can pursue - moving  
> the gc pool to another location in the address space, or changing the  
> alignment of your void[] data so it won't look like aligned pointers  
> (the gc won't look for misaligned pointers).

It's just compressed data, so its values are spread evenly across the whole 32-bit range, and any DWORD in it can look like a valid address. Let's do the math:

Suppose we have an application with two blocks of memory, M and N. Block M holds random data but is erroneously marked as having pointers, while block N is a block that nothing should be pointing to.
Now, the chance that a random DWORD will point inside N is sizeof(N)/0x100000000, so the probability that it will NOT point inside N is 1-(sizeof(N)/0x100000000). M contains sizeof(M)/4 DWORDs, so the probability that none of them points inside N is that value raised to the power sizeof(M)/4. With M and N as small as 1 MB each, that probability is practically zero, so it's pretty much guaranteed that you'll have pointers inside N. Relocating or re-aligning the data won't help: it affects neither the entropy nor the value range.
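
To put numbers on the estimate above, here is a minimal sketch in Python (chosen only for convenience; nothing here is D or GC code). It assumes every DWORD in M is an independent, uniformly random 32-bit value and that a value anywhere inside N counts as a pointer to N. The function names are made up for this illustration.

```python
import math

ADDRESS_SPACE = 2 ** 32  # 32-bit address space, as assumed in the text


def false_pointer_probability(m_bytes: int, n_bytes: int) -> float:
    """Chance that at least one random DWORD in M points inside N."""
    words = m_bytes // 4  # number of DWORDs in M
    # P(no DWORD hits N) = (1 - n/2^32) ** words.
    # log1p/expm1 keep precision when n/2^32 is tiny.
    log_p_no_hit = words * math.log1p(-n_bytes / ADDRESS_SPACE)
    return -math.expm1(log_p_no_hit)


def expected_false_pointers(m_bytes: int, n_bytes: int) -> float:
    """Average number of DWORDs in M that happen to point inside N."""
    return (m_bytes // 4) * n_bytes / ADDRESS_SPACE


if __name__ == "__main__":
    MIB = 2 ** 20
    print(false_pointer_probability(MIB, MIB))
    print(expected_false_pointers(MIB, MIB))
```

Under these assumptions, 1 MB blocks give an expected 64 bogus pointers into N, and the chance of having none at all is on the order of e^-64, which is why the result is indistinguishable from certainty.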

> Or just use ubyte[] instead.

And the casts that come with it :(

>> As much as I try to look from an objective perspective, I don't see
>> how a memory leak (and memory leaks in D usually mean that NO memory
>> is being freed, except for small lucky objects not having bogus
>> pointers to them) is a problem less significant than an obscure case
>> that involves allocating a void[], storing a pointer in it and losing
>> all other references to the object.
>
> Because one is an obvious failure, and the other will be memory  
> corruption. Memory corruption is pernicious and awful.

It is, yes. But if you add "don't put your only references inside void[]s" to the "don'ts" on the GC page, the programmer will only have himself to blame for not reading the language documentation. This goes right along with other tricks IMHO.

>> In fact, I just searched the D
>> documentation and I couldn't find a statement saying whether void[]
>> are scanned by the GC or not. Enter mr. D-newbie, who wants to write
>> his own network/compression/file-copying/etc. library/program and
>> stumbles upon void[], the seemingly perfect
>> abstract-binary-data-container type for the job... (which is exactly
>> what happened with yours truly).
>>  P.S. Not trying to push my point of view, but just trying to offer
>> some perspective from someone who has been bit by this design
>> choice...
>
> Hmm. Wouldn't compression data be naturally a ubyte[] type?

That's a subjective opinion :) I could just as well continue arguing that void[] is the perfect type for any kind of "opaque" binary data due to its properties.

-- 
Best regards,
 Vladimir                          mailto:thecybershadow at gmail.com


