Why are void[] contents marked as having pointers?
Vladimir Panteleev
thecybershadow at gmail.com
Sun May 31 14:56:59 PDT 2009
On Mon, 01 Jun 2009 00:28:21 +0300, Walter Bright <newshound1 at digitalmars.com> wrote:
> Vladimir Panteleev wrote:
>> I just realized that by "performance" you might have meant memory
>> leaks.
>
> No, in this context I meant improving performance by not scanning the
> void[] memory for pointers.
>> Well, sure, if you can say that my programs crashing every few
>> hours due to running out of memory is a "performance" problem. I'm
>> sorry to sound bitter, but this was the cause of much annoyance for
>> my software's users. I had to write a memory debugger to
>> understand that no matter how much you chase void[]s with
>> hasNoPointers, there will always be that one ~ which you overlooked.
>
> I'm curious what form of data you have that always seem to look like
> valid pointers. There are a couple other options you can pursue - moving
> the gc pool to another location in the address space, or changing the
> alignment of your void[] data so it won't look like aligned pointers
> (the gc won't look for misaligned pointers).
It's just compressed data, whose values are effectively evenly distributed across the 32-bit address space. Let's do the math:
Suppose we have an application which has two blocks of memory, M and N. Block M holds random data but is erroneously marked as having pointers, while block N is a block which shouldn't have any pointers towards it.
Now, the chance that a random DWORD will point inside N is sizeof(N)/0x100000000, so the chance that it will NOT point inside N is 1-(sizeof(N)/0x100000000). M holds sizeof(M)/4 DWORDs, so the chance that none of them points inside N is that value raised to the power sizeof(M)/4. Even for M and N as small as 1 MB each, it's pretty much guaranteed that some DWORD in M will point inside N. Relocating or re-aligning the data won't help: it affects neither the entropy nor the value range.
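To put numbers on the estimate above, here is a rough sketch in Python (not D, purely for the arithmetic). It assumes every DWORD in M is independent and uniformly distributed, and that any value landing anywhere inside N keeps N alive. The function names are made up for this illustration.

```python
import math

ADDRESS_SPACE = 2**32  # 32-bit address space, as in the argument above
WORD_SIZE = 4          # one DWORD

def prob_no_false_pointer(size_m, size_n):
    """Probability that none of the DWORDs in M lands inside N
    (assumes independent, uniformly distributed DWORDs)."""
    p_hit = size_n / ADDRESS_SPACE
    words = size_m // WORD_SIZE
    # (1 - p_hit) ** words, computed via log1p to stay accurate for tiny p_hit
    return math.exp(words * math.log1p(-p_hit))

def expected_false_pointers(size_m, size_n):
    """Average number of DWORDs in M that land inside N."""
    return (size_m // WORD_SIZE) * size_n / ADDRESS_SPACE

MIB = 1024 * 1024
p_none = prob_no_false_pointer(MIB, MIB)
print(f"P(no false pointer into N) = {p_none:.3g}")
print(f"expected false pointers    = {expected_false_pointers(MIB, MIB):.0f}")
```

For 1 MB blocks this is (1 - 1/4096)^262144, on the order of e^-64, with about 64 bogus pointers into N expected on average.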
> Or just use ubyte[] instead.
And the casts that come with it :(
>> As much as I try to look from an objective perspective, I don't see
>> how a memory leak (and memory leaks in D usually mean that NO memory
>> is being freed, except for small lucky objects not having bogus
>> pointers to them) is a problem less significant than an obscure case
>> that involves allocating a void[], storing a pointer in it and losing
>> all other references to the object.
>
> Because one is an obvious failure, and the other will be memory
> corruption. Memory corruption is pernicious and awful.
It is, yes. But if you add "don't put your only references inside void[]s" to the "don'ts" on the GC page, the programmer will only have himself to blame for not reading the language documentation. This goes right along with the other tricks listed there, IMHO.
>> In fact, I just searched the D
>> documentation and I couldn't find a statement saying whether void[]
>> are scanned by the GC or not. Enter Mr. D-newbie, who wants to write
>> his own network/compression/file-copying/etc. library/program and
>> stumbles upon void[], the seemingly perfect
>> abstract-binary-data-container type for the job... (which is exactly
>> what happened with yours truly).
>> P.S. Not trying to push my point of view, but just trying to offer
>> some perspective from someone who has been bitten by this design
>> choice...
>
> Hmm. Wouldn't compression data be naturally a ubyte[] type?
That's a subjective opinion :) I could just as well continue arguing that void[] is the perfect type for any kind of "opaque" binary data due to its properties.
--
Best regards,
Vladimir mailto:thecybershadow at gmail.com