Request for pre-review: std.serialization/orange

Robert Jacques sandford at jhu.edu
Mon Oct 3 06:57:25 PDT 2011


On Mon, 03 Oct 2011 02:38:22 -0400, Jacob Carlborg <doob at me.com> wrote:
> On 2011-10-02 00:52, Robert Jacques wrote:
>> On Sat, 01 Oct 2011 07:18:59 -0400, Jacob Carlborg <doob at me.com> wrote:

[snip]

>>>> Also by the time archiving is called, isSliceOf should always return
>>>> false.
>>>
>>> Why is that?
>>
>> If isSliceOf can return true, then that means that the archive is
>> responsible for alias detection, management, etc.
>
> No, who says that. You can take this struct and use it outside of this
> library, it knows nothing about archiving or serialization. If the
> isSliceOf method should return false when archiving has been called I
> would need to add logic to detect when serialization and archiving has
> begun and ended.

So, in essence, you are saying that by the time archiving occurs, isSliceOf will always return false? Then why is it part of the public API?

>> That means that every
>> single archive format must implement an alias resolution algorithm.
>
> No, the serializer performs this task.

Okay.

[snip]

>>> I do. How would I otherwise discover if an array is a slice of another
>>> array or not?
>>
>> Okay, first some rational. Consider:
>>
>> assert(!a.isSliceOf(b));
>> assert(!b.isSliceOf(a));
>> assert( c.isSliceOf(a));
>> assert( c.isSliceOf(b));
>>
>> and
>>
>> class Foo {
>> float x;
>> float[3] point;
>> }
>>
>> void main() {
>> auto foo = new Foo;
>> auto ptr = &foo.x;
>> auto slice = point[0..2];
>> }
>>
>> In the first case, a, b and c are all slices of a common root array, but
>> the root array may not be serialized. In the second case, first you have
>> a pointer to the inside of an object and second you have a slice of a
>> static array inside an object, all three of which may be serialized
>> together. My impression from your API (so this might not be correct) is
>> that currently, you can't handle the above use cases. Even if you can,
>> an O(N^2) algorithm is rather inefficient.
>
> This is how it works: As the first step all arrays are serialized as
> regular arrays and not slices. After all serialization is done I loop
> over all arrays and check if they are a slice of some other array. If I
> found a match I replace the serialized array with a slice instead.
>
> These arrays are stored as an associative array with the type Array[Id].
> I don't know if there's a better data structure for this.

I presented one below.

>> The solution, in my mind, is to think in terms of memory blocks/chucks.
>> Every reference can be thought as pointing to a memory chunk defined by
>> two pointers and a flag:
>>
>> {
>> void* head; // Start of the memory chunk
>> void* tail; // End of the memory chunk
>> bool hasAliases; // True if there are more than one reference to this chunk
>> }
>>
>> For alias detection / resolution, you build a balanced tree of memory
>> chunks, widening the chunk and flagging hasAliases as appropriate. Which
>> should give you O(N log(N)) performance.
>
> I'm not sure I understand. That would require that the arrays are stored
> in a continues block of memory? Won't "head" and "tail" always point to
> start of the array and the end of the array?

Most of the time yes. But not all of the time. It would be fair to say that 'head' and 'tail' would be inside the GC memory region of the array and are <= or >= of the start/end of an array, respectively. More importantly, both objects and pointers would resolve to memory chunks as well and be included in the alias resolution algorithm. I'm assuming you currently separate object and array alias resolution and don't handle pointers at all.


More information about the Digitalmars-d mailing list