More radical ideas about gc and reference counting

Mon May 12 04:14:10 PDT 2014

On 05/12/2014 02:50 AM, Walter Bright wrote:
> On 5/11/2014 1:59 PM, Timon Gehr wrote:
>> On 05/11/2014 10:05 PM, Walter Bright wrote:
>>> That's clearly an additional benefit of the borrowed pointer notion. But
>>> have you examined generated Rust code for the cost of inc/dec? I
>>> haven't, but I don't see any way they could avoid this (very expensive)
>>> cost without borrowed pointers.
>> Sure, but performance is the additional benefit.
>
> One constant theme in this thread, one I find baffling, is the regular
> dismissal of the performance implications of inc/dec.

Irrelevant, I'm not doing that.

(And again, reference counting is not the only allocation mechanism in 
Rust. AFAICT, most allocations use owned pointers.)

> Borrowed pointers are not necessary to support raw pointers  - this can be (and is in some
> systems) supported by simply wrapping the raw pointer with a dummy
> reference count.
> ...

I have no idea what this part is trying to bring across.

> The reason for borrowed pointers is performance.

No, it is safety. Raw pointers give you all of the performance.

> Rust would be non-viable without them.
>

True in that it would fail to meet its design goals. Rust provides a 
tracing garbage collector as well, so it is at least as viable as D 
regarding performance of safe memory management. (Probably more, 
actually, because it does not provide precision-unfriendly constructs 
such as undiscriminated unions.)

> I strongly suggest writing a snippet in [[insert your favorite proven
> technology RC language here]] and disassembling the result, and have a
> look at what inc/dec entails.
> ...

I don't have trouble seeing the cost of reference counting. (And it's 
you who claimed that this is going to be the only memory allocation 
scheme in use in Rust code.)

>
>>> The thing is, if the compiler is capable of figuring out these lifetimes
>>> by examining the code,
>>
>> There are explicit lifetime annotations in function signatures.
>
> Yes, because the compiler cannot figure it out itself, so the programmer
> has to annotate.
> ...

You are saying 'if the compiler is capable of figuring out these 
lifetimes by examining the code, then ...' and then you are saying that 
the compiler is incapable of figuring them out itself. What is it that 
we are arguing here? Are you saying the Rust compiler should infer all 
memory management automatically or that it cannot possibly do that, or 
something else?

>
>> It is simply not true that type systems are inherently restricted to
>> checking
>> trivial properties. They can be made as strong as mathematical logic
>> without
>> much fuss.
>
> Again, Rust would not need borrowed pointers nor the annotations for
> them if this knowledge could be deduced by the compiler. Heck, if the
> compiler can deduce lifetimes accurately,

It does not deduce anything. It checks that borrowed pointers do not 
outlive their source. Lifetime parameters are used to transport the 
required information across function signatures.

> you can get rid of GC and RC,
> and just have the compiler insert malloc/free in the right spots.
> ...

That's a form of GC, and I already acknowledged that global region 
inference exists, but noted that this is not what is used.

> Note that there is a Java version that does this partway, sometimes it
> will replace a GC object with a stack allocated one if it is successful
> in deducing that the object lifetime does not exceed the lifetime of the
> function.
> ...

I know. Also, inference is harder to control and less efficient than 
simply making the relevant information part of type signatures.

http://en.wikipedia.org/wiki/Region-based_memory_management#Region_inference

"This work was completed in 1995[9] and integrated into the ML Kit, a 
version of ML based on region allocation in place of garbage collection. 
This permitted a direct comparison between the two on medium-sized test 
programs, yielding widely varying results ("between 10 times faster and 
four times slower") depending on how "region-friendly" the program was; 
compile times, however, were on the order of minutes."

>
>>> Yes, one is & and the other is @.
>>
>> No, actually currently one is & and the other is RC<T> AFAIK.
>
> Then Rust changed again. The document I read on borrowed pointers was
> likely out of date, though it had no date on it.
> ...

Yes, most documents on Rust are at least slightly out of date.

>
>> RC<T> is not more general. It cannot refer to stack-allocated data,
>> for instance.
>
> So there is no general pointer type that has an unbounded lifetime?
> ...

How can it be general and have an unbounded lifetime and be safe?

>
>> Sure, borrowing is very lightweight, but ultimately what is most
>> important is
>> that it solves the problem of multiple incompatible pointer types and
>> makes the
>> type system more expressive as well.
>
> Adding more pointer types makes a type system more expressive, by
> definition.
> ...

No. Also, this is irrelevant, because I was highlighting the 
_importance_ of the fact that it does in this case.

>
>> A function that uses none of the specific pointer capabilities is more
>> general,
>> so what other choice of 'default' makes sense?
>
> A function that doesn't have restrictions on what can be done with the
> pointers passed to it.

You mean a function that only accepts GC-managed pointers?
(Not having _any_ restrictions at all does not make any sense.)
I don't see how this is better, because such a function is usable in 
less contexts.

> Borrowed pointers have restrictions on their
> usage - this is explicitly stated in the Rust documentation.
> ...

Values of every type have restrictions on their usage. If your function 
requires less capabilities from it's arguments, then it is usable in 
more cases. Why is this not the most sensible thing to do whenever possible?

>
>> Convenience and reusable functions means using borrowed pointers
>> whenever possible.
>
> Of course. And writing fast code means making your functions fast
> whenever possible!
> ...

What is your point here?

>
>>> It seems clear that the decisions of borrow/managed are going to pervade
>>> Rust code.
>>
>> But they are often obvious.
>
> I've written a lot of ref counted code in the past, enough to know that
> such is a pretty optimistic statement. Dealing with was a significant
> and ongoing drain on my time, especially when the programs and data
> structures got more complex.
> ...

You said you don't have any experience writing Rust code, and claimed 
that it is not a proven technology, so I fail to see how you can have 
any relevant experience with borrowing.

>...
>
>> Borrowed pointers are not even superficially similar to near*. They are
>> compatible with everything else, because they can store data that was
>> borrowed
>> from anywhere else.
>
> As long as those pointers don't escape. Am I right in that one cannot
> store a borrowed pointer into a global data structure?

Sure. If you are going to escape a pointer, you need to think about how 
that memory is managed. That's how it is. And there are many options: 
You could simply transfer ownership, you might use reference counting, 
or a tracing GC.

> The similarity is
> that there are one way conversions from one to the other, and one of the
> types is more general.

And the difference is that the one you can convert to is faster here.

> I infer from your other statements about Rust
> that it doesn't actually have a general pointer type.

Memory can be borrowed no matter how it was allocated. That's what I 
call general. How do you fill that term?