Memory/Allocation attributes for variables
Elmar
chrehme at gmx.de
Tue Jun 1 00:36:17 UTC 2021
Good questions :-)
On Monday, 31 May 2021 at 18:56:44 UTC, Ola Fosheim Grøstad wrote:
> On Monday, 31 May 2021 at 18:21:26 UTC, Elmar wrote:
>> I wonder how programming languages don't see the obvious, to
>> consider memory safety as a part of type safety
>> (address/allocation properties to be type properties) and that
>> memory unsafe code only means an incomplete type system.
>
> All high level programming-languages do. Only the low level
> don't, and that is one of the things what makes their type
> systems unsound.
I suppose you mean the "higher"-level languages (because C is, by
its original definition, also a high-level language). I also don't
know of any "higher"-level language that offers the flexibility of
constraining the value domain of a pointer/reference, except for
excluding `null` (non-nullable pointers are probably the simplest
domain constraint for pointers/references). I think not even Ada
or VHDL has it.
What I'd like to gain from those attributes is a guarantee that
the referenced value wasn't allocated in a certain address
region/scope and that it lives in a lifetime-compatible scope.
This can be detected by checking the pointer value against an
interval or a set of intervals. For example, a returned reference
to an integer could have been created with "malloc" or even a C++
allocator, and interfacing functions could annotate their
parameters with such attributes.
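To make the interval check concrete, here is a minimal sketch in D.
`AddrRange` and `inDomain` are names I made up for illustration, they
are not an existing or proposed D feature, and a fixed stack buffer
stands in for the arena of a custom allocator:

```d
/// Hypothetical half-open address interval [lo, hi).
struct AddrRange
{
    size_t lo, hi;

    /// True if the address of `p` lies inside this interval.
    bool contains(const(void)* p) const
    {
        immutable addr = cast(size_t) p;
        return lo <= addr && addr < hi;
    }
}

/// Value-domain check: is `p` inside at least one allowed interval?
bool inDomain(const(void)* p, const(AddrRange)[] allowed)
{
    foreach (r; allowed)
        if (r.contains(p))
            return true;
    return false;
}

unittest
{
    ubyte[64] pool; // assumption: stands in for an allocator's arena
    immutable base = cast(size_t) pool.ptr;
    auto arena = AddrRange(base, base + pool.length);

    int elsewhere;  // a different stack variable, outside the arena
    assert(inDomain(&pool[10], [arena]));
    assert(!inDomain(&elsewhere, [arena]));
}
```

A compiler-supported attribute could do the same comparison statically
where the allocation site is known and fall back to a check like this
at run time.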
With guarantees about the scope of arguments, function
implementations can avoid buggy reference assignments to outside
variables. A function might expect compatible references
allocated with the GC without the caller knowing it. Whether a
reference assignment is legitimate can be checked by comparing the
source attributes (the reference value, which says where the value
is allocated) with the destination attributes (where the reference
is stored in memory). Runtime checks of pointer values would give
an even better degree of memory safety, but only if the programmer
wants to use them. A reference assignment is legitimate if the
destination scope is compatible with the source's scope, and not
in any other case. I would suggest a lifetime rating for value
addresses as follows:
*peripheral > system/kernel > global shared > private global
(TLS) > extern global (TLS) > shared GC allocated > shared
dynamically allocated > GC allocated (TLS) > dynamically
allocated (TLS) <=> RAII/scoped/stack <=> RAII/scoped/stack >
register*
Heap regions are not always comparable to stack or RAII. So the
current practice of not allowing assignment to RAII references
(using `scope` attribute) is probably best to continue.
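The rating and the assignment rule could be sketched like this. The
enum member names, the numeric encoding and `mayStore` are assumptions
of this sketch, and the "<=>" between thread-local heap and stack is
modelled conservatively as "never compatible":

```d
/// Lifetime classes from the rating above, shortest-lived first.
/// Names and ordering-by-integer are assumptions of this sketch.
enum Lifetime
{
    cpuRegister,
    stackFrame,     // RAII/scoped/stack
    heapLocal,      // dynamically allocated (TLS)
    gcLocal,        // GC allocated (TLS)
    heapShared,     // shared dynamically allocated
    gcShared,       // shared GC allocated
    externGlobal,   // extern global (TLS)
    privateGlobal,  // private global (TLS)
    sharedGlobal,   // global shared
    kernel,         // system/kernel
    peripheral
}

/// May a reference to memory of class `pointee` be stored in a
/// location of class `holder`? The pointee must live at least as long.
bool mayStore(Lifetime pointee, Lifetime holder)
{
    // Stack and thread-local heap are not comparable ("<=>"),
    // so mixing them is rejected in both directions.
    immutable mixesStackAndHeap =
        (pointee == Lifetime.stackFrame && holder == Lifetime.heapLocal) ||
        (pointee == Lifetime.heapLocal && holder == Lifetime.stackFrame);
    if (mixesStackAndHeap)
        return false;
    return pointee >= holder;
}

unittest
{
    // A stack variable may hold a reference to shared GC memory ...
    assert(mayStore(Lifetime.gcShared, Lifetime.stackFrame));
    // ... but a shared global must not hold a stack address.
    assert(!mayStore(Lifetime.stackFrame, Lifetime.sharedGlobal));
}
```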
Everything other than stack addresses is seen as one single
lifetime region with equal lifetime. The comparison between stack
addresses assumes that an address deeper in the stack has a
higher or equal lifetime. The caller could also provide its
stack frame bounds, which allows this interval to be treated as
one single lifetime.
It should constrain the possible value domain of pointers
absolutely, so that no attack with counterfeit pointers to
certain memory addresses is possible. If I used custom
allocators for different types, I could anticipate or delimit
what the pointer value can be.
On Monday, 31 May 2021 at 18:56:44 UTC, Ola Fosheim Grøstad wrote:
>> constraints. Memory safety is violated by storing a pointer
>> value in a reference which is out of the intended/reasonable
>> value domain of the pointer (not matching its lifetime).
>
> But how do you keep track of it without requiring that all
> graphs are acyclic? No back pointers is too constraining.
>
> And no, Rust does not solve this. Reference counting does not
> solve this. How do you prove that a graph remains fully
> connected when you change one pointer?
I think this is GC-related memory management, not type checking.
The memory attributes don't solve memory management problems. The
problem with reference counting is usually solved by inserting
weak pointers into cycles (which also resolves the apparent
contradiction of a cycle of references). Weak references are used
by those objects which are deeper in the graph of data links.
Otherwise it's a code smell, and one could refactor the links into
a joint object in which deleted objects deregister themselves. I
have already thought about other allocation schemes for detecting
cycles that could be combined with reference counting, for
example tagging structs/classes with the ID of the connected
graph in which they are linked if they aren't leaves. But this ID
is difficult to change. One could also analyze at compile time
which pointers can only be part of a cycle, but explaining more
would lead too far here.
Instead, the problems my idea is intended to solve are
1. giving hints to programmers (so they know which kind of
allocated memory works with the implementation; stack addresses
apparently won't generally work with `map`, for example)
2. having simple static or dynamic value domain checks (which
check whether a pointer value lies in the allowed interval(s) of
the allocation address spaces belonging to the attributes) that
ensure only allowed types of allocation are used. These
checks can be used to statically or dynamically dispatch
functions. Of course such a check could also be performed
manually, but that is tedious and requires me to put all the
different function bodies into one `static if`/`else` chain.
It's more of a lightweight solution and works like an ordinary
type check (value-in-range check).
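As a rough illustration of what such a manual check and dispatch looks
like in today's D: `Region`, `regionOf`, `keepStatic` and
`keepDynamic` are hypothetical names, and only the GC heap is
distinguished from "everything else", using druntime's `GC.addrOf`
(which returns null for pointers that do not point into GC memory):

```d
import core.memory : GC;

/// Hypothetical allocation classes; a real attribute set would be richer.
enum Region { gcHeap, other }

/// Dynamic value-domain check based on the pointer value.
Region regionOf(const(void)* p)
{
    return GC.addrOf(cast(void*) p) !is null ? Region.gcHeap : Region.other;
}

/// Static dispatch: the region is a compile-time argument and
/// `static if` selects exactly one body.
string keepStatic(Region where)(int* p)
{
    static if (where == Region.gcHeap)
        return "store the reference as-is";
    else
        return "copy the value first";
}

/// Dynamic dispatch: inspect the pointer at run time, then forward.
string keepDynamic(int* p)
{
    return regionOf(p) == Region.gcHeap
        ? keepStatic!(Region.gcHeap)(p)
        : keepStatic!(Region.other)(p);
}

unittest
{
    auto onGcHeap = new int;
    int onStack;
    assert(keepDynamic(onGcHeap) == "store the reference as-is");
    assert(keepDynamic(&onStack) == "copy the value first");
}
```

With memory attributes in the signature, the compiler could pick the
`keepStatic` instantiation itself instead of me writing the forwarding
by hand.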
Where the feature shines most is function signatures, because they
separate code and create opacity, which can be countered by
memory attributes for return types and argument types.
On Monday, 31 May 2021 at 18:56:44 UTC, Ola Fosheim Grøstad wrote:
>> One important aspect which I forgot: aliasing of variables. I
>> know, D allows aliased references as arguments by default.
>> Many memory safety problems derive from aliased variables
>> which were not assumed to be aliased.
>
> So, how do you know that you don't have aliasing when you
> provide pointers to two graphs? How do you prove that none of
> the nodes in the graph are shared?
Okay, I didn't define aliasing. By "aliasing" I mean that
"aliasing references" (or pointers) either point to the exact
same address or that the immediately pointed-to classes/structs
(pointed to by the references/pointers) overlap. I would
consider anything else more complicated than necessary. The
definition doesn't care about further indirections. I often only
consider the directly pointed-to struct or class, a contiguous
chunk of memory, as "the type". If I code a function, I'm usually
only interested in the top level of the type (the "root node" of
the type), and further indirections are handled by nested function
calls. For example, it suffices if two argument slices do not
overlap. For that I only need to check aliasing as just
defined. If you really would like two arguments (graphs) not to
share any single pointer value, I would suggest using a more
appropriate type than a memory attribute, a type which is
recursively "unique" (in terms of only using "unique pointers").
Do you think it sounds like a nice idea to have a data structure
attribute `unique` next to `abstract` and `final` which
recursively guarantees that any reference or pointer is a unique
pointer?
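For the top-level aliasing check on two slices, a small sketch
(`overlaps` is a helper name I chose for this example) only has to
compare the two address intervals and ignores further indirections:

```d
/// True if the memory directly covered by `a` and `b` overlaps.
/// Elements that are themselves pointers are not followed.
bool overlaps(T)(const(T)[] a, const(T)[] b)
{
    if (a.length == 0 || b.length == 0)
        return false; // an empty slice covers no memory

    immutable aLo = cast(size_t) a.ptr;
    immutable aHi = aLo + a.length * T.sizeof;
    immutable bLo = cast(size_t) b.ptr;
    immutable bHi = bLo + b.length * T.sizeof;

    // Two half-open intervals intersect iff each starts before
    // the other one ends.
    return aLo < bHi && bLo < aHi;
}

unittest
{
    int[8] buf;
    assert(overlaps(buf[0 .. 4], buf[2 .. 6]));   // share buf[2] and buf[3]
    assert(!overlaps(buf[0 .. 4], buf[4 .. 8]));  // adjacent, not aliased
}
```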
If you are interested in an algorithmic answer to your questions,
then the best approach I can quickly think of is to create an
appropriate hash table from all pointers in one graph and test
all pointers in the other graph against it (if I cannot use any
properties of the pointers' values, e.g. that certain types and
all their indirections are allocated in specific pools). But that
only works with exactly equal pointer values.
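A sketch of that hash-table approach, using a built-in associative
array as a pointer set. `Node`, `collect` and `sharesNode` are
hypothetical names for this example, and, as said, only exactly equal
node addresses are detected:

```d
/// Hypothetical graph node with outgoing links.
struct Node
{
    int value;
    Node*[] next;
}

/// Put every node reachable from `n` into `seen`.
/// The set doubles as the visited marker, so cycles terminate.
void collect(Node* n, ref bool[Node*] seen)
{
    if (n is null || (n in seen))
        return;
    seen[n] = true;
    foreach (child; n.next)
        collect(child, seen);
}

/// True if the graphs rooted at `a` and `b` share at least one node.
bool sharesNode(Node* a, Node* b)
{
    bool[Node*] inA, inB;
    collect(a, inA);
    collect(b, inB);
    foreach (p; inB.byKey)
        if (p in inA)
            return true;
    return false;
}

unittest
{
    auto leaf = new Node(1);
    auto a = new Node(2, [leaf]);
    auto b = new Node(3, [leaf]);   // also links to `leaf`
    auto c = new Node(4);           // unrelated graph
    leaf.next ~= a;                 // back pointer, creates a cycle

    assert(sharesNode(a, b));
    assert(!sharesNode(a, c));
}
```

This costs time and memory linear in the size of both graphs, which is
why I would rather express the requirement in the type than check it.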