Limited Semi-PolyMorphic (LSP) structs?
Era Scarecrow
rtcvb32 at yahoo.com
Mon Aug 26 04:20:13 PDT 2013
Been a while and out of the loop, I need to get my hands dirty
in the code again (soon). Anyways, let's get to the matter at
hand as I'm thinking about it. I'm working on some code (or will
work on code again) that could use a polymorphic type, but at the
end it will never be shared externally and can never extend
beyond 'Known' types. (and the 'types' are behavior 95% of the
time)
True polymorphism will let me compile a code to work with say
'Object' and you can plug in any code and pass it an object and
it will be happy, or inherit and work with it. These are great as
classes. But I want to throw a wrench in the gears, I want to use
structs.
I know you're asking, Why? Well the data structure I'll be
working with have records in the hundreds of thousands, but most
of the time looking at/sorting them I don't necessarily need to
allocate memory (or really even interact with them beyond
sorting). Most of the time the OO aspect is just behavior and the
data between them never changes (behave this way because you're a
type XXX). It can take a very long time to unpack everything
(when you never needed to unpack in the first place, or allocate
and then de-allocate immediately, probably without recycling).
Alright, structs can't inherit because they don't have an object
pointer to go to their parent structures that they are connected
to. If we eliminate the pointer and the overhead on some of that,
we bring it down to the following:
To be fully polymorphic you need to:
* Be able to tell what type it is (Probably enum identifier)
* Extended types cannot be larger than the base/original (or the
base has to hold extra data in reserve for it's other types).
When can this LSP be applicable?
* When a type can be changed on the fly with no side effects
other than intended (This 'text' struct is now 'left aligned text
object')
* When allocations/pointers aren't needed (why fully build the
object at all?)
* When it won't be extended beyond the current type... (only a
few fixed subtypes)
* When teardown/deallocation/~this() isn't needed (never
allocates? Never deallocates! Just throw it all away! POD's are
perfect!)
* When you have no extra 'data' to append to a class/struct and
only override methods/behavior (Encryption type switch from DES
to BlowFish! Same API/methods, otherwise nothing really changed)
* When you need a half step up from structs but don't need full
classes/objects
Limitations: Hmmm..
* LSP's cannot be a template or mixin (they can call on them)
* Known at compile-time
* Only one of them can handle setup/teardown.
Now, I wonder. What if we make it so it CAN inherit? We need 3
examples, where something is in A, A&B, and B. We will get to
that. iA and iB are the class/structs and A/B/C are the
methods/functions. So...
//original based on...
class iA {
void A();
void C();
}
class iB : iA {
void A();
void B();
}
Breaking them down for structs we have:
struct base_AB { //the 'manager' that does redirection and known
size
char type = 'A'; //polymorphic type identifier, for simplicity
//shared 'parent' baggage ie -> iA
//baggage space for iB
//Exists in both iA & iB
void A() {
switch(type) {
case 'A': (cast(iA) this).A(); break;
case 'B': (cast(iB) this).B(); break;
default: throw new Exception("method A() invalid for struct
type:" ~ type);
}
}
//B only exists in inherited
void B() {
if (type == 'B') (cast(iB) this).B();
else throw new Exception();
}
//C exists everywhere
void C() { (cast(iA) this).C(); }
}
So far the base is easy to make, now however as for how iA and
iB interact with eachother... Ignoring infinite loops problems:
struct iA {
char type;
//baggage - Size must be identical to base_AB
void A() {
A();
//B(); /*iA doesn't really know about B so can't call it from
here... not safely anyways*/
C();
};
}
Now, C() can be called regardless of the type (no checks
needed). iB.B() can also be called anywhere in iB with no
problems, but it would be simpler if all calls regardless went to
the base and let the base handle it. (Optimizations would
probably remove unneeded checks later).
struct iB {
char type; //and other baggage
void A() {
A(); // actually: (cast(base_AB) this).A();
B(); // actually: (cast(base_AB) this).B();
C(); // actually: (cast(base_AB) this).C();
}
void B() {
//ditto as A()'s
}
}
This can probably be easily done having a different naming set
which handles the redirection, but the names and boiler-plate
seems unnecessarily large and verbose for a simple task;
Optimization will likely remove the excess unneeded portions in
final binary form anyways.
//different name examples:
struct base_AB {
char type; //and other baggage
void A() {
switch(type) {
case 'A': (cast(iA) this).realA(); break;
case 'B': (cast(iB) this).realB(); break;
default: throw new Exception("method A() invalid for struct
type:" ~ type);
}
}
struct iB {
char type;
//bagage
//boilerplate
void A() { cast(base_AB) this.A(); }
void B() { cast(base_AB) this.B(); }
void C() { cast(base_AB) this.C(); }
void realA() {
A();
B(); //Path really is now: this.B() -> base_AB.B() ->
iB.realB();
C();
}
void realB() { /* ... */ }
}
Now how would one build it without all the extra boilerplate
overhead (also preferably with fewer confusing elements) Three
ideas are coming to mind.
1) Template. This can work but it's a lot of overhead, a lot of
work and is hard to keep track of. I have an experimental version
that semi-works but can't be used seriously. My project sorta
relying on this has come to a halt for the moment. It's also very
finickle where data (and struct order) are involved and refuses
to compiler out of a certain order. Also for calling functions
that exist in current structs require you to forcibly call 'this'
or go around it in order for it to do the redirections elsewhere.
2) UDA's could be possible, if you tag a struct as belonging
and then scan/add the appropriate code. But this seems like a lot
of work and overhead. Probably won't work, plus a new UDA for
each type vs known fixed extensions. Seems like a lot of extra
work. Maybe?
struct base_AB @LSP(iA, iB) {}
struct iA @LSP_base(base_AB) {}
struct iB @LSP_base(base_AB) {}
3) Compiler. This seems the most likely (and 95% of it is
already in place for classes), but i can't implement it myself
easily; Plus i am not sure if Walter or Andrei would want/allow
it. If you squint your eyes you can see it looks very similar to
the original struct/class formula for C++ (without using
pointers/new), except that the total size reserved is known/set
in advance to incorporate all types (fixed size, no
growing/shrinking by casting).
Perhaps introducing a third base type (so struct, classes and
LSP's)? This might help but depends on the amount of extra
complexity would be needed, or how many bugs it could
introduce... or incorporate it as the normal struct but is a
experimental/disabled feature unless you apply a certain
attribute (or set of?)
If there's a suggestion of how to make this work and feel more
natural (class-like without allocation/pointers) then
suggestions/ideas would be nice. It's possible this type of
feature is just not wanted. But I'm willing to bet there's quite
a few places this type of feature/set could be utilized when you
consider what it can be used for.
More information about the Digitalmars-d-learn
mailing list