No struct extending?
Georg Wrede
georg.wrede at nospam.org
Mon Sep 11 05:45:19 PDT 2006
Steve Horne wrote:
> In C++, 'struct' is (almost) a synonym for 'class'. More a declaration
> of intent than a different thing. One useful side-effect of this is
> that you can declare structs as extensions of other structs. This can
> be useful even for plain-old-data. e.g. data structures with several
> node types, but some shared fields. Inheritance doesn't always imply
> virtual tables and stuff.
>
> Of course you can handle this using...
>
> struct s_Branch
> {
>   s_Common m_Common;
>
>   ... (branch specific stuff)
> }
>
> But all that 'pointer.m_Common.actual_member' is a pain.
>
> Adding a union...
>
> union u_Variant
> {
>   s_Common m_Common;
>   s_Branch m_Branch;
>   s_Leaf   m_Leaf;
> }
>
> just means you have to specify 'm_Branch' or 'm_Leaf' for the
> non-shared fields too.
>
> Anonymous structs and unions can save on these extra dots and
> identifiers, but they do a different job. They can't give you a family
> of structs. Only a single struct/union combo - a variant record.
>
> Plus, one thing I have in mind is templates that build up in layers
> (mix-in layers pattern), and there will be an arbitrary number of
> extensions applied. For example, if you want ordered keys, you apply
> the 'gimme-keys' template as a mixin layer, and it extends whatever
> structures, classes and methods it needs to.
>
> This can still be handled - you just compose an access class in
> parallel along with the structs. But it's a hassle.
>
> Now, truth told, this mix-in layers bit isn't important. 'static if'
> probably means I'm better off specifying things using mix-in layers
> (setting up flags and aliases), but putting most of the final code in
> one big template. It will be a lot more readable and maintainable that
> way. The mix-in layers pattern is then mostly a way of avoiding having
> too many parameters for one template.
>
> I could use a D mixin to define the common fields, of course. And
> whatever approach I take, there's pointer casting based on the
> run-time type so that's not a big deal, though the C++ approach is
> nice in that casting to the 'base struct' is implicit.
>
> What bothers me is the chance of this happening...
>
> struct c_Common
> {
>   mixin(common fields)
> }
>
> struct c_Leaf
> {
>   int m_Misplaced_Field; // whoops!
>   mixin(common fields)
> }
>
> Or, for that matter, the same just using nested structs.
>
> struct s_Branch
> {
>   int m_Misplaced_Field; // whoops!
>   s_Common m_Common;
> }
>
> That is, there is no rule forcing the shared part into matched
> locations in all structures, so when you do the union/pointer
> casts/whatever you can end up looking at the wrong memory.
>
> So - am I being paranoid?
>
> It's a small thing, especially given the amount of work that D has
> already saved me. And I'm not even convinced it's real. The node
> layouts above, for instance, will all be part of the same module and
> maintained together anyway. The odds of a misplaced field like that
> should be next to zero, and the same probably applies in any family
> tree of related structs.
>
> I just thought I'd raise it and see what others think.
I suspect this is a perfect case of the Square Triangle. Happens to me
too. The solution sought being just a notch off the problem, concepts
fighting for neurons, and the goal but a mirage, elusive and yet so
tempting.
And then of course, I may be misunderstanding the whole issue, for all I
know.
Anyway, as I understood it, you have struct instances (from somewhere,
like a C library routine or a file, etc., let's call them
ForeignInstances) and you need to glue some new fields to them so that
you can process them without needing to write reams of code that keeps
track of what YourProperties belong to which item. And this has to work
with several (more or less) different, but still conceptually related
kinds of ForeignInstances.
One could use a concoction of templates, mixins, unions and inheritance
to create a module (or a library) which then lets one handle the
situation simply and cleanly in main code. (Either in current D, or
after Walter makes some needed tweaks.) Carefully writing the module
would let one be reasonably sure that the fields align right, maybe even
have suitable dynamic and/or static checks and asserts, to (almost)
enforce integrity.
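
To make the "static checks and asserts" idea concrete, here is a minimal sketch. It is written in C++ rather than D, since C++ is the comparison point in the original post, and every name in it (Common, Branch, Leaf, NodeKind, as_branch) is made up for illustration. The assumption is that the shared part is embedded as the first member of each node struct, so that a stray field in front of it breaks the build instead of silently shifting the layout.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical node family; all names are invented for this sketch.
enum NodeKind { KIND_BRANCH = 1, KIND_LEAF = 2 };

struct Common {
    int node_kind;   // run-time tag saying which node type this is
    int key_count;   // a field shared by every node type
};

struct Branch {
    Common common;        // must stay the first member
    void*  children[4];   // branch-specific part
};

struct Leaf {
    Common common;        // must stay the first member
    int    values[4];     // leaf-specific part
};

// Compile-time checks: a misplaced field ahead of `common` fails the build.
static_assert(std::is_standard_layout<Branch>::value, "Branch must be standard-layout");
static_assert(std::is_standard_layout<Leaf>::value, "Leaf must be standard-layout");
static_assert(offsetof(Branch, common) == 0, "Branch: common part is not first");
static_assert(offsetof(Leaf, common) == 0, "Leaf: common part is not first");

// Run-time check on the tag, then the cast back to the full node type.
// The cast is valid because `common` is the first member of a
// standard-layout struct, which the static_asserts above guarantee.
inline Branch* as_branch(Common* c) {
    assert(c->node_kind == KIND_BRANCH);
    return reinterpret_cast<Branch*>(c);
}
```

Putting "int m_Misplaced_Field;" in front of "common" in either struct would trip the matching offsetof check at compile time, which is about as close to enforced integrity as this approach gets.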
The end result, or the goal, being that one ends up with in-memory
(let's call them) prints, the beginning of which is exactly the same as
in YourProperties and the rest the same as ForeignInstances.
Having got this far, one can then use functions from the module to
handle the non-instance-type-specific manipulation of the
ForeignInstances. Presumably one would either already have, or else
write specific routines to do all the actual instance type specific
stuff (mostly access and assignment).
You make it more difficult by bringing up the issue of physical field
alignment. (See
http://en.wikipedia.org/wiki/Fragile_binary_interface_problem which
incidentally is on a wrong page(!!) since it discusses the Fragile Base
Class problem. Oh well.)
I suspect you wanted to have all this more ambitious? Like having
several alternate sets of YourProperties (and of course matching code
sets), so that each set could be used for a different purpose, like
sorting, selecting, serializing, combining, printing, etc. of the "prints".
---
If I got a task like this, I'd probably do it a lot simpler.
First, these ForeignInstances have to enter our process somehow. At that
point we either have to recognize the specific type of each, or we know
it from the context. In both cases, if we wanted to "slap on" our own
fields to them, we need to copy them somewhere else in memory. (If we
only have a couple of such instances coming one at a time all this
effort is a waste, and if they come from a stream or a file then they're
too near each other to have room for our fields anyhow.)
Thus, one of the main points of physically having our own fields
attached to the ForeignInstances disappears: speed. And with sorting,
unless they're very small, it will be more practical to sort linked
lists of just pointers to them. So it seems some of the performance wins
just vanish.
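
A rough sketch of the "sort just the pointers" point, again in C++ and with invented names (ForeignRecord, sorted_view). It uses a vector of pointers instead of a linked list, which is my substitution for brevity; the idea is the same, in that the records themselves never move.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical record that is too large to be worth shuffling around.
struct ForeignRecord {
    int  key;
    char payload[256];
};

// Builds a sorted "view" of the records: only pointers are reordered,
// the records stay where they are in `records`.
inline std::vector<const ForeignRecord*>
sorted_view(const std::vector<ForeignRecord>& records) {
    std::vector<const ForeignRecord*> view;
    view.reserve(records.size());
    for (const ForeignRecord& r : records) {
        view.push_back(&r);
    }
    std::sort(view.begin(), view.end(),
              [](const ForeignRecord* a, const ForeignRecord* b) {
                  return a->key < b->key;
              });
    return view;
}
```

The view is only valid as long as the underlying vector is not resized, which is the usual price of keeping pointers into a container.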
The other point for having our and their fields together was "kinda
integrity", in other words, we wouldn't have to separately keep track of
our and their data. Now, from above, it seems that having our and their
data together at all times may actually incur more work for us than more
traditional programming.
This all means we simply have to have routines for import and export.
Now, the issues of machine word width and endianness only exist when
carrying the data (or porting the program) between different
architectures. As to the data, simply exactly specifying the record
layout takes care of both word size and endianness. (Hey, a .gif file is
a .gif file no matter what computer you have.) As to within the program,
as long as we access the fields with their names (as opposed to bit
twiddling), we're okay. The process of importing or exporting converts
between the "file format" and our internal representation (whatever it
may be), and that process is where endianness and word size are taken
care of.
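
As an illustration of such import and export routines, here is a small C++ sketch. The record layout (a 32-bit length followed by a 16-bit version, both little-endian, six bytes in total) and all the names are assumptions made up for the example. The file format is read and written byte by byte, so the host's endianness and struct padding never enter into it.

```cpp
#include <cstdint>

// Internal representation: accessed by field name only, layout is the
// compiler's business.
struct Header {
    std::uint32_t length;
    std::uint16_t version;
};

// Hypothetical file format: length (4 bytes) then version (2 bytes),
// both little-endian, no padding.
const int HEADER_FILE_SIZE = 6;

inline Header import_header(const unsigned char* bytes) {
    Header h;
    h.length =  static_cast<std::uint32_t>(bytes[0])
             | (static_cast<std::uint32_t>(bytes[1]) << 8)
             | (static_cast<std::uint32_t>(bytes[2]) << 16)
             | (static_cast<std::uint32_t>(bytes[3]) << 24);
    h.version = static_cast<std::uint16_t>(bytes[4] | (bytes[5] << 8));
    return h;
}

inline void export_header(const Header& h, unsigned char* out) {
    out[0] = static_cast<unsigned char>(h.length & 0xFF);
    out[1] = static_cast<unsigned char>((h.length >> 8) & 0xFF);
    out[2] = static_cast<unsigned char>((h.length >> 16) & 0xFF);
    out[3] = static_cast<unsigned char>((h.length >> 24) & 0xFF);
    out[4] = static_cast<unsigned char>(h.version & 0xFF);
    out[5] = static_cast<unsigned char>((h.version >> 8) & 0xFF);
}
```

Everything between import_header and export_header works with h.length and h.version by name, so the same source gives the same file bytes on any architecture.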
As for implementation, I'd start with simple textbook OO. We (feel entitled to)
assume that the various ForeignInstance types do have much in common.
This automatically suggests a Base Class that has methods to store,
retrieve and manipulate the common fields. It would also have abstract
methods that cater for what has to be done with the instance type
specific fields, more or less constituting an interface specification in
reality.
Each ForeignInstance subtype would then just be a subclass of this base
class, implementing only the specifics. Instances of these could then be
stored in data structures (arrays, lists, trees) and referred to as
the base type. Thanks to polymorphism, we could then "just use" these
instances without worrying which specific type each is. We'd also get
adequate performance, without even trying.
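
In outline, the textbook version might look like the following. This is a C++ sketch with invented names (ForeignInstance as the base class, BranchInstance and LeafInstance as two subtypes, describe as the one abstract method); a real interface would of course carry whatever operations the actual instance types need.

```cpp
#include <memory>
#include <string>
#include <vector>

// Base class: owns the common fields and declares the type-specific
// operations as abstract methods.
class ForeignInstance {
public:
    explicit ForeignInstance(int key) : key_(key) {}
    virtual ~ForeignInstance() = default;

    int key() const { return key_; }            // common field access
    virtual std::string describe() const = 0;   // type-specific part

private:
    int key_;
};

class BranchInstance : public ForeignInstance {
public:
    BranchInstance(int key, int child_count)
        : ForeignInstance(key), child_count_(child_count) {}
    std::string describe() const override {
        return "branch with " + std::to_string(child_count_) + " children";
    }
private:
    int child_count_;
};

class LeafInstance : public ForeignInstance {
public:
    LeafInstance(int key, int value)
        : ForeignInstance(key), value_(value) {}
    std::string describe() const override {
        return "leaf holding " + std::to_string(value_);
    }
private:
    int value_;
};

// Works on the base type only; polymorphism picks the right describe().
inline std::vector<std::string>
describe_all(const std::vector<std::unique_ptr<ForeignInstance>>& items) {
    std::vector<std::string> out;
    for (const auto& item : items) {
        out.push_back(item->describe());
    }
    return out;
}
```

Here the compiler places the common part, so the misplaced-field worry from the original post does not arise in the first place.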
Clear, concise and KISS. Oh, and way more robust and maintainable.
Ok, your original issue (as I understood it anyway) was not one of
practical implementation, but rather an (in itself very intriguing) thoughtlet
about the relationship of mixins, structs and their manipulation, both
in source code and at runtime. My point (and I apologize) was merely
that I can't see a suitable problem for your solution. :-)