struct vs class for a simple token in my d lexer
Artur Skawina
art.08.09 at gmail.com
Tue May 15 02:33:26 PDT 2012
On 05/14/12 17:10, Roman D. Boiko wrote:
> (Subj.) I'm in doubt which to choose for my case, but this is a generic question.
>
> http://forum.dlang.org/post/odcrgqxoldrktdtarskf@forum.dlang.org
>
> Cross-posting here. I would appreciate any feedback. (Whether to reply in this or that thread is up to you.) Thanks
>
If you have to ask such a question, then the answer doesn't matter.
Having said that, and w/o looking at the code, just based on the info
in this thread - a few thoughts, which may or may not be helpful:
- Use an array of pointers to struct tokens. IOW 'Token*[] tokens;'.
Same use-syntax. Far fewer large contiguous allocations. One indirection
per token lookup, but that's no worse than using classes, and w/o the
class overhead.
- Don't let yourself be talked into adding unnecessary bloat, like storing
location inside Tokens. If calculating it on demand turns out to be costly
you can always store the location in a separate array (your array-of-structs
scheme would make the lookup trivial, array-of-*-structs would make this
slightly more complex, but if the info really is rarely needed, then it
doesn't matter)
- You can compress the string, if the 16 bytes are too much. Limiting the source
unit size and max token length to 4G should be acceptable; then just have
   struct TokenStr { uint off, len; auto get(){return source[off..off+len];} }
etc, which will reduce the memory footprint while adding very little overhead. It's
not like these strings will be used repeatedly. Don't forget to keep a reference
to 'source' around, so that it isn't collected by the GC.
(I'd even go for a 24:8 split, then you probably need to handle string literals
longer than 255 bytes specially, but it's likely worth it. If somebody produces
a 16M+ D source module he deserves the build failure. ;) )
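
To make the compressed-string idea concrete, here is a minimal sketch of both
variants: the plain offset/length pair and the 24:8 packing. The module-level
'source' variable and the name 'PackedTokenStr' are assumptions made for
illustration only, not something taken from the lexer being discussed.

```d
// Hypothetical sketch; names other than TokenStr/off/len/source are invented.
string source; // module-level reference keeps the buffer reachable for the GC

struct TokenStr
{
    uint off, len; // 8 bytes, versus 16 for a `string` slice on 64-bit

    string get() const { return source[off .. off + len]; }
}

struct PackedTokenStr
{
    uint bits; // upper 24 bits: offset, lower 8 bits: length

    this(uint offset, uint length)
    {
        // Offsets of 16M and up, or lengths above 255, need separate handling.
        assert(offset < (1u << 24) && length < 256);
        bits = (offset << 8) | length;
    }

    uint off() const { return bits >> 8; }
    uint len() const { return bits & 0xFF; }
    string get() const { return source[off .. off + len]; }
}

static assert(TokenStr.sizeof == 8);
static assert(PackedTokenStr.sizeof == 4);

void main()
{
    source = "int x = 42;";
    auto a = TokenStr(4, 1);
    auto b = PackedTokenStr(8, 2);
    assert(a.get() == "x");
    assert(b.get() == "42");
}
```

The packed form halves the footprint again, at the price of a shift and a mask
per access and a special path for anything that doesn't fit in 255 bytes.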
I'm assuming the Token structs don't need destructors.
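
The first two suggestions (an array of pointers to struct tokens, with
locations kept in a separate array) could look roughly like this. The 'Token',
'Kind' and 'Location' definitions are placeholders assumed for the example;
the real token layout will differ.

```d
// Hypothetical sketch; Token, Kind and Location are illustrative only.
enum Kind : ubyte { identifier, number, punctuation }

struct Token
{
    Kind kind;
    uint off, len;
}

struct Location { uint line, column; }

void main()
{
    Token*[] tokens;      // each token is its own small GC allocation
    Location[] locations; // optional side table, filled only if needed

    tokens ~= new Token(Kind.identifier, 0, 3);
    tokens ~= new Token(Kind.number, 4, 2);
    locations ~= Location(1, 1);
    locations ~= Location(1, 5);

    // Members are reached through the pointer with '.', same as with a class.
    assert(tokens[1].kind == Kind.number);
    assert(tokens[1].len == 2);

    // Parallel index: the location of tokens[i] is locations[i].
    assert(locations[1].column == 5);
}
```

Note that the parallel lookup only works while you still have the index; given
just a Token* you would need some other way back to its position, which is the
"slightly more complex" part mentioned above.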
artur