compile time compression for associative array literal
Brian Tiffin
btiffin at gnu.org
Mon Aug 23 14:04:05 UTC 2021
On Monday, 23 August 2021 at 11:53:46 UTC, ag0aep6g wrote:
> On 23.08.21 08:14, Brian Tiffin wrote:
>> From ~~a~~ little reading, it seems associative array literal
>> initialization is still pending for global scope, but allowed
>> in a module constructor? *If I understood the skimming
>> surface reading so far*.
>>
>> ```d
>> immutable string[string] things;
>> static (this) {
>> things = ["key1": "value 1", "key2": "value 2"];
>> }
>> ```
>
> (Typo: It's `static this()`.)
>
Yep, that's a typo.
>> Is there a magic incantation that could convert the values to
>> a `std.zlib.compress`ed ubyte array, at compile time? So the
>> object code gets keys:compvals instead of the full string
>> value?
>
> There's a big roadblock: std.zlib.compress cannot go through
> CTFE, because the source code of zlib isn't available to the
> compiler; it's not even D code.
>
> Maybe there's a CTFE-able compression library on dub. If not,
> you can write your own function and run that through CTFE.
> Example with simple run-length encoding:
>
> ----
> uint[] my_compress(string s)
> {
>     import std.algorithm: group;
>     import std.string: representation;
>     uint[] compressed;
>     foreach (c_n; group(s.representation))
>     {
>         compressed ~= [c_n[0], c_n[1]];
>     }
>     return compressed;
> }
>
> string my_uncompress(const(uint)[] compressed)
> {
>     import std.conv: to;
>     string uncompressed = "";
>     for (; compressed.length >= 2; compressed = compressed[2 .. $])
>     {
>         foreach (i; 0 .. compressed[1])
>         {
>             uncompressed ~= compressed[0].to!char;
>         }
>     }
>     return uncompressed;
> }
>
> import std.array: replicate;
>
> /* CTFE compression: */
> enum compressed = my_compress("f" ~ "o".replicate(100_000) ~ "bar");
>
> immutable string[string] things;
> shared static this()
> {
>     /* Runtime decompression: */
>     things = ["key1": my_uncompress(compressed)];
> }
> ----
>
> If you compile that, the object file should be far smaller than
> 100,000 bytes, thanks to the compression.
Cool. So, it might not be obvious, but there is a path to this
little nicety.
>
> [...]
>> I'm not sure about
>>
>> a) if code in a module constructor is even a candidate for
>> CTFE?
>
> The word "candidate" might indicate a common misunderstanding
> of CTFE. CTFE doesn't look for candidates. It's not an
> optimization. The language dictates which values go through
> CTFE.
>
> In a way, static constructors are the opposite of CTFE.
> Initializers in module scope do go through CTFE. When you have
> code that you cannot (or don't want to) put through CTFE, you
> put it in a static constructor.
>
> You can still trigger CTFE within a static constructor by other
> means (e.g., `enum`), but the static constructor itself is just
> another function as far as CTFE is concerned.
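The distinction drawn above can be checked with the special `__ctfe` variable, which is true only while a function is being evaluated at compile time. The following is a minimal sketch; the function and variable names are made up for illustration. It uses a thread-local `static this()` so that the plain module-level variables it sets are the ones `main` later reads.

```d
/// Reports whether this particular call is being evaluated by CTFE.
bool evaluatedAtCompileTime()
{
    return __ctfe;
}

// An enum initializer must be a compile-time value, so this call goes through CTFE.
enum bool viaEnum = evaluatedAtCompileTime();

// A module-scope initializer also goes through CTFE.
immutable bool viaInitializer = evaluatedAtCompileTime();

bool viaStaticCtor;     // set by an ordinary runtime call
bool viaEnumInsideCtor; // set from an enum declared inside the constructor

static this()
{
    // The constructor body is ordinary runtime code.
    viaStaticCtor = evaluatedAtCompileTime();

    // An enum inside the constructor still forces CTFE for its initializer.
    enum bool forced = evaluatedAtCompileTime();
    viaEnumInsideCtor = forced;
}

// Both of these are known at compile time.
static assert(viaEnum);
static assert(viaInitializer);

void main()
{
    assert(!viaStaticCtor);    // the call in the constructor ran at runtime
    assert(viaEnumInsideCtor); // the enum was evaluated at compile time
}
```

The same function gives different answers depending only on where its result is required, which is the sense in which the language, not an optimizer, decides what goes through CTFE.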
Ok. I'm hoping this gets easier to reason about once I get
further up the D curve.
>
>> b) what a cast might look like to get a `q"DELIM ... DELIM"`
>> delimited string for use as input to std.zlib.compress?
>
> A cast to get a string literal? That doesn't make sense.
No, no it doesn't. And it didn't help that I had the order of AA
key and value syntax backwards in my head when I was typing in
the question. I was thinking it was `key[value]`, not the proper
`value[key]`.
So in this case, `ubyte[][string]` was what I *think* I'd be
aiming for as the AA type spec. The inputs to compress are
`const(void)[]`, so I figured I needed to cast the type-inferred
delimited string literal for use in compress. More things to
learn. ;-)
I cannot claim to be on solid ground of understanding when it
comes to some areas of D syntax yet.
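For the record, here is a small runtime-only sketch of the types involved. An AA mapping string keys to byte arrays is spelled `ubyte[][string]`, and a `string` converts implicitly to `const(void)[]`, so a delimited string literal can be handed to `std.zlib.compress` without a cast. The helper name `unpack`, the key, and the COBOL text are placeholders. Nothing here runs at compile time, since `std.zlib` cannot go through CTFE.

```d
import std.zlib : compress, uncompress;

/// Hypothetical helper: inflates zlib data back into a string.
string unpack(const(ubyte)[] data)
{
    // uncompress returns void[]; the caller knows it holds text.
    return cast(string) uncompress(data);
}

void main()
{
    // Value type is ubyte[], key type is string.
    ubyte[][string] packed;

    // Delimited (heredoc) string literal; its type is plain string.
    string fragment = q"COBOL
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLO.
COBOL";

    // No cast needed: string converts implicitly to const(void)[].
    packed["hello"] = compress(fragment);

    assert(unpack(packed["hello"]) == fragment);
}
```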
>
> You might be looking for `import("some_file")`. That gives you
> the contents of a file as a string. You can then run that
> string through your compression function in CTFE, put the
> resulting compressed data into the object file, and decompress
> it at runtime (like the example above does).
That's the goal. It's an optional goal at this point. I'm not
*really* worried about size of object code, yet, but figured this
would be a neat way to shrink the compiled code generated from
some large COBOL source fragments embedded in D source.
COBOL programmer me might have planned to run the fragments
through a compressor, then copy those outputs to the D source by
hand, but that would be a maintenance headache and make for far
less grokkable code.
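Putting the hints together, the whole pipeline might look roughly like the sketch below: the fragment is compressed during CTFE, only the packed bytes are referenced at runtime, and a static constructor rebuilds the AA. The run-length coder is a hand-rolled stand-in (all names are hypothetical) that only pays off on long runs and will expand ordinary text; a real CTFE-able compressor would take its place. The fragment is an inline delimited string here so the example compiles on its own. With string imports it could instead be `import("hello.cob")`, which needs the `-J` compiler switch pointing at the directory holding that file.

```d
/// Run-length encodes text as (byte, count) pairs, count in 1 .. 255.
ubyte[] rlePack(string text) pure
{
    ubyte[] packed;
    size_t i = 0;
    while (i < text.length)
    {
        immutable char c = text[i];
        size_t run = 1;
        while (i + run < text.length && text[i + run] == c && run < 255)
            ++run;
        packed ~= cast(ubyte) c;
        packed ~= cast(ubyte) run;
        i += run;
    }
    return packed;
}

/// Reverses rlePack.
string rleUnpack(const(ubyte)[] packed) pure
{
    char[] text;
    for (size_t i = 0; i + 1 < packed.length; i += 2)
        foreach (_; 0 .. packed[i + 1])
            text ~= cast(char) packed[i];
    return text.idup;
}

// Placeholder fragment; only used at compile time below.
enum string cobolSource = q"COBOL
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLO.
       PROCEDURE DIVISION.
           DISPLAY 'HELLO, WORLD'.
           STOP RUN.
COBOL";

// Module-scope initializer: rlePack runs in CTFE.
static immutable ubyte[] packedSource = rlePack(cobolSource);

// Round-trip check, also done entirely at compile time.
static assert(rleUnpack(packedSource) == cobolSource);

immutable string[string] things;

shared static this()
{
    // Runtime decompression, once, at program start.
    things = ["hello.cob": rleUnpack(packedSource)];
}

void main()
{
    import std.stdio : write;
    write(things["hello.cob"]);
}
```

Keeping the fragments as readable source and letting the compiler do the packing avoids the copy-the-compressed-bytes-by-hand step entirely.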
Thanks for the hints, ag0aep6g. You've given me some more paths
to explore.
Have good.