enum Format
Steven Schveighoffer
schveiguy at gmail.com
Thu Jan 11 02:22:18 UTC 2024
On Wednesday, 10 January 2024 at 19:53:48 UTC, Walter Bright
wrote:
>> And you can get rid of the runtime overhead by adding a
>> `pragma(inline, true)` `writeln` overload. (I guess with DMD
>> that will still bloat the executable,
>
> I didn't mention the other kind of bloat - the rather massive
> number and size of template names being generated that go into
> the object file, as well as all the uncalled functions
> generated only to be removed by the linker.
Yes, DIP1036e has a lot of extra templates generated, and the
mangled name is going to be large.
Let's skip for a moment the template that writeln will generate
(which I agree isn't ideal, but also is somewhat par for the
course).
This shouldn't be a huge problem for the interpolation *types*
because the type doesn't get included in the binary. It is a big
problem for the `toString` function, because that *is* included.
However, we can mitigate the ones that return `null`:
```d
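// One shared function, so only a single symbol ends up in the binary.
string __interpNull() => null;

struct InterpolatedExpression(string expr)
{
    // Every instantiation aliases the same function instead of
    // getting its own `toString`.
    alias toString = __interpNull;
}
// ... and so on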
```
I tested this and it does work. So this reduces all the
`toString` member functions from `InterpolatedExpression` (and
`InterpolationPrologue` and `InterpolationEpilog`, but those are
not templated structs anyway) to one function in the binary.
But we can't do this for `InterpolatedLiteral` (which, by the way,
is improperly described in Atila's DIP; the associated `toString`
member function should return the literal).
We can possibly do a couple of things here to mitigate this:
1. We can modify how `std.format` works so it will accept the
following as a `toString` hook:
```d
struct S
{
    enum toString = "I am an S";
}
```
This means no function calls, no extra-long symbols in the
binary (since it's an enum, it should not go in), and I think
even the compilation will be faster.
2. We modify it to be aware of `InterpolatedLiteral` types, and
avoid depending on the `toString` API. After all, we own both
Phobos and druntime, we can coordinate the release.
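
To make option 1 concrete, here is a rough sketch of how a formatting
routine could tell an `enum toString` apart from an ordinary member
function. `render` and `P` are made-up names for illustration; this is
not how `std.format` is written today, just one way the check might
look.

```d
import std.conv : to;
import std.stdio : writeln;

// The shape proposed above: the text is a manifest constant.
struct S
{
    enum toString = "I am an S";
}

// A conventional type with an ordinary member function, for comparison.
struct P
{
    int x;
    string toString() const { return "P(" ~ x.to!string ~ ")"; }
}

// Hypothetical helper (not a Phobos API): uses the constant when
// `T.toString` is a compile-time string, otherwise calls the member.
string render(T)(auto ref const T value)
{
    static if (is(typeof(T.toString) == string)
            && __traits(compiles, { enum s = T.toString; }))
        return T.toString;       // compile-time constant, no toString call
    else
        return value.toString(); // ordinary toString hook
}

void main()
{
    writeln(render(S()));  // prints "I am an S"
    writeln(render(P(3))); // prints "P(3)"
}
```

The `static if` is resolved per type at compile time, so for types like
`S` there is no `toString` function left to emit or call.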
And as a further suggestion, though this is kind of off-topic, we
may look into ways to have templates that *don't* make it into
the binary explicitly. Basically, they are marked as shims or
forwarders by the library author, and just serve as a way to
write nicer syntax. This could help in more than just the
interpolation DIP.
>
> As far as I can tell, the only advantage of DIP1036 is the use
> of inserted templates to "key" the tuples to specific
> functions. Isn't that what the type system is supposed to do?
> Maybe the real issue is that a format string should be a
> different type than a conventional string.
No. While I agree that having a different *type* makes it more
useful and easier to hook, there is a fundamental problem being
solved with the compile-time literals being passed to the
function. Namely, tremendous power is available to validate,
parse, prepare, etc. string data at compile time, for use during
runtime. This simply *is not possible* with 1027.
The runtime benefits are huge:
* No need to allocate anything (`@nogc`, `-betterC`, etc. all
available)
* You get compiler errors instead of runtime errors (if you put
in the work)
* It's possible to generate "perfect forwarding" to another function
that does use another form. For example, `printf`.
* If you inline the call, it can be as if you called the
forwarded function directly with the exactly correct parameters.
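
As a rough illustration of the forwarding point, the sketch below
builds a `printf` format string at compile time from the argument types
alone and then passes only the real values along. The two structs are
local stand-ins with assumed shapes (the prologue/epilogue markers are
left out), and `cFormat`, `isMarker` and `iprintf` are made-up names,
not anything from the DIP or druntime.

```d
import core.stdc.stdio : printf;
import std.conv : to;

// Stand-ins for the DIP's marker types (assumed shapes, illustration only).
struct InterpolatedLiteral(string lit) {}
struct InterpolatedExpression(string expr) {}

enum isMarker(T) = is(T == InterpolatedLiteral!lit, string lit)
                || is(T == InterpolatedExpression!expr, string expr);

// Hypothetical: derive a C format string from the argument *types* alone.
template cFormat(Args...)
{
    enum string cFormat = () {
        string s;
        static foreach (A; Args)
        {{
            static if (is(A == InterpolatedLiteral!lit, string lit))
            {
                // Literal text is copied; '%' is escaped for printf.
                foreach (char c; lit)
                {
                    if (c == '%') s ~= "%%";
                    else s ~= c;
                }
            }
            else static if (isMarker!A) {} // expression marker: no output
            else static if (is(A == int)) s ~= "%d";
            else static if (is(A : const(char)*)) s ~= "%s";
            else static assert(false, "iprintf: unsupported type " ~ A.stringof);
        }}
        return s;
    }();
}

// Hypothetical forwarder: everything except the printf call itself
// is worked out during compilation.
void iprintf(Args...)(Args args)
{
    enum string fmt = cFormat!Args;
    // List only the real values, e.g. ", args[2]", skipping the markers.
    enum string values = () {
        string s;
        static foreach (i, A; Args)
        {{
            static if (!isMarker!A) s ~= ", args[" ~ i.to!string ~ "]";
        }}
        return s;
    }();
    mixin("printf(fmt.ptr" ~ values ~ ");");
}

void main()
{
    int x = 42;
    // Roughly the pieces a literal with the text "x = ", the expression
    // x, and the trailing text ", 100%\n" might be lowered to.
    iprintf(InterpolatedLiteral!"x = "(), InterpolatedExpression!"x"(), x,
            InterpolatedLiteral!", 100%\n"());
}
```

An unsupported argument type fails the `static assert` at compile time
rather than misbehaving at runtime, and nothing here allocates at
runtime.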
And I want to continue to point out, that a constructed "format
string" mechanism is simply inferior, regardless of whether it is another
type, as long as you don't need formatting specifiers (and
arguably, it's just a difference in taste otherwise). The
compiler parsed it out, it knows the separate pieces. Giving
those pieces directly to the library is both the most efficient
way, and also the most obvious way. The "format string"
mechanism, while making sense for writef, *must* add an element
of complexity to the receiving function, since it now has to know
what "language" the translated string is in. For example, with DIP1027, one
must know that `%s` is special and what it represents, and the
user must know to escape `%s` to avoid miscommunication. With
1036e, there is no format string, so there is no complication
there, or confusion. The value being passed is right where you
would expect it, and you don't have to parse a separate thing to
know.
Note in YAIDIP, this was done partly through an interpolation
header, which had all the compile-time information, and then
strings and interpolated data were interspersed. I find this also
a workable solution, and could even do without the strings being
passed interspersed (as I said, we have control over `writeln`
and `text`), but I think the ordering of the tuple to match what
the actual string literal looks like is so intuitive, and we
would be losing that if we did some kind of "format header"
mechanism.
-Steve