Wanted: Format character for source code literal
Q. Schroll
qs.il.paperinik at gmail.com
Fri May 7 23:51:13 UTC 2021
On Thursday, 6 May 2021 at 08:49:16 UTC, Berni44 wrote:
> On Wednesday, 5 May 2021 at 19:53:10 UTC, Q. Schroll wrote:
>> The new `format` implementation could do three things when
>> encountering `%D` for formatting an object of a type with
>> custom formatting:
>
> For me, this seems to be the wrong way to think about it.
> `format` doesn't encounter specifiers, but objects (in the
> wider sense). And in case of structs, classes and so on it
> delegates the handling of formatting to them, without even
> looking at the specifier (with the exception of `%s` which
> sometimes plays a special role).
The role of `%s` is special, but not too special either. It just
gives a best effort result where other formats would just fail.
Returning a string representation that can be interpreted back
is not something to delegate to a user-defined routine.
> It's then up to that struct or class to define the meaning of
> `%D` for that specific struct or class.
This makes `%D` unreliable for meta-programming. And this is
_the_ problem I have with this, because creating a
compiler-readable string from an object is a meta-programming
tool. I have no idea _what else_ you'd even do with it.
Here's the showstopper: Adding a `toString` that accepts format
specifiers becomes a potentially breaking change as it will
change the meaning of `%D` silently.
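To illustrate the delegation with today's `std.format`, here is a minimal sketch. `%x` stands in for the proposed `%D`, which does not exist, and `Plain` and `Custom` are made-up types. Once a type has a `toString` overload that takes a `FormatSpec`, the specifier means whatever that overload decides:

```d
import std.format : format, FormatSpec;

// Hypothetical type without custom formatting.
struct Plain { int x; }

// Hypothetical type whose toString receives the format specifier.
struct Custom
{
    int x;

    void toString(scope void delegate(const(char)[]) sink,
                  FormatSpec!char fmt) const
    {
        sink("custom<");
        sink([fmt.spec]); // the specifier character, e.g. 's' or 'x'
        sink(">");
    }
}

void main()
{
    // Built-in formatting of a struct.
    assert(format("%s", Plain(1)) == "Plain(1)");

    // The type decides what each specifier means.
    assert(format("%s", Custom(1)) == "custom<s>");
    assert(format("%x", Custom(1)) == "custom<x>");
}
```

Generic code that formats a `T` cannot tell from the outside which of the two behaviours it will get.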
>> Because `%D` for `bool`, integers ([...]), `floats`, arrays,
>> and AAs is nothing different from `%s`.
>
> That's not true: bytes need a cast, longs a trailing 'L',
It depends on what you want to do with it. If you want the immediate
type of the literal to be what you plugged in, then yes. If being
equal suffices, `"1"` and `"true"` are the same.
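A small sketch of the difference between "same type" and "equal value", using only `format` and string mixins (the variable names are illustrative):

```d
import std.format : format;

void main()
{
    byte b = 1;
    long l = 1;

    // `%s` prints the same text regardless of the integer type.
    assert(format("%s", b) == "1");
    assert(format("%s", l) == "1");

    // Read back as source code, that text is an `int` literal ...
    static assert(is(typeof(mixin("1")) == int));
    // ... that still compares equal to the original values, and to `true`.
    assert(mixin("1") == b && mixin("1") == l);
    static assert(mixin("1") == true);

    // Preserving the exact type needs a suffix or a cast.
    static assert(is(typeof(mixin("1L")) == long));
    static assert(is(typeof(mixin("cast(byte) 1")) == byte));
}
```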
> like reals, floating point numbers are truncated with `%s` and
> don't provide the correct value
_That,_ on the other hand, _is_ a problem. I don't know how big
that problem is in practice because `real` cannot even be
formatted at CTFE and `double` and `float` aren't that common at
compile time. I guess the only sane result for floating point
values is `%a` with sufficient digits anyway, and that is far
removed from `%s` even if you add a gigantic precision.
Fixing `%s` for floating point values, in the sense that the
output has enough decimal digits to represent the number
accurately, would be a breaking change.
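For reference, a short sketch of what the three options look like for a `double` with current `std.format`. The value `1.0 / 3.0` is just an example, and the last two results are printed without a claimed output:

```d
import std.format : format;
import std.stdio : writeln;

void main()
{
    double x = 1.0 / 3.0;

    // `%s` keeps six significant digits for floating point values.
    assert(format("%s", x) == "0.333333");

    // 17 significant decimal digits are enough to identify any `double`.
    writeln(format("%.17g", x));

    // `%a` prints the value in hexadecimal floating point notation.
    writeln(format("%a", x));
}
```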
> and so on. There are a lot of subtle differences
The problem of strings and chars is obvious, the case for exact
types is, too. Floating point types didn't cross my mind, but
please elaborate: what else is there? I'm honestly interested.
If `%(%s%)` does not give you a proper char or string literal,
I'd consider it a bug.
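What I mean by that, as a minimal sketch: wrapping the string in a one-element array and formatting it with `%(%s%)` yields a quoted, escaped literal.

```d
import std.format : format;

void main()
{
    string s = "say \"hi\"\n";

    // Plain `%s` prints the characters as they are.
    assert(format("%s", s) == s);

    // Inside `%(...%)` the element is quoted and escaped.
    assert(format("%(%s%)", [s]) == `"say \"hi\"\n"`);
}
```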
> and that's why I think it would be a good thing to have this
> new format character.
I agree with you that a new format character is necessary to
achieve this, if it is done with a format character at all. I do
question whether format characters are the right approach. To me,
this looks more like a code generation tool than value
formatting.
>> The only part where you'd need something different than `%s`
>> is characters, strings. That would be handy to have, I must
>> admit. You can mimic it using arrays tho
>
> That was actually the starting point for me that led me to a
> desire for having `%D`: `%s` for arrays tries to mimic the
> intended result of `%D` (but fails at several places to do so
> correctly) and therefore treats characters and strings special.
> This led to the abuse of the `-`-flag (in `%-(...%)`) which
> now causes a lot of problems. I thought long about how this
> could be fixed: With `%D` available, a smoother transition
> would be possible, because people using `%s` inside of
> `%(...%)` could just replace it with `%D` to get the current
> result and that eventually will make it possible to give `%s`
> (and the `-`-flag) its correct meaning back. (Of course this
> still needs deprecation cycles and maybe a preview switch or
> what else - it's still not easy.)
The `%-(...%)` is a hack, but it can be questioned whether removing
it is even worth the trouble. It just breaks things. The minus
has otherwise no meaning for arrays. It's just weird.
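For readers who haven't used it, this is the current behaviour of the flag in question (a sketch with an example array):

```d
import std.format : format;

void main()
{
    auto names = ["John", "Nancy"];

    // Default: elements are quoted and escaped.
    assert(format("%(%s, %)", names) == `"John", "Nancy"`);

    // With the `-` flag: elements are written verbatim.
    assert(format("%-(%s, %)", names) == "John, Nancy");
}
```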
>> And it's almost perfect! It works for character types, numeric
>> types, arrays, and AAs, too.
>
> As I wrote above: That might look so at first sight, but it
> isn't the case.
Right. I was a little enthusiastic about it.
>> The `$` only has that meaning if it's preceded by a number.
>> `%`*N*`$`*…c* has a meaning for *N* a number and *c* a
>> character possibly preceded by other formatting stuff. But
>> `%$` is undefined in the sense that it is an error to use it.
>
> But people will start to use it with width and other parameters
> and will report issues. Let alone that it will complicate the
> format spec parser significantly and thus might even introduce
> more bugs. I'm sorry, but with `%$` you'll be opening Pandora's
> box.
It requires a single check: Is the `%` character followed by `$`?
The whole point of `%$` would be that it is not customizable. You
cannot add any specification. If something comes before `$`, it
isn't `%$`, and if something comes after it, it's not part of the
format specifier, but just text.
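The check I have in mind is roughly the following. This is a sketch only: `startsWithLiteralSpec` is a hypothetical helper, not anything in Phobos, and a real parser would call it at the position of each `%`.

```d
// Hypothetical helper: true if `fmt` begins with the proposed `%$` token.
bool startsWithLiteralSpec(const(char)[] fmt)
{
    return fmt.length >= 2 && fmt[0] == '%' && fmt[1] == '$';
}

void main()
{
    // `%$` followed by plain text.
    assert(startsWithLiteralSpec("%$ rest"));

    // A positional argument: something precedes `$`, so it isn't `%$`.
    assert(!startsWithLiteralSpec("%1$s"));

    // An ordinary specifier.
    assert(!startsWithLiteralSpec("%s"));
}
```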
---
I've been thinking about this a little. What is your goal? Maybe
we're talking at cross purposes. I guess you want a format
specifier that formats any _built-in_ type in a way that
represents the object precisely. In a sense, you want a good `%s`
and not a not-really-the-best-effort `%s`. My understanding was
you want to represent objects as strings in a way that can be
used by the compiler to reconstruct the object, and for what else
than meta-programming would one do that? It's in a sense trivial
for built-in types because it's a finite set of types.
Thinking about it, you can easily wrap objects in a struct and
make it do The Right Thing™. It doesn't complicate the `format`
implementation.
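A rough sketch of such a wrapper, under the assumption that only a handful of built-in types matter. `Literal` and `literal` are hypothetical names, and the floating point branch ignores the `f` and `L` suffixes that `float` and `real` would need:

```d
import std.format : format;
import std.traits : isFloatingPoint, isSomeString;

// Hypothetical wrapper, not part of Phobos: formats the wrapped value
// roughly as a D source literal. Only a few built-in types are covered.
struct Literal(T)
{
    T value;

    void toString(scope void delegate(const(char)[]) sink) const
    {
        static if (isSomeString!T)
            sink(format("%(%s%)", [value]));      // quoted and escaped
        else static if (is(T == long))
            sink(format("%sL", value));           // keep the type via suffix
        else static if (is(T == byte))
            sink(format("cast(byte) %s", value)); // keep the type via cast
        else static if (isFloatingPoint!T)
            sink(format("%a", value));            // hexadecimal float
        else
            sink(format("%s", value));            // fall back to `%s`
    }
}

// Hypothetical convenience function.
auto literal(T)(T value) { return Literal!T(value); }

void main()
{
    assert(format("%s", literal(1L)) == "1L");
    assert(format("%s", literal(cast(byte) -3)) == "cast(byte) -3");
    assert(format("%s", literal("a\"b")) == `"a\"b"`);
    assert(format("%s", literal(42)) == "42");
}
```

The caller opts in by writing `literal(x)` instead of `x`, so plain `%s` keeps its current meaning for everything else.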