Feedback Thread: DIP 1036--String Interpolation Tuple Literals--Community Review Round 2

Steven Schveighoffer schveiguy at gmail.com
Fri Jan 29 00:53:20 UTC 2021


On 1/28/21 2:39 AM, Walter Bright wrote:
> #DIP1036
> 
> Full Disclosure: I am not favorably disposed to this, as it is fairly 
> complicated and uses the GC.

I hope to alleviate your concerns. From the responses below, it seems 
I have conveyed the intentions of the DIP poorly in many parts.

>  > It can bind to a parameter list, but it does not have a type by itself.
> 
> Makes no sense. What is it doing by "binding" to a parameter list? The 
> examples make no sense, either, because assert doesn't have a parameter 
> list.

Forgive my ignorance of the language spec and terminology. I want to say 
that basically if you write:

foo(i"Hello, ${name}")

It translates to:

foo(interp!"Hello, "(), name, interp!""())

If that doesn't match a valid overload, it instead translates to:

foo(idup(interp!"Hello, "(), name, interp!""()))

Clearly I don't know how to say that properly. I'm thinking of a new way 
to say this with overloads (see overload blurb below). Hopefully this is 
better.

On assert, the fact that it won't match the expanded form means it will 
use the idup rewrite. That is intentional. I will make it clear that the 
rewrite will only happen for function or template argument lists.

> 
>  > idup
> 
> What does this function look like?

The signature would look like:

string idup(Args...)(Args args) if (is(Args[0] : interp!S, string S))

And it would be roughly equivalent to std.conv.text, but without much of 
the cruft of phobos (likely it reuses some features already in druntime, 
such as miniFormat).
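
As a rough illustration of the two translations and of what idup might 
do, here is a hand-written sketch. The interp struct and the idup body 
are stand-ins invented for this example (idup simply forwards to 
std.conv.text), not the DIP's actual definitions, and foo and bar are 
hypothetical functions.

```d
import std.conv : text;
import std.stdio : writeln;

// Hypothetical stand-in for the DIP's interp template: the literal text
// is carried as a compile-time template argument.
struct interp(string s)
{
    string toString() const { return s; }
}

// Hypothetical stand-in for the proposed idup: joins everything into one
// GC-allocated string. Assumed here to behave like std.conv.text.
string idup(Args...)(Args args) if (is(Args[0] : interp!S, string S))
{
    return text(args);
}

// Accepts the expanded form (literal, expression, literal).
string foo(Args...)(Args args) if (is(Args[0] : interp!S, string S))
{
    return "expanded form, " ~ text(Args.length) ~ " arguments";
}

// Accepts only a plain string, so the expanded form cannot match.
string bar(string s)
{
    return "string: " ~ s;
}

void main()
{
    string name = "Walter";

    // foo(i"Hello, ${name}") would translate to the expanded form:
    writeln(foo(interp!"Hello, "(), name, interp!""()));

    // bar(i"Hello, ${name}") would not match that way, so it would
    // translate to the idup form instead:
    writeln(bar(idup(interp!"Hello, "(), name, interp!""())));
}
```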

>  > requires the GC
> 
> D needs to move away from such constructs.

First, the DIP only requires the GC if idup is used. SOME form of 
allocation is needed.

Follow the logic: you need a string from a set of arguments. This set of 
arguments is only knowable at runtime. Therefore you need a runtime 
allocation to hold the resulting string. Where should that allocated 
space come from?

There is no possible string interpolation feature that results in an 
actual string that can be done without either adding a new allocation 
scheme to the language (i.e. reference counting), or using the GC.

And it was very clear from the previous review that a string 
interpolation feature that cannot simply be assigned to or used as a 
string is a failed feature.

> 
>  > interp and idup
> 
> Not clear when interp is called and when idup is called.

See overload blurb below.

> 
>  > With proper library definitions, if usage of a string interpolation 
> is an error, this DIP does not specify the language of the error 
> condition. It is our preference that the resulting error of the idup 
> call is emitted instead of the failed sequence match.
> 
> Finish this rather than hand wave.

I can do this even though it's an implementation detail.

> 
>  > functions which accept interp literals
> 
> what are "interp literals" ?

That should say InterpolationLiterals as defined in the description. 
It's an instantiation of the `interp` struct.

>  > Because the type interp!"..." is not implicitly convertible to any 
> other type
> 
> Why wouldn't it be?

I don't understand the question. D does not allow implicit conversion of 
library types without either alias this or inheritance.
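
A small sketch of that rule, using a hypothetical interp stand-in and a 
second made-up type (Convertible) that opts in through alias this. 
Neither is taken from the DIP.

```d
// Hypothetical stand-in for the DIP's interp template.
struct interp(string s)
{
    string toString() const { return s; }
}

// A made-up type that opts in to implicit conversion via alias this.
struct Convertible
{
    string value;
    alias value this;
}

void takesString(string s) {}

void main()
{
    // Having a toString member does not make a struct convertible.
    static assert(!is(interp!"hi" : string));
    static assert(!__traits(compiles, takesString(interp!"hi"())));

    // alias this is what enables the implicit conversion.
    static assert(is(Convertible : string));
    takesString(Convertible("hi"));
}
```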

> 
>  > This design is intentional to trigger the implicit idup call whenever 
> it is used for conventional string-accepting functions.
> 
> I don't know how this might fit in with overload resolution.

See my blurb about overload resolution below.

> 
>  > "Best effort" functions
> 
> I don't know what the definition of "best effort" is when applied to a 
> function.

Functions that accept any and all types of arguments, like writeln, and 
use a best effort to do something with them. These will never trigger 
the idup rewrite, which is why I talk about them in the DIP.

You may roughly define a best effort function as one that accepts a 
vararg template parameter, and has no template constraints related to 
that list.
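
For instance, a function shaped like the following would count as best 
effort under that rough definition. This is only a sketch: bestEffort 
is a hypothetical name and interp is a stand-in for the DIP's type.

```d
import std.conv : text;
import std.stdio : writeln;

// Hypothetical stand-in for the DIP's interp template.
struct interp(string s)
{
    string toString() const { return s; }
}

// "Best effort": a variadic template with no constraint on the list.
string bestEffort(Args...)(Args args)
{
    return text(args);
}

void main()
{
    string name = "Walter";

    // The expanded form of i"Hello, ${name}" matches directly, so the
    // idup rewrite would never be attempted for this call.
    writeln(bestEffort(interp!"Hello, "(), name, interp!""()));
}
```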

> 
>  > What became clear as the prior version was reviewed was that the 
> complexity of specifying format while transforming into a parameter 
> sequence was not worth adding to the language.
> 
> I didn't think that was the conclusion. This DIP is much more complicated.

I disagree. This DIP is much simpler to use. It may be more complicated 
to implement, but that doesn't matter to the user of the language.

The overload resolution is likely the only truly complex part to 
implement, since the rules are not easy to fit into the existing ones. 
The translation of the literal to InterpolationLiterals and expressions 
should actually be simpler than in the previous DIP because no 
formatting specifiers are involved.

> 
>  > Because the interp template type will provide a toString member, it 
> will pass properly to functions such as writeln or text and work as 
> expected without any changes to the existing functions.
> 
> It won't work generally, however:
> 
>      void foo(string);
>      struct S { string toString(); }
>      void test() { S s; foo(s); }
> 
> fails.

I'm not sure if you understand the point of the statement. Functions 
such as writeln or text will work with interpolation literals. There is 
no attempt to say that it works with all functions, or that functions 
which accept strings will work with all types that define a toString member.

However, this will work with your foo and S above:

void test() { S s; foo(i"${s}"); }
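
Written out by hand, the idup form of that call could look like the 
sketch below. The interp and idup definitions are hypothetical 
stand-ins, and S is given a toString body ("an S") purely so the 
example can run.

```d
import std.conv : text;

// Hypothetical stand-ins for the DIP's interp and idup.
struct interp(string s)
{
    string toString() const { return s; }
}

string idup(Args...)(Args args) if (is(Args[0] : interp!S, string S))
{
    return text(args);
}

// The types from the quoted example, with bodies added for illustration.
struct S
{
    string toString() const { return "an S"; }
}

string foo(string s) { return s; }

void main()
{
    S s;

    // The quoted example: S does not convert to string, so this fails.
    static assert(!__traits(compiles, foo(s)));

    // Hand-written idup form of foo(i"${s}"): the string is built from
    // S.toString, and foo receives an ordinary string.
    assert(foo(idup(interp!""(), s, interp!""())) == "an S");
}
```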

> 
>  > To pass two sequential interpolation strings to a function that 
> accepts interpolation strings, concatenation is not needed—separating 
> the string literals by a comma will suffice.
> 
> This will have weird consequences for overloading, i.e. distinguishing 
> one combined argument from two distinct arguments.

Identifying specific weird consequences would be most helpful.

>  > The complete specification of these translations is left up to the 
> eventual implementors and language maintainers.
> 
> In my experience, doing the detail design of things often reveals a 
> fatal flaw.

We are willing to write a library implementation for discussion. But the 
actual implementation does not affect the DIP. We are 100% confident an 
implementation of idup is possible (simply for the fact that 
std.conv.text exists).

> 
>  > Compiler implementation
> 
> This section appears to confuse a definition of the feature with its 
> implementation. It really should be labeled "Overload Resolution".

OK, thank you for giving me the correct term! And also, this is a better 
frame of view than what I originally wrote from. See my new suggestion 
below.

> 
> I am totally confused why it refers to InterpolationString for matching 
> purposes, and yet says InterpolationSequence and InterpolationLiteral 
> are used for function overloading. Can't have it both ways.

I'll make sure this is clearer.

> 
>  > with no further attempt to rewrite the sequence.
> 
> Does that mean there are multiple rewrites under other conditions?

No. The point of this clarification is that the idup rewrite itself is 
still a function call that goes through the overload rules. I do not 
want to get into a recursive situation in the compiler where it tries 
foo(<expanded form>) then foo(idup(<expanded form>)), which for some 
reason doesn't match, and then tries foo(idup(idup(<expanded form>))) etc.

The idup rewrite should contain no further possibility of rewriting.

> 
>  > In the case where it does not match, the InterpolationString will be 
> rewritten as a call to a druntime library function named idup.
> 
> Does this imply a two-pass approach to overload resolutions? Try and 
> fail, then try again with rewrites?

Yes, that was my intention. But the rewrite is only tried if the 
expanded form has no match (for function and template argument lists).

> 
>  > If multiple InterpolationString tokens are used in a parameter list, 
> the call must match for the resulting expansion of all 
> InterpolationString tokens, or the entire expression will fail to match.
> 
> Which expansion, as there are two different expansions?

If you pass multiple interpolation string parameters into a function, 
then either all must be expanded or all must be rewritten to idup calls. 
A mix of rewritten and expanded forms cannot match.

> 
> What about variadic parameters? Lazy parameters?

Good point on variadic parameters. We think they should not match the 
expanded form. The point here is that the function is likely not 
equipped to handle these things, and so passing a string instead will be 
more compatible. If you want to match the expanded form, you must use a 
variadic template.

Are there different overload rules for lazy parameters? I would expect:

void foo(lazy string s)
void bar(Args...)(lazy Args args) if (is(Args[0] : interp!S, string S))

to both accept string interpolation literals the same as the non-lazy 
equivalents would.

> 
> No examples given of trivial and non-trivial overload matches 
> illustrating each step of this process.

I will add this.

> 
> The reason I'm being pedantic on the overloading is we've done hand-wavy 
> overload rules before (alias this, cough cough) and eventually found out 
> it was unworkable.

I'm sorry for not being more detailed here. I am not experienced in the 
underlying details of overloads. In particular I would like to know 
cases that break this scheme either by making something not match when 
it should, or by using a different mechanism than expected.

I can appreciate the point of view from the compiler side, and it's 
something we lack experience in. I am mostly focused on 
usability. I want to get it right, so that it's feasible to implement, 
whatever that takes.

-- Redo using Overloading instead of Compiler Implementation

I propose that instead of discussing the compiler implementation (that 
clearly was a mistake), the DIP should discuss the usage within the 
context of the existing overload rules.

Here is what I would propose:

1. If an InterpolationString token appears anywhere other than an 
argument to a function call or template, the idup rewrite is always 
done. This includes assert and mixin.
2. If an InterpolationString token appears in an argument list to a 
function or template, the compiler shall try overloads with the 
InterpolationString token expanded into InterpolationLiteral and 
Expression data. If there are any matches to the call, overload 
resolution proceeds as normal, and no rewrite is performed.
3. If no matches are found in step 2, then the compiler retries the 
overload search, substituting for each InterpolationString a call to 
idup on its expanded sequence.

I will have to come up with a list of examples to clarify.
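
In the meantime, steps 2 and 3 can be roughly emulated in library code 
for a single literal. This is only a sketch of the intended behavior, 
not of how a compiler would implement it: callWith, takesString and 
takesSeq are hypothetical names, and interp and idup are stand-ins.

```d
import std.conv : text;
import std.stdio : writeln;

// Hypothetical stand-ins for the DIP's interp and idup.
struct interp(string s)
{
    string toString() const { return s; }
}

string idup(Args...)(Args args) if (is(Args[0] : interp!S, string S))
{
    return text(args);
}

// Hypothetical helper emulating steps 2 and 3: try the expanded sequence
// first, and fall back to a single idup call only if nothing matches.
auto callWith(alias fn, Args...)(Args seq)
{
    static if (__traits(compiles, fn(seq)))
        return fn(seq);         // step 2: the expanded form matches
    else
        return fn(idup(seq));   // step 3: one rewrite, never repeated
}

// Only accepts a string, so the expanded form cannot match.
string takesString(string s)
{
    return "string: " ~ s;
}

// Accepts the expanded sequence directly.
string takesSeq(Args...)(Args args) if (is(Args[0] : interp!S, string S))
{
    return "sequence of " ~ text(Args.length);
}

void main()
{
    string name = "Walter";

    // Both calls stand in for fn(i"Hello, ${name}").
    writeln(callWith!takesString(interp!"Hello, "(), name, interp!""()));
    writeln(callWith!takesSeq(interp!"Hello, "(), name, interp!""()));
}
```

If the idup form does not match either, the fallback branch simply 
fails to compile, which mirrors the rule that no further rewrite is 
attempted.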

-Steve

