DIP 1027--String Interpolation--Final Review Feedback Thread

Dennis dkorpel at gmail.com
Mon Feb 3 11:54:51 UTC 2020


On Thursday, 30 January 2020 at 09:47:43 UTC, Mike Parker wrote:
> This is the feedback thread for DIP 1027, "String 
> Interpolation".

> is the proposed feature specified in sufficient detail?

Some things are not clear to me.
- Is InterpolatedString meant to be added to 
TemplateSingleArgument?
Since it is a PrimaryExpression, it is already allowed as a 
template argument in brackets.
- How does that work? Especially regarding this:

> InterpolatedExpresssions undergo semantic analysis similar to 
> MixinExpression

The specification does not describe MixinExpressions in much 
detail:
https://dlang.org/spec/expression.html#mixin_expressions

It is unclear why interpolated strings use the same rules.
alias T = int; mixin(T, " a = 8", -3, ";"); would result in "int 
a = 8-3;" being mixed in since T becomes T.stringof and -3 
becomes "-3".
Turning arguments into strings at compile time does not make 
sense for an interpolated string though.

- The 'Concatenation' section is not very specific.
It does not mention whether the interpolated string grammar is 
parsed on the raw string literal or the escaped string literal, 
e.g. can I do `i"""\x24apples"` instead of i"$apples"?
It also does not mention the type of the resulting string literal 
when wstrings or dstrings are appended to the interpolated string.

- The way Character, CharacterNoBraces and CharacterNoParen are 
used in the grammar is ambiguous.
It allows unlimited non-{} / non-() characters, meaning $ and " 
are allowed too.
The grammar in its current form can produce any nonsensical 
string (e.g. i"i$)("$i}{"""$$$}}}{}") with near-arbitrary choice 
for FormatString and Argument placement.
If ambiguity were resolved by picking the first option, then 
rules for FormatString and Argument are unreachable.
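
To make the MixinExpression point above concrete, here is a
minimal sketch of the multi-argument mixin behaviour, assuming a
compiler recent enough to accept an argument list in mixin. The
function name mixedIn is made up for illustration.

```D
import std.stdio : writeln;

// Hypothetical helper, only here to make the result observable.
int mixedIn()
{
    alias T = int;
    // Each argument is converted to a string at compile time and the
    // pieces are concatenated, so this mixes in "int a = 8-3;".
    mixin(T, " a = 8", -3, ";");
    return a;
}

void main()
{
    assert(mixedIn() == 5);
    writeln(mixedIn());
}
```

The stringification happens entirely at compile time, which is
the part that does not seem to carry over to interpolated strings
whose arguments are runtime values.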

Suggested actions: Specify how interpolated strings work as 
template arguments, fix the grammar, and clarify the 
Concatenation section.

#############

> are edge cases, flaws, and risks identified and addressed?

It is claimed that:
> It also makes interpolated strings agnostic about what the 
> format specifications are.
> The meaning of the format specifications is unknown to the core 
> language.

This is simply false, because the question "why is %s inserted by 
default" can only be answered with "that is Phobos' format 
specifier convention". Any attempt at a format function that has 
no special meaning for % will be at a disadvantage.
Consider this example:
```D
int s = 3;
format(i" 8%s = 8%$s ");
// = format(" 8%s = 8%%s ", 3);
// = " 83 = 8%s ";
```
Here the variable s got formatted in completely the wrong place:
a %s was already there, and the %s that the interpolated string
inserted got escaped by the preceding %. The DIP identifies
"Mixing Conventional Format Arguments With Interpolated Strings"
as a limitation, but it does not address the fact that the
current design requires the programmer to pay special attention
to $, % and the format string convention. Otherwise errors might
occur, some of which cannot be detected at compile time or run
time.
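
The effect can be reproduced today without any interpolation
support by writing out the lowered call by hand. This is only a
sketch: the function names lowered and intended are made up, and
the "intended" output is my assumption about what the programmer
meant (a literal "%s", then "8%" followed by the value of s).

```D
import std.format : format;
import std.stdio : writeln;

// Hand-written stand-in for what i" 8%s = 8%$s " would lower to.
string lowered(int s)
{
    return format(" 8%s = 8%%s ", s);
}

// What was presumably meant: every literal % has to be doubled by hand.
string intended(int s)
{
    return format(" 8%%s = 8%%%s ", s);
}

void main()
{
    assert(lowered(3) == " 83 = 8%s ");
    assert(intended(3) == " 8%s = 8%3 ");
    writeln(lowered(3));
    writeln(intended(3));
}
```

Both calls compile and run without any diagnostic, so nothing
points the programmer at the mistake in the first one.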

- Nested interpolated strings are not considered. Is i"$(i"$x")" 
an error because the second " ends the literal early? Is 
i""`$(i"$x")` equal to tuple("%s", "%s", x), or is this not 
allowed? Failure / success cases of using nested interpolated
strings should be explored to determine whether they are allowed
or not.

- It is weird to me that interpolated strings are not compatible 
with the c, w and d postfixes.
It is inconsistent with every other string literal, and the claim 
that the postfixes are used rarely is not backed up by anything. 
I see ""w strings often being used with Windows API functions 
which use UTF-16.

Suggested actions: Change the design to be truly format specifier 
agnostic, specify how nested interpolated strings work, allow 
other encodings.

#############

> is there an implementation that proves the proposed feature 
> works in practice?

No. There is only an implementation of a different proposal:
https://github.com/dlang/dmd/pull/7988
Moreover, the DIP does not explore use cases apart from printf
and writef.
Many contexts that can benefit from interpolated strings 
(Exception / assertion messages, mixin, code generation for 
domain specific languages such as SQL) are not considered.

Suggested actions: Explore expected use cases and how 
interpolated strings will work in them, or let people toy with a 
prototype before settling on a final design.

#############

> does the DIP consider prior work from other languages?

A prior work section contains a link to the Wikipedia page on 
interpolated strings, and a few links to previous proposals, but 
no attempt is made to compare the proposed design with others. In 
the review summary some design goals are listed based on the 
discussion of the previous review round:

> the implementation must be compatible with BetterC, meaning 
> printf and similar C functions

Why should it work in BetterC when interpolated strings are not a 
feature in C?
Why is direct compatibility with printf needed? Is printf
actually used commonly in D code outside of dmd and BetterC code?
Why is a custom function that accepts an interpolated string and 
forwards it to printf not acceptable instead?

> the implementation must not trigger GC allocations
> the implementation must not depend on Phobos

I assume this is with reference to the idea "why no assignment to 
string".
It should be noted that nobody proposed that interpolated strings
always become a string directly, only that they may implicitly
convert to string. The counterarguments therefore don't apply:
interpolated strings could still be used in BetterC, just without
the string conversion functionality.
The requirement 'must not depend on Phobos' should also be 
motivated, for example by links to bugzilla issues with problems 
that the ^^ operator (which depends on std.math: pow) has.

> the implementation must be performant

The proposed design is notably not optimal in performance. It
does not work with the writef variant that takes the format
string at compile time, meaning the format string must be parsed
at run time, and a FormatException might get thrown.
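
As a rough illustration of the difference, using std.format's
format (which offers the same two variants as writef): the
template form rejects a mismatched format string during
compilation, while the runtime form only fails once the call
executes. The names compileTimeChecked and throwsAtRuntime are
made up for this sketch.

```D
import std.exception : collectException;
import std.format : format, FormatException;

// Compile-time format string: a missing argument does not compile.
enum bool compileTimeChecked =
    !__traits(compiles, format!"%d apples and %d bananas"(3));

// Runtime format string: the same mistake compiles fine and only
// surfaces as a FormatException when the call is executed.
bool throwsAtRuntime()
{
    auto e = collectException!FormatException(
        format("%d apples and %d bananas", 3));
    return e !is null;
}

void main()
{
    assert(format!"%d apples"(3) == "3 apples");
    assert(compileTimeChecked);
    assert(throwsAtRuntime());
}
```

Since the proposed lowering produces the format string as an
ordinary function argument, only the second form is available to
interpolated strings.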

Suggested actions: enumerate and motivate design goals in 
Rationale section, explain why proposed design best fits those 
goals. Compare other proposed designs and motivate why the DIP's 
design wins.

