The state of string interpolation...one year later

Sun Mar 17 14:20:20 UTC 2019

On Sunday, 17 March 2019 at 06:01:35 UTC, Jonathan Marler wrote:
> It's been about a year since I submitted an implementation for 
> interpolated strings:
>
> https://github.com/dlang/dmd/pull/7988
>
> In that time, various people have been popping up asking about 
> it. There has been a lot of discussion around this feature on 
> the forums, and the place we left off was with Andrei saying 
> that we should:
>
> * continue to explore alternative library solutions
> * focus on improving existing features instead of adding new 
> features
>
> At the request of Andrei, I implemented a small library 
> solution as well (https://github.com/dlang/phobos/pull/6339) 
> but the leadership never followed up with it.  And that's ok, 
> they only have so much time and they need to prioritize how 
> they feel is best.
>
> With that, I read through some discussion and thought it could 
> be helpful to summarize my thoughts on the matter since people 
> continue to ask questions about it.
>
> In my mind, there's really only one reason for string 
> interpolation...
>
>     Better Syntax
>
> In many ways syntax isn't that important.  There's a lot of 
> subjectivity around it, but sometimes a change can make it 
> objectively better.  Any time you can make syntax objectively 
> better, you're making code easier to read, write and maintain.  
> Better syntax means it's easier to write "correct code" and 
> harder to write "incorrect code".
>
> I recall Atila arguing that the syntax without string 
> interpolation wasn't that bad. Then he provided this example 
> (https://forum.dlang.org/post/jahvdekidbugougmyhgb@forum.dlang.org):
>
>     text("a is", a, ", b is ", b, " and the sum is: ", a + b)
>
> Ironically, his example had a mistake, but it was hard to 
> notice. Look at the same example with string interpolation:
>
>     text("a is$a, b is $b and the sum is: $(a + b)")
>
> You could say that "better syntax" is one of the main reasons D 
> exists.  It's the main if not the only reason for a lot of 
> features like UFCS and foreach.
>
> Andrei's biggest critique is that we should first try to 
> implement this in a library...and he's completely right to ask 
> that question.  The problem is that over the years, no one's 
> been able to achieve a library solution that results in a nice 
> syntax.  Having a poor syntax is a bad sign for a feature that 
> only exists to improve syntax.  However, even if we could make 
> the syntax better, there are still a handful of reasons why a 
> library solution can't measure up to language support.  
> Including the "poor syntax", the following are the 5 CONS I see 
> with a library solution:
>
> CON 1. The syntax is "not nice".  This defeats the entire point 
> of interpolated strings. Saying that interpolated strings 
> aren't popular because people are not using a library for them 
> is like saying Elvis isn't popular because people don't like 
> Elvis impersonators.  People not liking a poor imitation of 
> something doesn't say anything about how they feel about the 
> genuine article.
>
> CON 2. No real error messages.  What's one of the most annoying 
> parts of mixins?  Error messages.  When you get an error in a 
> mixin, you don't get a line of code to go fix, you get an 
> "imaginary" line that doesn't exist.  With library solutions, 
> you can't point syntax errors inside interpolated strings to 
> source locations.  That information is not available to 
> library code.  When you get a string, you don't know where each 
> character inside that string originated from, only the compiler 
> knows that.
>
> CON 3. Performance.  No matter what we do, no library solution 
> will ever be as fast as a language solution. Performance is 
> especially important here because bad performance means 
> developers will have to choose between better syntax and faster 
> compilation.  We already see this today with templates and 
> mixins.  With a language implementation, developers can have 
> both.
>
> CON 4. IDE/Editor Support.  A library solution won't be able to 
> have IDE/Editor support for syntax highlighting, auto-complete, 
> etc.  With language support, when the editor sees an 
> interpolated string, it will be able to highlight the code 
> inside it just like normal code.
>
> CON 5. Full solution requires full copy of lexer/parser.  One 
> big problem with a library solution is that it will be hard for 
> a library to delimit interpolated expressions.  For full 
> support, it will need to have a full implementation of the 
> compiler's lexer/parser.  Without that, it will have 
> limitations on the kind of code that can be inside an 
> interpolated string.  Take the following (contrived) example:
>
> foreach (i; 0 .. 10)
> {
>     mixin(interp(`$( i ~ ")" ) entry $(array[i])`));
> }
>
> The library solution needs to parse that interpolated string 
> but needs to know that the right paren at `")"` is actually 
> just a string literal inside the expression and not a right 
> paren to delimit the end of the expression.  This is a 
> contrived example, but if you have anything less than a full 
> lexer/parser then developers are going to have a hard time 
> being able to know what can and can't go inside an interpolated 
> expression.  By having interpolated strings as a part of the 
> language, the implementation has full access to the 
> lexer/parser, so it doesn't need to force any limitation on the 
> syntax available inside interpolated string expressions.
>
> ---
>
> Now I'm not saying that the CONS of the library solution 
> justify the addition of interpolated strings to the language.  
> I focused on them because that is Andrei's main sticking point.  
> Even if everyone agrees that library solutions don't work 
> (and we can't enhance the language to make them work), we still 
> need to show that the feature is going to be popular/useful 
> enough to justify a new type of string literal.  The usefulness 
> of the feature needs to outweigh the work to support it.  The 
> more features we add to D, the more developers need to learn to 
> understand it.  That being said, I consider the implementation 
> and complexity it adds to be quite minimal (see the PR for more 
> details). As for the usefulness, I can say personally I would 
> use this feature to replace almost all my usages of 
> writefln/format and writeln, which would be a big shift for my 
> projects.  Instead of:
>
>     writefln("My name is %s and my age is %s and my favorite hex is %s", name, age, favnum);
>
> I will be writing:
>
>     writeln(i"My name is $name and my age is $age and my favorite hex is $(favnum.formatHex)");
>
> When I generate code, instead of:
>
>     return
>         returnType ~ ` ` ~ name ~ `(` ~ type ~ ` left, ` ~ type ~ ` right)
>         {
>             return cast(` ~ returnType ~ `)(left ` ~ op ~ ` right);
>         }
>     `;
>
> It will be:
>
>     return text(iq{
>         $returnType $name($type left, $type right)
>         {
>             return cast($returnType)(left $op right);
>         }
>     });
>
> When I generate HTML documents in my cgi library, instead of:
>
>     writeln(`<html><body>
>     <title>`, title, `</title>
>     <name>`, name, `</name><age>`, age, `</age>
>     <a href="`, link, `">`, linkName, `</a>
>     </body></html>
> `);
>
> or even:
>
>     writefln(`<html><body>
>     <title>%s</title>
>     <name>%s</name><age>%s</age>
>     <a href="%s">%s</a>
>     </body></html>
> `, title, name, age, link, linkName);
>
> It will be:
>
>     writeln(i`<html><body>
>     <title>$title</title>
>     <name>$name</name><age>$age</age>
>     <a href="$link">$linkName</a>
>     </body></html>
> `);
>
> When I first saw interpolated strings I didn't immediately 
> realize the benefit of them.  Using them eliminates the problem 
> of keeping format strings in sync with arguments.  It also 
> avoids the "noise problem" you get when you alternate between 
> string literals and expressions inside a function call, i.e. 
> `writeln("a is", a, ", b is ", b)`. That pretty much sums up 
> the benefits in my mind.
>
> So what's next? I'm curious where leadership currently stands.  
> What are their thoughts on the library solutions that have been 
> presented? What do they think of the 5 CONS I've presented that 
> all library solutions will have?  What's their opinion on the 
> usefulness of the feature? For me personally, I am surprised 
> at the amount of interest this feature continues to garner.  I 
> think the feature is a net positive for D, but then again I 
> don't think it's a "make or break" feature.  Just a "nice 
> addition".  Anyway those are my thoughts. Sorry for the long 
> post.  I hope it's helpful and ultimately makes D better.

Seems you've done everything but write a DIP. I don't really see 
why this feature should be exempt from the process. Even if the 
process isn't the greatest, that isn't reason enough for this 
feature to circumvent it.
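
For anyone who wants the delimiter problem from CON 5 in the quoted post spelled out, here is a minimal sketch of how a library might search for the end of a `$(...)` expression. Both `naiveClose` and `quoteAwareClose` are hypothetical helpers written only for illustration. They are not taken from either PR, and the second one is deliberately incomplete.

```d
import std.stdio : writeln;

// Hypothetical helper: finds the ')' that closes the '(' at index `open`
// by counting parentheses only. It knows nothing about D string literals.
size_t naiveClose(string s, size_t open)
{
    int depth = 0;
    foreach (i; open .. s.length)
    {
        if (s[i] == '(')
            ++depth;
        else if (s[i] == ')' && --depth == 0)
            return i;
    }
    return s.length; // no closing paren found
}

// Hypothetical helper: same idea, but skips over "..." literals.
// Assumption: no escapes, no backtick or q{} strings, no character
// literals, no comments. A real implementation would need all of those.
size_t quoteAwareClose(string s, size_t open)
{
    int depth = 0;
    bool inString = false;
    foreach (i; open .. s.length)
    {
        immutable c = s[i];
        if (inString)
        {
            if (c == '"')
                inString = false;
        }
        else if (c == '"')
            inString = true;
        else if (c == '(')
            ++depth;
        else if (c == ')' && --depth == 0)
            return i;
    }
    return s.length; // no closing paren found
}

void main()
{
    // The interpolated string from the quoted example; index 1 is the
    // '(' that follows the first '$'.
    string s = `$( i ~ ")" ) entry $(array[i])`;

    // Paren counting stops at the ')' inside the string literal.
    assert(naiveClose(s, 1) == 8);
    writeln("naive expression:       [", s[2 .. naiveClose(s, 1)], "]");

    // Tracking "..." literals finds the intended closing paren.
    assert(quoteAwareClose(s, 1) == 11);
    writeln("quote-aware expression: [", s[2 .. quoteAwareClose(s, 1)], "]");
}
```

The quote-aware version handles this one input, but it would still be fooled by a `')'` character literal, a backtick string, a `q{}` token string or a comment containing a paren. Each of those needs another special case, which is how a library ends up reimplementing the lexer.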