Discussion Thread: DIP 1036--String Interpolation Tuple Literals--Community Review Round 2

Sun Jan 31 15:34:12 UTC 2021

On 1/31/21 1:24 AM, Walter Bright wrote:
> On 1/29/2021 7:06 AM, Steven Schveighoffer wrote:
>> The point of this is, we don't want to fit into printf-style formatting,
> 
> That is abundantly clear :-)
> 
> But it is a test on how powerful the feature is and where its limits are.

Yes, and it passes the test quite clearly -- it is possible to write a 
formatting function with this DIP.

>> DIP1027's syntax is designed to fit with a few existing functions.
> 
> More accurately, it is designed to fit with an enormous amount of 
> existing use. The printf format pattern is very common with other 
> functions - dmd's source code has many examples of it.

But it does not work with printf for strings. Only writef and format 
provide full compatibility.

And besides, this is not a feature that is only to be used by 
DMD and its style of code. The "parse a string to figure out what to 
do" approach is vastly inferior to introspection. Making decisions on 
what to do and how to do it at compile time is D's specialty. Why should 
we give up that power for a crippled runtime string (which can be 
manipulated at will by the function caller)?
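
To illustrate the introspection point, here is a minimal sketch of a function that decides what to do with each argument from its type at compile time, with no format string to parse. The name `render` is hypothetical and is not part of either DIP; the arguments are passed by hand rather than produced by an interpolation literal.

```d
import std.conv : to;

// Hypothetical helper: builds text from a list of pieces by inspecting each
// argument's type at compile time instead of parsing a format string.
string render(Args...)(Args args)
{
    string result;
    foreach (arg; args) // unrolled at compile time, one copy per argument
    {
        static if (is(typeof(arg) == string))
            result ~= arg;            // strings are appended as-is
        else static if (is(typeof(arg) : long))
            result ~= arg.to!string;  // integral values get their own branch
        else
            static assert(false, "unsupported type: " ~ typeof(arg).stringof);
    }
    return result;
}

unittest
{
    assert(render("x = ", 42, ", name = ", "Bob") == "x = 42, name = Bob");
}
```

An unsupported argument type is rejected when the call is compiled, whereas a mismatched specifier inside a runtime format string can only be caught when the string is parsed.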

> DIP1027 actually had zero knowledge of printf formatting other than %s 
> being the default format, which is shared with writefln. This was very 
> intentional in its design.

%s is wrong for all standard D types when talking about printf. DIP1027 
works with writef, and I consider that to be the goal, not printf.
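
For reference, a small sketch of why %s does not fit a D string in the C formatting functions: a D string is a length plus a pointer and is not guaranteed to be null-terminated, so the usual workaround is `%.*s` with the length passed as an int. The function name `greet` is made up for this example, and `snprintf` is used instead of `printf` only so the result can be checked.

```d
import core.stdc.stdio : snprintf;

// Formats a D string into a caller-supplied buffer using the C formatter.
// "%.*s" consumes two arguments: the length (as int) and then the pointer.
int greet(char[] buf, string name)
{
    return snprintf(buf.ptr, buf.length, "Hello, %.*s!",
                    cast(int) name.length, name.ptr);
}

unittest
{
    char[64] buf;
    string full = "Walter Bright";
    string first = full[0 .. 6]; // a slice: no null terminator after "Walter"
    int n = greet(buf[], first);
    assert(buf[0 .. n] == "Hello, Walter!");
}
```

Passing `first.ptr` with a plain %s would keep reading past the end of the slice until it found a zero byte.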

> 
> 
>> It puts the burden on the user to make sure they understand how the 
>> rewrite will happen, and which functions to use to do this.
> 
> The rewrite is far simpler than #DIP1036's is, and no functions are used 
> to do it. Note how short DIP1027 is.

How is the rewrite simpler? DIP1036 does not have to parse format 
specifiers. It produces arguments intuitively in the order provided 
instead of rearranging all of them and providing a 
runtime-only-accessible blueprint on what to do with them.

>> It is not forgiving of mistakes, which will compile and do the wrong 
>> thing.
> 
> Since D now does printf format checking, format mistakes will no longer 
> compile. (We should have added that to D a decade ago! Dang I like that 
> feature.)

You know who doesn't care about that feature? The 99% of D developers 
that don't use printf.

You once said that Java IDEs that provide shortcuts to write boilerplate 
for you were a failure compared to templates. I'm saying that same thing now.

This does not help with any other kind of formatting. It doesn't help 
with writef. It doesn't help with mysql. It's a very specific feature. 
The D language is not made to cater to printf. And if DIP1027 is an 
example of that, it's a poor example as it doesn't work with strings.

> 
> 
>> But I don't want to rehash DIP1027 here, that discussion has already 
>> happened.
> 
> Since #DIP1027 was rejected, it is necessary for #DIP1036 to be a 
> substantial improvement over it, otherwise we just go sideways. 
> Comparisons are fair.

DIP1027 and DIP1036 have substantially different goals. DIP1027 is a 
mechanism to make calls to format-blueprint style functions possible to 
write in a different way. DIP1036 intends to make it possible to hook 
string interpolation and prevent incorrect usage, and, where no hook 
exists, just convert it to a string.

But if you want to compare them, I can give you a list:

* printf
   DIP1036: Not usable out of the gate. Works with wrapper, which can 
provide introspection (no formatting specifiers needed, but are supported).
    DIP1027: Works with some types, but not strings. Provides no 
introspection. Easy to get wrong (though special compiler magic can 
prevent some errors). No overloads possible to correct these deficiencies.
* writef / format
   DIP1036: Can be used, but will convert to a string first. With an 
overload, full support for everything with compile-time error checking 
of formats.
   DIP1027: Works as long as you don't insert extra or incorrect format 
specifiers, and as long as the interpolation string is the first 
parameter, and there are no other parameters. No compiler magic to check 
format specifiers. No overloads possible to correct these deficiencies.
* sql "printf style" functions
   DIP1036: Can be used as a string only out of the gate. With an 
additional overload, can provide a rich intuitive experience (see DIP 
for examples).
   DIP1027: All placeholders must be spelled out, no introspection of 
data to form SQL query. No compiler magic to check for proper 
placeholders. Format placeholders not checkable at compile time. No 
overloads possible to correct these deficiencies.
* string assignment/argument
   DIP1036: done automatically, or explicitly with a minimal druntime 
function. Or one may use std.conv.text directly.
   DIP1027: available with a call to format with Phobos. No compiler 
magic verification of formatting specification - only checkable at 
runtime. No overload possible to correct these deficiencies.
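
As a rough sketch of what an SQL-style overload could do once literal text and values can be told apart by type: the `Lit` and `Prepared` types and the `prepare` function below are hypothetical stand-ins, not the lowering DIP1036 specifies, and the arguments are written out by hand instead of coming from an interpolation literal.

```d
import std.conv : to;

// Hypothetical wrapper marking a literal fragment of the query text.
struct Lit { string text; }

// Hypothetical result: query text with "?" placeholders plus bound values.
struct Prepared { string sql; string[] params; }

Prepared prepare(Args...)(Args args)
{
    Prepared p;
    foreach (arg; args) // unrolled at compile time, one copy per argument
    {
        static if (is(typeof(arg) == Lit))
            p.sql ~= arg.text;          // literal text is copied verbatim
        else
        {
            p.sql ~= "?";               // every value becomes a placeholder
            p.params ~= arg.to!string;  // and is bound separately
        }
    }
    return p;
}

unittest
{
    int id = 7;
    string col1 = "a'b";
    auto p = prepare(Lit("UPDATE tbl SET col1="), col1, Lit(" WHERE id = "), id);
    assert(p.sql == "UPDATE tbl SET col1=? WHERE id = ?");
    assert(p.params == ["a'b", "7"]);
}
```

Because the placeholders are generated from the argument list, the caller never spells them out and the value text never ends up inside the query string.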

Compare usage:

DIP1027 printf(i"Hello, %.*s${}(cast(int)name.length)${}(name.ptr), how 
are you today?\n");
DIP1036 iprintf(i"Hello, ${name}, how are you today?\n"); // requires 
shim that is betterC compatible

DIP1027 writef(i"Hex value of x is ${%x}(x)");
DIP1036 writef(i"Hex value of x is %x${x}"); // requires overload

DIP1027 writeln(i"Hello, $name");
DIP1036 writeln(i"Hello, ${name}");

DIP1027 mysql_exec(i"UPDATE tbl SET col1=${?}(col1), col2=${?}(col2), 
col3=${?}(col3) WHERE id = ${?}(id)");
DIP1036 mysql_exec(i"UPDATE tbl SET col1=${col1}, col2=${col2}, 
col3=${col3} WHERE id = ${id}"); // requires overload

DIP1027 throw new Exception(i"Error, name and age are not valid (name = 
$name, age = $age)");
DIP1036 throw new Exception(i"Error, name and age are not valid (name = 
${name}, age = ${age})");

Bonus: all of the above would compile with DIP1027. Spot the runtime 
issues with them.

> 
>> DIP1036 is designed to be used by library writers to provide a 
>> mechanism to distinguish and handle properly interpolation strings 
>> ONLY when intended,
> 
> The fallback to matching string parameters doesn't fit that, and isn't 
> different from DIP1027 in that regard.

The idup rewrite allows usage of string interpolations as strings where 
library writers have not written overloads to accept the expanded form. 
It doesn't change the fact that library writers are *provided* a mechanism.

In other words, the expanded form shouldn't match where it wasn't 
intended. And the string form is usable by users where strings are intended.

DIP1027 is different in that it uses common existing types that match 
many existing functions (even ones that are not intended to).

>> and do it with little effort, all while also providing the user a 
>> mechanism to seamlessly convert normal data into string data.
> 
> It pains me to say this, but would I use #DIP1036 in my own code? No. It 
> adds too many layers of abstraction, is hard to document, hard to 
> remember, adds special new rules for overloading, is unclear when it 
> uses the GC, I have to write wrappers to use it, and the user-facing 
> part just doesn't look good.

So you would prefer:

string s = format("%s: %s", name, val);

over

string s = i"${name}: ${val}";

It's easy to know when the GC is used. If it makes a string, the GC is 
used. If not, the GC is not (unless the function accepting the 
parameters does it).
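
A small sketch of that distinction, using hypothetical helper names (`joinInto`, `joinAlloc`) and hand-written argument lists: a function that consumes the pieces directly can be marked @nogc, while one that builds a new string allocates.

```d
// Copies string pieces into a caller-supplied buffer. The @nogc attribute
// makes the compiler reject any GC allocation inside this function.
size_t joinInto(Args...)(char[] buf, Args pieces) @nogc nothrow
{
    size_t pos;
    foreach (piece; pieces)
    {
        static assert(is(typeof(piece) : const(char)[]),
                      "only strings are handled in this sketch");
        buf[pos .. pos + piece.length] = piece[];
        pos += piece.length;
    }
    return pos;
}

// Building a new string concatenates with ~, which allocates from the GC,
// so this function cannot be marked @nogc.
string joinAlloc(Args...)(Args pieces)
{
    string s;
    foreach (piece; pieces)
        s ~= piece;
    return s;
}

unittest
{
    char[32] buf;
    string name = "val";
    auto n = joinInto(buf[], name, ": ", "42");
    assert(buf[0 .. n] == "val: 42");
    assert(joinAlloc(name, ": ", "42") == "val: 42");
}
```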

How do you know whether a function accepts it as the expanded form? You 
read the documentation.

Do you have to write wrappers to use it? No. You can use it purely as a 
string builder.

> DIP1027 was a simple lexical rewrite - easy to remember, easy to 
> document, sensible defaults, no allocations, no function calls, no 
> overloading, no wrappers. It wasn't perfect, but simplicity and 
> predictability are huge advantages.

It is simple, unintuitive (the parameters are rearranged, placeholder 
defaults have nothing to do with which function is called), and usable 
only in a select few functions (which does not include printf IMO). 
Writing proper functions to accept them takes a perfectly usable 
ordering of string and expression data, and hides it behind a 
runtime-only accessible string. One which the user can override anywhere 
he wants, and only with printf does he get a compiler error for 
incorrect specifiers.

This is a very poor mechanism for a language built on top of 
introspection and metaprogramming.

> P.S. I used to write macros in C named "printf" that would muck about 
> with the arguments, add some logic, then call the real printf. After a 
> few years I got tired of them, and put them in a bag with some rocks and 
> threw it in the swamp.

I'm not sure we can get past this if you keep ascribing completely 
irrelevant anecdotes to this DIP. The printf shim I wrote is not *even 
close* to a C macro. And it's completely unnecessary for this DIP's 
acceptance. This DIP is not aimed at printf *at all*.

-Steve