Interpolated strings and SQL

Steven Schveighoffer schveiguy at gmail.com
Tue Jan 9 22:30:26 UTC 2024


At the end of the day, DIP1027 is an improvement of `writef`, and 
`writef` only (not even `printf` works correctly). The 
interpolation DIP Atila is writing (I'll call it IDIP) supports 
all manner of interpolated transformations, efficiently and 
effectively, with proper compiler checks.

Let's go through the points made...

On Monday, 8 January 2024 at 23:06:40 UTC, Walter Bright wrote:
> Here's how SQL support is done for DIP1036:
>
> https://github.com/adamdruppe/interpolation-examples/blob/master/lib/sql.d
>

...

> This:
>
> 1. The istring, after converted to a tuple of arguments, is 
> passed to the `execi` template.

Yes, and with an explicit type to be matched against, enabling 
overloading. Note that `execi` could be called the same thing as 
the normal execution function, and then users could use whatever 
form they prefer -- sql string + args or istring. It's a seamless 
experience.

Compare to DIP1027 where you can accidentally use the wrong form 
with string args.

> 2. It loops over the arguments, essentially turing it 
> (ironically!) back into a format
> string. The formats, instead of %s, are ?1, ?2, ?3, etc.

There is no formatting, sqlite does not have any kind of format 
specifiers.

No, it is not "turned back" into a format string, because there 
was no format string to begin with. The sql is *constructed* 
using the given information from the compiler clearly identifying 
which portions are sql and which portions are parameters.

And the SQL query is built at compile time, not runtime (as 
DIP1027 *must do*). This incurs no memory allocations at runtime.

> 3. It skips all the Interpolation arguments inserted by DIP1036.

Sure, those are not necessary here. Should be a no-op, as no data 
is actually passed.

> 4. The remaining argument are each bound to the indices 1, 2, 
> 3, ...

Yes.

> 5. Then it executes the sql statement.

Yes.

>
> Note that nested istrings are not supported.

Note that nested istrings can be *detected*.

And they are not supported *as explicitly specified*! This is not 
a defect or limitation but a choice of the particular example 
library.

Noting this "limitation" is like noting the limitation that `void 
foo(int)` can't be called with a `string` argument.

> Let's see how this can work with DIP1027:
>
> ```d
> auto execi(Args...)(Sqlite db, Args args) {
>     import arsd.sqlite;
>
>     // sqlite lets you do ?1, ?2, etc
>
>     enum string query = () {
>         string sql;
>         int number;
>         import std.conv;
>         auto fmt = args[0];
>         for (size_t i = 0; i < fmt.length, ++i)
>         {
>             char c = fmt[i];
>             if (c == '%' && i + 1 < fmt.length && fmt[i + 1] == 
> 's')
>             {
>                 sql ~= "?" ~ to!string(++number);
>                 ++i;
>             }
>             else if (c == '%' && i + 1 < fmt.length && fmt[i + 
> 1] == '%')
>                 ++i;  // skip escaped %
>             else
>                 sql ~= c;
>         }
>         return sql;
>     }();
> ```


As mentioned several times, this fails to compile -- an enum 
cannot be built from the runtime variable `args`.

Now, you can just do this *without* an enum, and yes, it will 
compile, build a string at runtime, and you are now at the mercy 
of the user to not have put in specialized placeholder (poorly 
named as a "format specifier" in DIP1027 because it is solely 
focused on writef). No compiler help for you!

To put it another way, you have given up complete control of the 
API of your library to the compiler and the user. Instead of 
understanding what the user has said, you have to guess.

And BTW, this is valid SQL:

```sql
SELECT * FROM someTable WHERE fieldN LIKE '%something%'
```

Which means, the poor user needs to escape `%` in a way 
completely unrelated to the sql language *or* the istring 
specification, something that IDIP doesn't require.

This is a further burden on the user that is wholly unnecessary, 
just because DIP1027 decided to use `%s` as "the definitive 
~~placeholder~~ format specifier".

> ```d
>     auto statement = Statement(db, query);
>     int number;
>     foreach(arg; args[1 .. args.length]) {
>         statement.bind(++number, arg);
>     }
>
>     return statement.execute();
> }
> ```
> This:
>
> 1. The istring, after converted to a tuple of arguments, is 
> passed to the `execi` template.

A tuple with an incorrect parameter that needs runtime 
transformation and allocations.

> 2. The first tuple element is the format string.
> 3. A replacement format string is created by replacing all 
> instances of "%s" with
> "?n", where `n` is the index of the corresponding arg.

SQL doesn't use format strings, so the parameter must be 
transformed at runtime using memory allocations.

And it does this without knowing whether the "%s" came from the 
"format string" or from a parameter.

Not to mention the user can pass in other "format specifiers" at 
will.

> 4. The replacement format string is bound to `statement`, and 
> the arguments are bound
> to their indices.

Maybe. sqlite frowns upon mismatching arguments because the 
library decided your search string was actually a placeholder in 
some unrelated domain specific language (the language of 
`writef`).

> 5. Then it executes the sql statement.

Maybe.

>
> It is equivalent.

It is most certainly not. The two are only slightly comparable. 
IDIP is a mechanism for an SQL library author (and many other 
domains, see Adam's repository) to effectively and gracefully 
consume succinct and intuitive instructions from a user to avoid 
SQL injections, and use the compiler to weed out problematic 
calls.

Whereas DIP1027 is a loaded footgun which is built for `writef` 
that can be shoehorned into an SQL lib, which necessitates 
allocations and all checks are done at runtime.

-Steve


More information about the Digitalmars-d mailing list