Interpolated strings and SQL
Steven Schveighoffer
schveiguy at gmail.com
Tue Jan 9 22:30:26 UTC 2024
At the end of the day, DIP1027 is an improvement of `writef`, and
`writef` only (not even `printf` works correctly). The
interpolation DIP Atila is writing (I'll call it IDIP) supports
all manner of interpolated transformations, efficiently and
effectively, with proper compiler checks.
Let's go through the points made...
On Monday, 8 January 2024 at 23:06:40 UTC, Walter Bright wrote:
> Here's how SQL support is done for DIP1036:
>
> https://github.com/adamdruppe/interpolation-examples/blob/master/lib/sql.d
>
...
> This:
>
> 1. The istring, after converted to a tuple of arguments, is
> passed to the `execi` template.
Yes, and with an explicit type to be matched against, enabling
overloading. Note that `execi` could be called the same thing as
the normal execution function, and then users could use whatever
form they prefer -- sql string + args or istring. It's a seamless
experience.
Compare to DIP1027 where you can accidentally use the wrong form
with string args.
> 2. It loops over the arguments, essentially turing it
> (ironically!) back into a format
> string. The formats, instead of %s, are ?1, ?2, ?3, etc.
There is no formatting, sqlite does not have any kind of format
specifiers.
No, it is not "turned back" into a format string, because there
was no format string to begin with. The sql is *constructed*
using the given information from the compiler clearly identifying
which portions are sql and which portions are parameters.
And the SQL query is built at compile time, not runtime (as
DIP1027 *must do*). This incurs no memory allocations at runtime.
> 3. It skips all the Interpolation arguments inserted by DIP1036.
Sure, those are not necessary here. Should be a no-op, as no data
is actually passed.
> 4. The remaining argument are each bound to the indices 1, 2,
> 3, ...
Yes.
> 5. Then it executes the sql statement.
Yes.
>
> Note that nested istrings are not supported.
Note that nested istrings can be *detected*.
And they are not supported *as explicitly specified*! This is not
a defect or limitation but a choice of the particular example
library.
Noting this "limitation" is like noting the limitation that `void
foo(int)` can't be called with a `string` argument.
> Let's see how this can work with DIP1027:
>
> ```d
> auto execi(Args...)(Sqlite db, Args args) {
> import arsd.sqlite;
>
> // sqlite lets you do ?1, ?2, etc
>
> enum string query = () {
> string sql;
> int number;
> import std.conv;
> auto fmt = args[0];
> for (size_t i = 0; i < fmt.length, ++i)
> {
> char c = fmt[i];
> if (c == '%' && i + 1 < fmt.length && fmt[i + 1] ==
> 's')
> {
> sql ~= "?" ~ to!string(++number);
> ++i;
> }
> else if (c == '%' && i + 1 < fmt.length && fmt[i +
> 1] == '%')
> ++i; // skip escaped %
> else
> sql ~= c;
> }
> return sql;
> }();
> ```
As mentioned several times, this fails to compile -- an enum
cannot be built from the runtime variable `args`.
Now, you can just do this *without* an enum, and yes, it will
compile, build a string at runtime, and you are now at the mercy
of the user to not have put in specialized placeholder (poorly
named as a "format specifier" in DIP1027 because it is solely
focused on writef). No compiler help for you!
To put it another way, you have given up complete control of the
API of your library to the compiler and the user. Instead of
understanding what the user has said, you have to guess.
And BTW, this is valid SQL:
```sql
SELECT * FROM someTable WHERE fieldN LIKE '%something%'
```
Which means, the poor user needs to escape `%` in a way
completely unrelated to the sql language *or* the istring
specification, something that IDIP doesn't require.
This is a further burden on the user that is wholly unnecessary,
just because DIP1027 decided to use `%s` as "the definitive
~~placeholder~~ format specifier".
> ```d
> auto statement = Statement(db, query);
> int number;
> foreach(arg; args[1 .. args.length]) {
> statement.bind(++number, arg);
> }
>
> return statement.execute();
> }
> ```
> This:
>
> 1. The istring, after converted to a tuple of arguments, is
> passed to the `execi` template.
A tuple with an incorrect parameter that needs runtime
transformation and allocations.
> 2. The first tuple element is the format string.
> 3. A replacement format string is created by replacing all
> instances of "%s" with
> "?n", where `n` is the index of the corresponding arg.
SQL doesn't use format strings, so the parameter must be
transformed at runtime using memory allocations.
And it does this without knowing whether the "%s" came from the
"format string" or from a parameter.
Not to mention the user can pass in other "format specifiers" at
will.
> 4. The replacement format string is bound to `statement`, and
> the arguments are bound
> to their indices.
Maybe. sqlite frowns upon mismatching arguments because the
library decided your search string was actually a placeholder in
some unrelated domain specific language (the language of
`writef`).
> 5. Then it executes the sql statement.
Maybe.
>
> It is equivalent.
It is most certainly not. The two are only slightly comparable.
IDIP is a mechanism for an SQL library author (and many other
domains, see Adam's repository) to effectively and gracefully
consume succinct and intuitive instructions from a user to avoid
SQL injections, and use the compiler to weed out problematic
calls.
Whereas DIP1027 is a loaded footgun which is built for `writef`
that can be shoehorned into an SQL lib, which necessitates
allocations and all checks are done at runtime.
-Steve
More information about the Digitalmars-d
mailing list