interpolation proposals and safety
Adam D Ruppe
destructionator at gmail.com
Sat Dec 23 23:33:31 UTC 2023
On Saturday, 23 December 2023 at 22:55:34 UTC, Bruce Carneal
wrote:
> Is it really easy, trivial even, to button things up with
> either proposal or is one easier to use correctly than the
> other?
1027 makes it possible to do some cases correctly, but difficult
to trust in the general case since it makes no attempt at type
safety and its string cannot differentiate between user-injected
strings and format string literals.
So, when you process a 1027 style format string, and see a %, was
that part of the string or was that injected by the compiler to
indicate a param placeholder? What if the user forgets to escape
something, or passes the wrong syntax as a custom specifier?
These are all unforced errors in the design of 1027, and they led
to its DIP being rejected by community review.
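
To make the ambiguity concrete, here is a minimal sketch in
Python rather than D. It only models the idea of folding literals
and placeholders into one format string; the function name and
the ("lit"/"expr") part representation are hypothetical, and this
is not the DIP's actual lowering.

```python
# Hypothetical model of a lowering that merges everything into one
# format string. Not D code and not the DIP's actual implementation.
def lower_to_format_string(parts):
    """parts is a list of ("lit", text) or ("expr", value) pairs."""
    fmt, args = "", []
    for kind, payload in parts:
        if kind == "lit":
            fmt += payload      # literal text copied as-is, nothing escaped
        else:
            fmt += "%s"         # placeholder injected for the expression
            args.append(payload)
    return fmt, args

# One interpolated expression between two literals...
with_expr = lower_to_format_string(
    [("lit", "rate: "), ("expr", 5), ("lit", " done")])
# ...versus a single literal in which the author typed "%s" unescaped.
typed_by_hand = lower_to_format_string([("lit", "rate: %s done")])

# The two format strings are identical, so a consumer that sees only
# the string cannot tell which "%s" is a real placeholder.
print(with_expr[0] == typed_by_hand[0])
```

The only thing that differs between the two cases is the length
of the argument list, which a consumer has to cross-check by hand.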
On the other hand, 1036e corrects these flaws, while adding the
possibility for CTFE manipulation, aggregation, and verification
of all string literals passed.
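
As a rough analogy, again in Python and not the actual D
lowering, keeping literals and expressions as distinct kinds of
values removes the guesswork. The `Lit` and `Expr` classes and
both helper functions are hypothetical names chosen for this
sketch; in D the distinction would live in the type system and be
available at compile time, which Python cannot show.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lit:
    """A piece of text the programmer wrote in the literal."""
    text: str

@dataclass(frozen=True)
class Expr:
    """A value that was interpolated into the literal."""
    value: object

def placeholders(parts):
    """Count real placeholders by looking at part kinds, never at text."""
    return sum(isinstance(p, Expr) for p in parts)

def literals(parts):
    """Collect every literal piece, e.g. for validation or aggregation."""
    return [p.text for p in parts if isinstance(p, Lit)]

with_expr = [Lit("rate: "), Expr(5), Lit(" done")]
typed_by_hand = [Lit("rate: %s done")]

# The "%s" typed by hand is just text; it is never mistaken for a slot.
print(placeholders(with_expr), placeholders(typed_by_hand))
print(literals(typed_by_hand))
```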
I encourage everyone to look at the sample repository here:
https://github.com/adamdruppe/interpolation-examples/
Several of the use cases selected for that specifically
demonstrate how it gives the users the convenient syntax they
expect from string interpolation, yet actually lowers to the
correct semantics for each specialized problem domain.
Example #1, basics, shows that when a plain string is the right
result, getting one is easy.
Example #2, formatting, shows how format strings can be attached
and processed in library code, including compile-time
verification associated with the data types passed.
Example #3, printf, shows how you can adapt the advanced usage D
provides to be compatible with legacy functions in a
zero-runtime-cost manner.
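
The general shape of such an adapter can be sketched in Python.
This is an analogy only: `Lit`, `Expr`, `SPEC`, and `to_printf`
are hypothetical names, the type-to-specifier table is an
assumption, and the work here happens at run time, whereas the D
example is described as doing it with no runtime cost.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lit:
    text: str

@dataclass(frozen=True)
class Expr:
    value: object

# Assumed mapping from value type to printf-style specifier.
SPEC = {int: "%d", float: "%f", str: "%s"}

def to_printf(parts):
    """Build a printf-style format string plus its argument tuple."""
    fmt, args = "", []
    for p in parts:
        if isinstance(p, Lit):
            # Literal text is escaped automatically, so the author
            # never has to remember to write "%%".
            fmt += p.text.replace("%", "%%")
        else:
            # The specifier is chosen from the value's type.
            fmt += SPEC[type(p.value)]
            args.append(p.value)
    return fmt, tuple(args)

fmt, args = to_printf(
    [Lit("100% of "), Expr(3), Lit(" items: "), Expr("ok")])
print(fmt)         # 100%% of %d items: %s
print(fmt % args)  # 100% of 3 items: ok
```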
Example #4, internationalization, builds off the techniques shown
in the previous examples to use the industry-standard GNU gettext
library, coupled with automatic aggregation of translatable
strings at compile time, to provide full context to
non-developers to add new language packs at run time.
The next three examples are directly relevant to your question.
They address common problems web developers face, where strings
are convenient but no longer appropriate for correctness, and
where security problems are often introduced as a result.
Example #5, urls, shows how you can build off the previously
demonstrated techniques, to make a directly-manipulable
high-level object out of what looks to be a simple, familiar
string. Since it works at a high level, aware of the surrounding
context, it ensures each injected component is encoded
appropriately for that context.
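
A minimal sketch of context-aware encoding, in Python with the
standard `urllib.parse` module. It is not the repository's
implementation: `Lit`, `Expr`, and `build_url` are hypothetical,
and the "have we passed a `?` yet" rule is a deliberately crude
stand-in for real URL parsing.

```python
from dataclasses import dataclass
from urllib.parse import quote, quote_plus

@dataclass(frozen=True)
class Lit:
    text: str

@dataclass(frozen=True)
class Expr:
    value: object

def build_url(parts):
    """Encode each injected value for the URL context it lands in."""
    out, in_query = [], False
    for p in parts:
        if isinstance(p, Lit):
            out.append(p.text)
            # Once literal text contains "?", later values are query data.
            in_query = in_query or "?" in p.text
        else:
            encode = quote_plus if in_query else quote
            # safe="" also encodes "/", so a value cannot add path segments.
            out.append(encode(str(p.value), safe=""))
    return "".join(out)

url = build_url([
    Lit("https://example.com/files/"), Expr("a/b c"),
    Lit("?q="), Expr("x&y=1"),
])
print(url)  # https://example.com/files/a%2Fb%20c?q=x%26y%3D1
```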
Example #6, sql, directly avoids the trap of sql injection by
separating code and data - delegating the recombination of them
to the database engine to do it safely and correctly, yet
appearing to the user to be a convenient mixture of the two!
Notice how the usage example, at the top level of the repository,
*looks like* string interpolation, yet the implementation, in the
`lib` folder, actually binds the data to a prepared statement in
a structured way, like the guides say you are supposed to!
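
The same separation can be sketched in Python with the standard
`sqlite3` module. This is an analogy, not the code in the `lib`
folder: `Lit`, `Expr`, and `to_statement` are hypothetical names,
and the table is made up for the demonstration.

```python
import sqlite3
from dataclasses import dataclass

@dataclass(frozen=True)
class Lit:
    text: str

@dataclass(frozen=True)
class Expr:
    value: object

def to_statement(parts):
    """Turn literal/value parts into SQL text plus bound parameters."""
    sql, params = "", []
    for p in parts:
        if isinstance(p, Lit):
            sql += p.text           # code: written by the programmer
        else:
            sql += "?"              # data: replaced by a placeholder
            params.append(p.value)  # and handed over separately
    return sql, params

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT)")

hostile = "x'); DROP TABLE users; --"
sql, params = to_statement(
    [Lit("INSERT INTO users (name) VALUES ("), Expr(hostile), Lit(")")])
db.execute(sql, params)  # the engine binds the value; it is never parsed as SQL

rows = db.execute("SELECT name FROM users").fetchall()
print(sql)
print(rows)
```

The hostile input ends up stored as an ordinary name, and the
table is still there afterwards.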
Finally, example #7, html, directly avoids the trap of XSS holes
by, again, separating HTML structure from added data and ensuring
that correct encoding and valid data positioning are applied in
all contexts. With CTFE validation, it prevents common mistakes that
can manifest as bugs or exploitable holes in production, and by
working on a high level, using object representations instead of
raw strings, it ensures all semantic invariants are maintained
from creation to consumption. It goes beyond just bringing web
best practices to the D programming language - it also enables
innovation by allowing coupling of these security guidelines and
development best practices with D's unique features for static
analysis and compile-time processing.
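
The core of the idea, reduced to a small Python sketch with the
standard `html` module. It covers only one context (text inside
an element) and works on strings, so it is much weaker than the
object-based, compile-time-checked approach described above;
`Lit`, `Expr`, and `to_html` are hypothetical names.

```python
import html
from dataclasses import dataclass

@dataclass(frozen=True)
class Lit:
    text: str

@dataclass(frozen=True)
class Expr:
    value: object

def to_html(parts):
    """Keep literal markup as markup; escape every injected value."""
    out = []
    for p in parts:
        if isinstance(p, Lit):
            out.append(p.text)                    # structure from the author
        else:
            out.append(html.escape(str(p.value)))  # data, always escaped
    return "".join(out)

page = to_html(
    [Lit("<p>Hello, "), Expr("<script>alert(1)</script>"), Lit("</p>")])
print(page)  # <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;</p>
```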
Similar examples could be written for shell scripting, json, and
more, but I thought this was enough to make the point and
demonstrate the relevant patterns.
By the end of this year, when this new feature is merged, D will
cement its position as an innovating pioneer, learning the
lessons from the past and applying their best libraries in a
whole new way.
More information about the Digitalmars-d
mailing list