interpolation proposals and safety

Adam D Ruppe destructionator at gmail.com
Sat Dec 23 23:33:31 UTC 2023


On Saturday, 23 December 2023 at 22:55:34 UTC, Bruce Carneal 
wrote:
> Is it really easy, trivial even, to button things up with 
> either proposal or is one easier to use correctly than the 
> other?

1027 makes it possible to do some cases correctly, but difficult 
to trust in the general case since it makes no attempt at type 
safety and its string cannot differentiate between user-injected 
strings and format string literals.

So, when you process a 1027 style format string, and see a %, was 
that part of the string or was that injected by the compiler to 
indicate a param placeholder? What if the user forgets to escape 
something, or passes the wrong syntax as a custom specifier? 
These are all unforced errors in the design of 1027, and they led 
to its DIP being rejected by community review.
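
To make the ambiguity concrete, here is a minimal sketch in Python 
rather than D. It is an analogy, not the actual lowering of either 
proposal: `Literal`, `Value`, `lower_format_style` and 
`count_placeholders` are hypothetical names, and the sketch assumes 
the author forgot to escape a literal `%`. Once the pieces are 
flattened into one format string, a consumer can no longer tell 
which `%` came from where; while the pieces stay typed, it can.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    text: str  # text the programmer wrote in the source


@dataclass(frozen=True)
class Value:
    value: object  # runtime value of an interpolated expression


def lower_format_style(pieces):
    """Flatten to (format_string, args); literal text is NOT escaped."""
    fmt = "".join(p.text if isinstance(p, Literal) else "%s" for p in pieces)
    args = tuple(p.value for p in pieces if isinstance(p, Value))
    return fmt, args


def count_placeholders(fmt):
    """What a naive consumer sees: every '%' looks like a placeholder."""
    return fmt.count("%")


# Roughly what "100% of $(group)" would look like as typed pieces.
pieces = [Literal("100% of "), Value("users")]

fmt, args = lower_format_style(pieces)
seen_in_string = count_placeholders(fmt)                   # counts both '%'
seen_in_pieces = sum(isinstance(p, Value) for p in pieces)  # counts values only
```

The flattened string contains two `%` characters but carries only 
one argument, so the mismatch is only discoverable by guessing. The 
typed sequence never has that problem, because literal text and 
injected values are different types.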

On the other hand, 1036e corrects these flaws, while adding the 
possibility for CTFE manipulation, aggregation, and verification 
of all string literals passed.

I encourage everyone to look at the sample repository here:

https://github.com/adamdruppe/interpolation-examples/

Several of the use cases selected for that specifically 
demonstrate how it gives the users the convenient syntax they 
expect from string interpolation, yet actually lowers to the 
correct semantics for each specialized problem domain.

Example #1, basics, shows that when a plain string is the right 
result, getting one is easy.

Example #2, formatting, shows how format strings can be attached 
and processed in library code, including compile-time 
verification associated with the data types passed.

Example #3, printf, shows how you can adapt the advanced usage D 
provides to be compatible with legacy functions in a 
zero-runtime-cost manner.

Example #4, internationalization, builds off the techniques shown 
in the previous examples to use the industry-standard GNU gettext 
library, coupled with automatic aggregation of translatable 
strings at compile time, to provide full context to 
non-developers to add new language packs at run time.

The next three examples are directly relevant to your question, 
and address common problems web developers face, where security 
problems are often introduced because strings are convenient but 
no longer appropriate for correctness.

Example #5, urls, shows how you can build off the previously 
demonstrated techniques, to make a directly-manipulable 
high-level object out of what looks to be a simple, familiar 
string. Since it works at a high level, aware of the surrounding 
context, it ensures each injected component is encoded 
appropriately for that context.
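
As a rough illustration of context-aware encoding, here is a small 
Python sketch. It is not the code from the repository: `build_url` 
and the `("lit", ...)` / `("val", ...)` piece tuples are hypothetical 
stand-ins for the literal and expression parts an interpolation 
would produce, and the "have we passed a `?` yet" check is a 
deliberately simple assumption about how to detect context.

```python
from urllib.parse import quote, quote_plus


def build_url(pieces):
    """Join literal text as-is; encode each value for where it lands."""
    out = []
    in_query = False
    for kind, text in pieces:
        if kind == "lit":
            # Literal text is trusted; it also tells us which part we are in.
            in_query = in_query or "?" in text
            out.append(text)
        elif in_query:
            # Query values: '&' and '=' must not leak out as syntax.
            out.append(quote_plus(str(text)))
        else:
            # Path segments: even '/' must be encoded.
            out.append(quote(str(text), safe=""))
    return "".join(out)


url = build_url([
    ("lit", "https://example.com/files/"),
    ("val", "a/b c"),
    ("lit", "?q="),
    ("val", "x&y=1 2"),
])
```

The same value would be encoded differently depending on whether it 
lands in the path or the query, which is something a plain string 
concatenation has no way to know.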

Example #6, sql, directly avoids the trap of sql injection by 
separating code and data - delegating the recombination of them 
to the database engine to do it safely and correctly, yet 
appearing to the user to be a convenient mixture of the two! 
Notice how the usage example, at the top level of the repository, 
*looks like* string interpolation, yet the implementation, in the 
`lib` folder, actually binds the data to a prepared statement in 
a structured way, like the guides say you are supposed to!
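
The shape of that binding can be sketched in Python with the 
standard `sqlite3` module. This is an analogy, not the D 
implementation from the `lib` folder: `to_prepared` and the piece 
tuples are hypothetical, and the table and data are made up for the 
demonstration.

```python
import sqlite3


def to_prepared(pieces):
    """Literal parts become SQL text; values become bound parameters."""
    sql = "".join(text if kind == "lit" else "?" for kind, text in pieces)
    params = [text for kind, text in pieces if kind == "val"]
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

# A value that would break out of the quotes if pasted into the string.
hostile = "x' OR '1'='1"

sql, params = to_prepared([
    ("lit", "SELECT name FROM users WHERE name = "),
    ("val", hostile),
])
rows = conn.execute(sql, params).fetchall()
```

The hostile value never becomes part of the SQL text, so the 
database compares it as data and matches nothing.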

Finally, example #7 directly avoids the trap of XSS holes by, 
again, separating HTML structure from added data and ensuring 
correct encodings and valid data positioning in all contexts. 
With CTFE validation, it prevents common mistakes that 
can manifest as bugs or exploitable holes in production, and by 
working on a high level, using object representations instead of 
raw strings, it ensures all semantic invariants are maintained 
from creation to consumption. It goes beyond just bringing web 
best practices to the D programming language - it also enables 
innovation by allowing coupling of these security guidelines and 
development best practices with D's unique features for static 
analysis and compile-time processing.
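
The core separation can be sketched in a few lines of Python. This 
is much simpler than what is described above: it only escapes 
values in text position, with no object model and no compile-time 
validation, and `render_html` and the piece tuples are hypothetical 
names used only for this illustration.

```python
from html import escape


def render_html(pieces):
    """Literal parts are markup; values are always escaped as text."""
    return "".join(
        text if kind == "lit" else escape(str(text), quote=True)
        for kind, text in pieces
    )


page = render_html([
    ("lit", "<p>Hello, "),
    ("val", "<script>alert(1)</script>"),
    ("lit", "</p>"),
])
```

The injected value comes out as inert text rather than as a script 
element, because it was never treated as markup in the first place.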


Similar examples could be written for shell scripting, json, and 
more, but I thought this was enough to make the point and 
demonstrate the relevant patterns.

By the end of this year, when this new feature is merged, D will 
cement its position as an innovating pioneer, learning the 
lessons from the past and applying their best libraries in a 
whole new way.

