Indentation-aware multi-line string literals (and/or an equivalent compile-time function)

WraithGlade wraithglade at protonmail.com
Wed Feb 26 23:18:10 UTC 2025


Hello good people of the D forum!

There's an idea I've long wished was available in programming 
languages (off and on for many years whenever it occurs to me) 
that would likely be very simple to implement and yet also very 
useful in many contexts.

I'm not even aware of any programming language that has this 
feature, despite how simple and widely useful it would be, and so 
I think this is a great opportunity for improving the D 
language!

Basically, the idea is that there should be a variant of raw 
("WYSIWYG") strings (a.k.a. multi-line strings when newlines are 
contained within the string) which is aware of the indentation 
level of the code it is being used in and compensates accordingly, 
so that the programmer can write the text at the same indentation 
as the surrounding code instead of breaking the *visual* 
structure of that code.

I think perhaps it should also remove leading and trailing 
newlines (but not internal newlines), such that it is also useful 
for cleanly writing larger bodies of text into code in a way that 
doesn't look crammed in vertically either.

Here is a comparative example of one possible syntax for such:

```
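     // Traditional multi-line string syntax (ugly, jarring).
     // Declared as `enum` so the value is known at compile time,
     // which the static assert below requires.
     enum string s1 = `Line 1
Line 2
Line 3`;

     // Indentation-aware multi-line string syntax (clean).
     // This is the proposed syntax; it is not valid D today.
     enum string s2 = ``

     Line 1
     Line 2
     Line 3

     ``;

     static assert (s1 == s2);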
```

Another possible implementation that occurred to me would be that 
a compile-time-usable string function could be added that when 
appended to a string would accomplish the same effect as the 
above without any runtime overhead. That case could look like so:

```
     string s = `

     Line 1
     Line 2
     Line 3

     `.unindent;
```

(or something like that)
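
A minimal sketch of the function-based variant, for illustration only. It leans on `std.string.outdent` from Phobos, which removes the indentation shared by all lines and can be evaluated at compile time. The wrapper name `unindent` is the hypothetical one from the example above, not an existing library function, and the trimming of leading and trailing newlines is done by hand here as an assumption about the desired behavior.

```d
import std.string : outdent;

// Hypothetical helper: `unindent` is not part of Phobos.
// Assumes "\n" line endings, which is how D stores line breaks in string literals.
string unindent(string text)
{
    // Remove the indentation shared by all non-blank lines.
    string s = text.outdent;
    // Drop leading and trailing newlines; internal newlines are untouched.
    while (s.length > 0 && s[0] == '\n') s = s[1 .. $];
    while (s.length > 0 && s[$ - 1] == '\n') s = s[0 .. $ - 1];
    return s;
}

// Both values are computed at compile time (CTFE), so there is no runtime cost.
enum s1 = "Line 1\nLine 2\nLine 3";
enum s2 = `

    Line 1
    Line 2
    Line 3

    `.unindent;

static assert(s1 == s2);
```

Because `s2` is an `enum`, the call to `unindent` is forced through CTFE; assigning the result to an ordinary runtime variable would run the same function at runtime instead.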

I'm not certain which approach would be better, but I am certain 
that it would be a widely useful feature to have included with 
D's standard library and documentation since it is such a 
desirable and common use case.

**Ideas for implementation and possible nuances:**

The algorithm for interpreting the indentation of such 
indentation-aware string literals could work by scanning 
backwards from the start of the opening delimiter of the string 
to find the first non-whitespace character of that line and then 
use that to determine what the current indentation level is.

If the indentation level is ambiguous due to the presence of a 
mix of spaces and tabs then the compiler can simply report an 
error and refuse to compile that literal until it is made 
unambiguous through the absence of mixed spaces and tabs.
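
The backward scan and the mixed-whitespace check could look roughly like the sketch below. It is written as an ordinary D function over source text rather than as compiler code. The names `lineIndent`, `source`, and `delimPos` are hypothetical, and treating the leading run of spaces or tabs on the delimiter's line as the indentation level is an assumption about how the rule would be specified.

```d
// Hypothetical sketch: `source` is the full text of a D file and
// `delimPos` is the index of the literal's opening delimiter within it.
string lineIndent(string source, size_t delimPos)
{
    // Scan backwards to the start of the line that holds the delimiter.
    size_t lineStart = delimPos;
    while (lineStart > 0 && source[lineStart - 1] != '\n')
        --lineStart;

    // Collect the run of spaces and tabs that opens that line.
    size_t end = lineStart;
    while (end < delimPos && (source[end] == ' ' || source[end] == '\t'))
        ++end;
    string indent = source[lineStart .. end];

    // Reject ambiguous indentation that mixes spaces and tabs.
    bool hasSpace, hasTab;
    foreach (c; indent)
    {
        if (c == ' ') hasSpace = true;
        else hasTab = true;
    }
    if (hasSpace && hasTab)
        throw new Exception("ambiguous indentation: mixed spaces and tabs");
    return indent;
}

// The delimiter sits at index 15, after four spaces and "string s = ".
static assert(lineIndent("    string s = ``;", 15) == "    ");
```

A real compiler would report a diagnostic with a source location instead of throwing, but the decision logic would be the same.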

Removing all leading and trailing newlines (except the natural 
ending newline of the last line, and not deleting any *internal* 
newlines) would make it easier to keep large bodies of text 
readable in the source. Extra blank lines around the text can 
help in such cases, and the right amount may vary from case to 
case.

Alternatively, perhaps only the 1st or 2nd leading and/or 
trailing newlines could be removed, in order to enforce a 
standard amount of newlines for the included text body.

Another idea is that the literal and/or compile-time function 
could be parameterized so that whether or not to trim/strip the 
leading and/or trailing newlines (and/or other things) could be 
specified.
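
One way the parameterized version might look, as a sketch under assumptions: the name `unindentWith`, its parameters `maxLeading` and `maxTrailing`, and their defaults are all hypothetical. Capping the number of removed newlines covers both the "strip everything" and the "strip only the 1st or 2nd newline" variants described above. Only `std.string.outdent` is an existing Phobos function.

```d
import std.string : outdent;

// Hypothetical signature: parameter names and defaults are assumptions.
// `maxLeading` / `maxTrailing` cap how many newlines are removed at each end.
string unindentWith(string text,
                    size_t maxLeading = size_t.max,
                    size_t maxTrailing = size_t.max)
{
    // Remove the indentation shared by all non-blank lines.
    string s = text.outdent;
    for (size_t n = 0; n < maxLeading && s.length > 0 && s[0] == '\n'; ++n)
        s = s[1 .. $];
    for (size_t n = 0; n < maxTrailing && s.length > 0 && s[$ - 1] == '\n'; ++n)
        s = s[0 .. $ - 1];
    return s;
}

// Strip all leading and trailing newlines (the default).
static assert("\n\n  a\n  b\n\n".unindentWith == "a\nb");
// Strip at most one newline from each end.
static assert("\n\n  a\n  b\n\n".unindentWith(1, 1) == "\na\nb\n");
// Remove indentation only and leave every newline in place.
static assert("\n\n  a\n  b\n\n".unindentWith(0, 0) == "\n\na\nb\n\n");
```

Template parameters or a flags enum would work just as well; plain function arguments are used here only because they keep the sketch short.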

**A few use case examples:**

- Using D as an ad-hoc text templating system for markup 
languages such as HTML and such, without the resulting inline 
text in the generating D code looking ugly.
- Code generation for other programming languages (similar to the 
above item) and any related compiled and interpreted uses where D 
acts as a generator, keeping the text cleaner.
- Handling moderate to large bodies of text, such as can be found 
in many terminal-based programs and/or hobbyist video games (e.g. 
roguelikes) and many other general application contexts in a form 
that is clean enough that it is no longer so often necessary to 
maintain separate text files that have to be loaded as files.
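
To make the templating use case concrete, here is a small sketch that approximates it with what Phobos offers today (`std.string.outdent`, `std.string.strip`, and `std.format.format`). The function `renderPage` and its markup are purely illustrative, and the sketch does no HTML escaping.

```d
import std.format : format;
import std.string : outdent, strip;

// Hypothetical page generator; the name and markup are illustrative only.
// Note: inputs are inserted verbatim, with no HTML escaping.
string renderPage(string title, string[] items)
{
    string list;
    foreach (item; items)
        list ~= format("  <li>%s</li>\n", item);

    // The template is indented to match the surrounding code.
    // `outdent` removes that shared indentation and `strip` trims
    // the whitespace at both ends before the values are filled in.
    return format(`
        <h1>%s</h1>
        <ul>
        %s</ul>
        `.outdent.strip, title, list);
}
```

The generated markup starts at column zero no matter how deeply the template is nested in the D source, which is the effect an indentation-aware literal would provide without the extra calls.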

This could be very useful I think, even though it seems so simple.

Seemingly trivial workflow factors like this can have a much 
bigger effect on what one uses something for than one may expect.

**My own current use case context:**

For example, I myself am planning on eventually converting my 
personal website (which is 100% static and currently uses only 
straight HTML & CSS to avoid computational waste and arbitrary 
formatting restrictions) to be generated from D source files 
instead of working directly with HTML files and all their myriad 
limitations and oddities.

I searched for "static site generators" and "web templating 
systems" for that use case but was very put off by the fact that 
they were nearly always over-engineered, riddled with 
dependencies, vendor-locked, and/or made lots of rigid 
assumptions about the format and contents of the pages and 
directories they generate. In contrast, a D-based system for 
generating a static web page would be far cleaner since it would 
allow completely arbitrary computational generality instead of 
falling victim to the "inner platform effect" anti-pattern and 
such.

That (using D to generate the site and any arbitrary other files 
I want) is what I plan to do regardless of whether this feature 
makes it in, but I was reminded of my long-time desired feature 
of indentation-aware strings in a programming language when I 
realized that the only real shortcoming (in my mind, for what I 
want) for the generation of my site from D code is that 
multi-line strings would look ugly. I will work around that in the 
meantime when the time comes.

It's true of course that I could import and/or load strings from 
files separately, but that is often (for many cases at least) not 
as pleasant as having the strings and their usage context 
*directly available in the code right alongside their context*, 
which would be made much cleaner by having indentation-aware 
multi-line strings of any possible length.

I suspect many other people would get good use out of it too.

Since I've never seen the feature in any other language despite 
it being so obviously useful, this would also be a good 
opportunity for D to claim first (or at least early or uncommon) 
dibs on the feature!

Indentation-aware multi-line strings would also be a very natural 
fit for D's already strong support for cleanly allowing for 
arbitrary nesting of structures, such as its ability to put an 
`import` statement nested at any point inside functions or code 
blocks. Thus, the idea is also very naturally "D-like" in that 
respect I think.

It would be really useful to have that built into the language!

What do you guys and gals think?


More information about the dip.ideas mailing list