Interpolated strings

H. S. Teoh via Digitalmars-d digitalmars-d at puremagic.com
Thu Apr 20 13:43:35 PDT 2017


On Thu, Apr 20, 2017 at 03:32:18PM -0400, Nick Sabalausky (Abscissa) via Digitalmars-d wrote:
[...]
> IMO, the only time a format string should be used instead of
> std.conv.text() or interpolated strings is when:
> 
> 1. You're just rendering *one* value at a time with non-standard
> formatting options (ie, left-/right-justified, leading/trailing
> zeroes, etc). (Speaking of which, `interp` could really use some
> formatting features so this could be avoided, and for performance
> reasons.)
> 
> 2. You need to support custom formatting specified at runtime (ex:
> software that supports displaying date/time in custom user-defined
> formats) but want to be lazy about it and not bother finding/writing a
> more user-friendly formatting syntax than printf-style (ie, extremely
> rare).

Hmm. I wonder if this is a matter of habituation and the kind of use
cases you most commonly encounter. Having programmed in Perl extensively
as well as in C/C++/D, I've dealt with both kinds of syntaxes, and I
find that each has its own niche where it does best, while for use cases
outside that niche it still works but not as well as the other syntax.

For example, if you are printing lots and lots of text with only the
occasional variable, the interpolated syntax is far more readable, e.g.:

	#!/usr/bin/perl
	print <<END;
	Dear $title $name,

	This is a spam email sent by $companyName corporation on behalf
	of $evilAdvertisingCompany to solicit for a donation of
	$dollarAmount to the lobbying against anti-spam bills proposed
	by the government of $country.

	Yours truly,
	$spammerName
	END

That is far more readable (and maintainable!) than:

	writefln(q"END
	Dear %s %s,

	This is a spam email sent by %s corporation on behalf
	of %s to solicit for a donation of
	$%d to the lobbying against anti-spam bills proposed
	by the government of %s.

	Yours truly,
	%s
	END", title, name, companyName, evilAdvertisingCompany,
	dollarAmount, country, spammerName);

Much of this is due to the out-of-band issue you mentioned. Somebody
could easily write the arguments in the wrong order, or substitute a
variable with another one not meant to be formatted, and it would be
difficult to notice the mistake just by looking at the code.
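
To make the ordering hazard concrete, here is a minimal sketch (the
values are made up for illustration). Both calls compile and run,
because %s accepts any argument type:

```d
import std.format : format;

void main()
{
    string title = "Dr.";
    string name = "Smith";

    // Arguments in the intended order.
    string intended = format("Dear %s %s,", title, name);

    // Arguments swapped by mistake: no compile-time or run-time error,
    // since nothing ties each %s to a particular variable.
    string swapped = format("Dear %s %s,", name, title);

    assert(intended == "Dear Dr. Smith,");
    assert(swapped == "Dear Smith Dr.,");
}
```

With seven arguments instead of two, as in the example above, a swap
like this is that much harder to spot in review.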

But if you're printing lots of variables according to a precise template
(e.g., rows of a table or a list of fields), format strings make more
sense, e.g.:

	foreach (rec; records) {
		writefln("[%8d] %20s  %10.3f", rec.id, rec.name, rec.amount);
		writefln("      %20s  %10s", rec.altName, rec.comment);
		writefln("      %20s  %6s", rec.address, rec.postalCode);
	}

The advantage here is that you separate formatting from content,
ostensibly a good thing depending on which circles you hang out in.

And you can't beat this one with interpolated strings:

	auto matrix = [
		[ 1, 2, 3 ],
		[ 4, 5, 6 ],
		[ 7, 8, 9 ]
	];
	writefln("Matrix:\n%([ %(%3d, %) ]%|\n%)", matrix);

Output:

	Matrix:
	[   1,   2,   3 ]
	[   4,   5,   6 ]
	[   7,   8,   9 ]
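
For anyone not familiar with the compound specifiers used above, here
is a smaller sketch of how the pieces fit together, using
std.format.format so the results can be checked with asserts (the
sample arrays are made up for illustration):

```d
import std.format : format;

void main()
{
    int[] row = [1, 2, 3];

    // %( ... %) applies the enclosed spec to every element of a range.
    // Text after the last spec inside the group acts as a separator,
    // so it is not emitted after the final element.
    assert(format("%(%d, %)", row) == "1, 2, 3");

    // %| marks explicitly where the separator starts; everything
    // before it is emitted for every element, including the last.
    assert(format("%(<%d>%|, %)", row) == "<1>, <2>, <3>");

    // Groups nest: the outer group walks the rows, the inner group
    // walks the elements of each row.
    int[][] small = [[1, 2], [3, 4]];
    assert(format("%([ %(%3d, %) ]%|\n%)", small)
        == "[   1,   2 ]\n[   3,   4 ]");
}
```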

If you're doing internationalization, though, neither option is a good
one (I gave an example using dates in another post): printf-style
formats have ordering issues (is it year first, then month, then day? Or
month first, then day, then year? Which argument is which?), and
interpolated strings expose variable names to the translators (who are
probably non-coders), which could open the door to arbitrary code
execution via l10n strings. In this case, it would seem best to have
named arguments with format strings.
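
As a rough sketch of what named arguments with format strings could
look like: formatNamed below is a hypothetical helper, not something
in Phobos, and the {name} placeholder syntax is just an assumption
made for the example. The template can only refer to the keys the
caller passes in, so a translator can reorder fields but cannot reach
arbitrary variables or code:

```d
import std.array : appender;
import std.string : indexOf;

/// Hypothetical helper (not part of Phobos): replaces each "{key}" in
/// tmpl with args[key]. Unknown keys are left in the output untouched.
string formatNamed(string tmpl, string[string] args)
{
    auto result = appender!string();
    size_t pos = 0;
    while (pos < tmpl.length)
    {
        auto open = tmpl.indexOf('{', pos);
        if (open < 0) break;
        auto close = tmpl.indexOf('}', open);
        if (close < 0) break;

        result.put(tmpl[pos .. open]);          // literal text before the placeholder
        auto key = tmpl[open + 1 .. close];
        if (auto value = key in args)
            result.put(*value);                 // known name: substitute it
        else
            result.put(tmpl[open .. close + 1]); // unknown name: keep as written
        pos = close + 1;
    }
    result.put(tmpl[pos .. $]);                 // remaining literal text
    return result.data;
}

void main()
{
    auto date = ["year": "2017", "month": "04", "day": "20"];

    // Each locale's template picks its own field order.
    assert(formatNamed("{month}/{day}/{year}", date) == "04/20/2017");
    assert(formatNamed("{year}-{month}-{day}", date) == "2017-04-20");

    // A name the caller did not supply is not evaluated, just kept.
    assert(formatNamed("{oops} {day}", date) == "{oops} 20");
}
```

This sketch only handles string values and no per-field formatting
options; a real version would presumably need both.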

Between these textbook cases, though, are plenty of gray areas where
it's debatable whether one syntax is clearly superior to the other(s).
And here, factors such as what you're used to, the kind of output you
usually need to produce, etc., all come into play, and there doesn't
seem to be a clear one-size-fits-all answer.


T

-- 
Why ask rhetorical questions? -- JC

