DIP 1026---Deprecate Context-Sensitive String Literals---Community Review Round 1

Tue Dec 3 18:34:22 UTC 2019

On Tue, Dec 03, 2019 at 07:38:29AM -0500, Andrei Alexandrescu via Digitalmars-d wrote:
> On 12/3/19 4:03 AM, Mike Parker wrote:
> > This is the feedback thread for the first round of Community Review
> > for DIP 1026, "Deprecate Context-Sensitive String Literals":
> > 
> > https://github.com/dlang/DIPs/blob/a7199bcec2ca39b74739b165fc7b97afff9e29d1/DIPs/DIP1026.md
> 
> This DIP is a non-starter. Here documents are easily and effectively
> handled during lexing and have no impact on the language grammar.
[...]

When I read the title "context-sensitive string literals" I was
wondering what part of D actually has strings whose interpretation
changes depending on context.  I was shocked to discover that it was
referring to heredoc strings.

Please don't get rid of heredoc strings. I use them quite a bit, because
I work a lot with code generators. They are a refreshing change from
C/C++ where trying to quote a piece of code as a string requires Leaning
Toothpick Syndrome (i.e., \'s all over the place to escape quoted string
metacharacters). I do *not* want to return to that nastiness, thank you
very much.
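
To make the contrast concrete, here is a minimal sketch of the same
generated line written both ways. The snippet being quoted and the names
`escaped` and `heredoc` are made up for illustration; they do not come
from the DIP or from any real project.

```d
import std.stdio : write;

// Hypothetical piece of generated code, quoted as an ordinary
// double-quoted literal: every embedded quote and backslash needs
// its own escaping backslash.
enum escaped = "writeln(\"path: C:\\\\temp\\\\out.txt\");\n";

// The same text as a heredoc: the body is taken verbatim, up to the
// line that begins with the closing identifier followed by a quote.
enum heredoc = q"EOS
writeln("path: C:\\temp\\out.txt");
EOS";

// Both literals denote the same string, including the final newline.
static assert(escaped == heredoc);

void main()
{
    write(heredoc);
}
```

The heredoc body can be pasted from (or into) real source code without
touching a single character, which is the whole point when writing code
generators.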

As Andrei said, heredoc strings are trivial to parse because they are
essentially a single big token.  This should not pose any problem for
the parser at all.  The argument in the DIP is flawed because, at the
level of a lexer/parser, a heredoc string is no different from a
delimited string: it starts with a sequence of one or more characters
(the opening delimiter), continues for an arbitrary number of characters
(the string content), and ends with another sequence of one or more
characters (the closing delimiter).  Nothing stops someone from writing a
50,000-character double-quoted string, for example, and the lexer/parser
will handle it just fine.  So why the hate against heredoc strings?
Arguably, heredoc strings are exactly what *solves* the problem of
50,000-character strings being essentially unreadable to a human reader
because of poor formatting.
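
As a rough illustration of the "opening delimiter, content, closing
delimiter" point, here is a simplified sketch of how a scanner could
find the end of either kind of literal. This is not dmd's lexer:
`tokenLength` is a hypothetical helper, it assumes plain `\n` line
endings, it does not check that the heredoc delimiter is a valid
identifier, and it ignores the other `q"` forms such as `q"(...)"`.

```d
import std.string : indexOf;

/// Length of the string-literal token at the start of `src`,
/// or -1 if the literal is unterminated or not recognized.
/// Hypothetical helper, for illustration only.
ptrdiff_t tokenLength(string src)
{
    if (src.length >= 2 && src[0 .. 2] == `q"`)
    {
        // Opening delimiter: q" + identifier + newline.
        auto nl = src.indexOf('\n');
        if (nl < 0) return -1;
        string ident = src[2 .. nl];

        // Closing delimiter: newline + identifier + ".
        // Searching from the first newline lets an empty body match.
        string closer = "\n" ~ ident ~ `"`;
        auto end = src.indexOf(closer, nl);
        if (end < 0) return -1;
        return end + cast(ptrdiff_t) closer.length;
    }

    if (src.length >= 1 && src[0] == '"')
    {
        // Opening and closing delimiter are both a single ".
        for (size_t i = 1; i < src.length; ++i)
        {
            if (src[i] == '\\') { ++i; continue; } // skip the escaped char
            if (src[i] == '"') return cast(ptrdiff_t)(i + 1);
        }
    }
    return -1;
}

void main()
{
    // "a \" b" is 8 characters long.
    assert(tokenLength("\"a \\\" b\" rest") == 8);
    // q"EOS, newline, hello, newline, EOS" is 16 characters long.
    assert(tokenLength("q\"EOS\nhello\nEOS\" rest") == 16);
    // No closing delimiter: unterminated.
    assert(tokenLength("q\"EOS\nhello\n") == -1);
}
```

In both branches the scanner only ever looks for a closing character
sequence; the one difference is that the heredoc's closing sequence is
read from the opening delimiter instead of being fixed in advance.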

As for poor syntax highlighting as mentioned in the DIP, how is that
even a problem with the language?! It's a strawman argument based on
skewed data obtained from badly-written lexers that don't actually lex D
code correctly. It is the syntax highlighter that should be fixed,
rather than deprecating a genuinely useful feature of the language.

Not to mention, the long list of projects at the end that will need to
be updated, which includes dmd itself BTW, looks like strong evidence of
good use of such string literals, rather than marginal use that might be
construed to be a reason for deprecation.

And most importantly of all: string literals are *single tokens* in the
language. They are lexical units, and therefore have nothing whatsoever
to do with the grammar being context-free or not.  We're shooting at the
wrong target here.

T

-- 
Famous last words: I wonder what will happen if I do *this*...