No household is perfect
H. S. Teoh
hsteoh at quickfur.ath.cx
Tue Dec 3 14:26:53 PST 2013
On Tue, Dec 03, 2013 at 09:19:34PM +0100, Brad Anderson wrote:
> On Tuesday, 3 December 2013 at 20:06:49 UTC, Walter Bright wrote:
> >On 12/3/2013 4:41 AM, Russel Winder wrote:
> >>Yes.
> >>
> >> a + b
> >>
> >>could be set union, logic and, string concatenation. The + is just a
> >>message to the LHS object, it determines what to do. This is the
> >>whole basis for DSLs.
Ugh. Ugh, ugh, ugh. This harks back to that horrid decision in C++'s
<iostream> of overloading << to mean "output" and >> to mean "input".
The only redeeming quality about this is that << and >> are relatively
rarely used in their original sense (bitwise shifts), so it doesn't
cause as much cognitive dissonance as it otherwise might. But
still. Ugh. There are just so many things wrong with this choice, not
the least of which is the fact that the operator precedence of << and >>
makes no sense when used as I/O operators -- because said operators were
never intended to be I/O in the first place!! This leads to such fun as:
    int a, b;
    cout << a < b;  // what does this do?
                    // (hint: it does NOT output the value of a < b)
Ugh!
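To make the precedence trap concrete, here is a small sketch of my own.
The helper names showLess and writeChoice are made up for illustration.
Since << binds tighter than <, the statement above parses as
(cout << a) < b, which, as far as I know, a C++11 or later standard
library normally refuses to compile at all. The conditional operator has
the same problem, except that it compiles quietly:

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: formats the result of a comparison.
// The parentheses are mandatory, because "out << a < b" would parse as
// "(out << a) < b".
std::string showLess(int a, int b) {
    std::ostringstream out;
    out << (a < b);
    return out.str();
}

// Hypothetical helper showing the silent variant of the same trap.
// This parses as "(out << flag) ? 1 : 2": the bool itself is written to
// the stream, and the stream's state (not flag) selects the return value.
int writeChoice(std::ostream& out, bool flag) {
    return out << flag ? 1 : 2;
}
```

So writeChoice(out, false) writes the flag itself to the stream and, as
long as the stream is in a good state, returns 1 rather than 2.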
> >Using operator overloading to create a DSL is just wrong. Part of
> >the design of operator overloading in D is to deliberately
> >frustrate such attempts.
+1.
> >+ should mean addition, not union, concatenation, etc. Overloading
> >is there to support addition on user defined types, not to invent
> >new meanings for it.
There's a C++ library that overloads the *comma operator* (!!) to allow
you to do things like this:
    // Creates a 3x4 matrix (!)
    A = 1, 2, 3, 4,
        5, 6, 7, 8,
        9, 10, 11, 12;
Now, this particular example looks rather cute, but let's say we want to
compute matrix elements as we construct it:
    // Creates a 3x4 matrix (what, really?!)
    A = x++, y++, z++, f(x+y),
        y+2*x-z, 4*y, 5*(z-y*x),
        f(x)-f(y), f(z), g(x), 0;
Seriously?? Anyone who understands what a comma operator is (which is
itself already a Bad Idea) might imagine this is but a needlessly
obscure way of setting A to the old value of x (assignment binds tighter
than the comma) while performing a whole bunch of side-effects, in a way
fitting for an IOCCC entry.
(And just in case you wonder: the dimensions of the matrix are
determined beforehand. So technically, you *could* create a 3x4 matrix
using this code:
    // Yes this is still a 3x4 matrix... and yes the first row
    // contains 1, 2, 3, 4, and the second row starts with 5.
    // Obvious, isn't it?
    A = 1, 2, 3,
        4, 5, 6,
        7, 8, 9,
        10, 11, 12;
Or, indeed, this:
    // This is a 3x4 matrix too, even though it sure doesn't look
    // anything like it!!
    A = 1, 2, 3, 4, 5, 6,
        7, 8, 9, 10, 11, 12;
Please, somebody tell me how this can even remotely be construed to be a
good thing.)
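For the curious, this is roughly how such a trick can be put together.
It is a minimal sketch of my own, not the actual library's code: Matrix,
Filler and every member below are hypothetical names. The idea is that
"A = 1" returns a helper object, and an overloaded comma on that helper
appends each following value in row-major order:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical matrix whose shape is fixed at construction time.
class Matrix {
public:
    // Helper returned by operator=; each overloaded comma stores one
    // more element at the next row-major position.
    class Filler {
    public:
        Filler(Matrix& m, std::size_t next) : m_(m), next_(next) {}

        Filler& operator,(double value) {
            // Extra values past the end are silently ignored here.
            if (next_ < m_.data_.size()) m_.data_[next_++] = value;
            return *this;
        }

    private:
        Matrix& m_;
        std::size_t next_;
    };

    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // "A = 1" stores the first element and starts the comma chain.
    Filler operator=(double first) {
        if (!data_.empty()) data_[0] = first;
        return Filler(*this, 1);
    }

    double operator()(std::size_t r, std::size_t c) const {
        return data_[r * cols_ + c];
    }

private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};
```

Note that the line breaks in the initializer play no part at all: with
Matrix A(3, 4), all three layouts above store 5 at row 1, column 0,
because only the shape given to the constructor decides where a row
ends.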
Not to mention, the meaning of such code depends entirely upon the type of
A. What if I have another custom type that also overloads the comma
operator, in a slightly different way? Then the semantics of the above
snippets would be *completely* different yet again.
Now tell me again, why is C++ code so hard to maintain? Hmmm...
> >Embedded DSLs should be visually distinct, and D provides the
> >ability for that with string mixins and CTFE.
String mixins + CTFE = teh r0ckz when it comes to DSLs.
After having experienced C++ for a decade or two, I've come to conclude
that operator overloading is a Bad Idea(tm), except when it applies
strictly to custom numerical types that are intended to behave like
built-in numerical types. All other uses of operator overloading are,
strictly speaking, abusive, and lead to unmaintainable code. Yes, it's
cute and clever, and lets you write things not supported by the language
"directly", but the next person to inherit your code will curse your
name when they spend 5 hours trying to figure out exactly why x+y didn't
do what they thought it did. And that's just with *one* library that
overloads operators in an unusual way. Now add a second, third, fourth
library, each of which overloads the operators in an unusual way, and
you might as well be submitting your code as IOCCC entries (except that
they don't take C++ entries).
OTOH, I completely understand the desire for infix notation for
operators on custom types. If you're writing a set library, it sucks to
have to write a.union(b.intersection(c)) when what you *really* want is
to write: a ∪ (b ∩ c). Here is where D does it right: use a compile-time
string argument to a CTFE function that transforms this string into
code. Then you can write:
    Set a, b, c;
    auto d = mixin(SetExpr!"a ∪ (b ∩ c)");

    // The above line gets turned into:
    //     auto d = a.union(b.intersection(c));
    // at compile-time.
So you can write your set expressions the "natural" way, *and* a new
reader of your code will know to look for SetExpr's documentation to
understand what the string argument does (not to mention it being amply
clear that a DSL is involved here, rather than code that looks like
normal numerical expressions but actually does something else).
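To give an idea of what the translation step inside something like
SetExpr has to do, here is a rough sketch of just the string-to-code
logic. Big caveats: it is written as an ordinary runtime C++ function,
to stay in the same language as the other snippets here, whereas in D
the same logic would run at compile time through CTFE. The names
SetExprTranslator and translateSetExpr are made up, the input is assumed
to be UTF-8, and I am assuming that ∩ binds tighter than ∪:

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical recursive-descent translator for the set DSL.
class SetExprTranslator {
public:
    explicit SetExprTranslator(std::string src)
        : src_(std::move(src)), pos_(0) {}

    std::string translate() {
        std::string code = parseUnion();
        skipSpaces();
        if (pos_ != src_.size())
            throw std::runtime_error("unexpected trailing input");
        return code;
    }

private:
    // expr := term ('∪' term)*
    std::string parseUnion() {
        std::string lhs = parseIntersection();
        while (accept("\xE2\x88\xAA")) {  // UTF-8 bytes of U+222A
            std::string rhs = parseIntersection();
            lhs += ".union(" + rhs + ")";
        }
        return lhs;
    }

    // term := operand ('∩' operand)*
    std::string parseIntersection() {
        std::string lhs = parseOperand();
        while (accept("\xE2\x88\xA9")) {  // UTF-8 bytes of U+2229
            std::string rhs = parseOperand();
            lhs += ".intersection(" + rhs + ")";
        }
        return lhs;
    }

    // operand := identifier | '(' expr ')'
    std::string parseOperand() {
        if (accept("(")) {
            std::string inner = parseUnion();
            if (!accept(")")) throw std::runtime_error("missing ')'");
            return inner;
        }
        std::size_t start = pos_;  // accept() already skipped spaces
        while (pos_ < src_.size() && isIdentChar(src_[pos_])) ++pos_;
        if (pos_ == start) throw std::runtime_error("expected a set name");
        return src_.substr(start, pos_ - start);
    }

    // Consumes the token if it comes next; returns whether it did.
    bool accept(const std::string& token) {
        skipSpaces();
        if (src_.compare(pos_, token.size(), token) != 0) return false;
        pos_ += token.size();
        return true;
    }

    void skipSpaces() {
        while (pos_ < src_.size() && src_[pos_] == ' ') ++pos_;
    }

    static bool isIdentChar(char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
    }

    std::string src_;
    std::size_t pos_;
};

std::string translateSetExpr(const std::string& text) {
    return SetExprTranslator(text).translate();
}
```

Feeding it the string from the example above yields the text
a.union(b.intersection(c)). The point is that the precedence and the
operator symbols are whatever the translator says they are, not whatever
the host language happens to offer.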
This has even more benefits than fixing C++'s wrong approach, though:
For one thing, overloaded operators can't easily generate optimal code,
because they just get translated into nested function calls. In order to
be able to optimize, say, a ∪ a ∪ a into a no-op, in C++'s approach
you'd have to resort to arcane black magic like expression templates to
coax the compiler to do what you want. In D, you are parsing the
expression as a *string*, which means you get to define how the string
is parsed, and how it is to be transformed into code, *directly*. You
can run the expression tree through an expression simplifier algorithm,
for example, factor common subexpressions, reduce it using known
identities, etc. All of which, granted, can be done by expression
templates, except with many more times the pain, proneness to bugs, and
unmaintainability.
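As a rough illustration of the kind of simplifier I mean, the sketch
below applies a single identity (x ∪ x = x, and likewise for ∩) to an
expression tree before emitting code. Everything here is hypothetical
(Expr, var, binary, sameTree, simplify, emit), and it is plain runtime
C++ standing in for what a D CTFE function would do at compile time:

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression tree for the set DSL.
struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct Expr {
    enum Kind { Var, Union, Intersection } kind;
    std::string name;  // used when kind == Var
    ExprPtr lhs, rhs;  // used by the two binary kinds
};

ExprPtr var(std::string name) {
    return std::make_shared<Expr>(
        Expr{Expr::Var, std::move(name), nullptr, nullptr});
}

ExprPtr binary(Expr::Kind kind, ExprPtr l, ExprPtr r) {
    return std::make_shared<Expr>(
        Expr{kind, std::string(), std::move(l), std::move(r)});
}

// Structural equality of two trees.
bool sameTree(const ExprPtr& a, const ExprPtr& b) {
    if (a->kind != b->kind) return false;
    if (a->kind == Expr::Var) return a->name == b->name;
    return sameTree(a->lhs, b->lhs) && sameTree(a->rhs, b->rhs);
}

// Bottom-up pass applying idempotence: x ∪ x = x and x ∩ x = x.
ExprPtr simplify(const ExprPtr& e) {
    if (e->kind == Expr::Var) return e;
    ExprPtr l = simplify(e->lhs);
    ExprPtr r = simplify(e->rhs);
    if (sameTree(l, r)) return l;
    return binary(e->kind, l, r);
}

// Emits the method-call form used in the examples above.
std::string emit(const ExprPtr& e) {
    if (e->kind == Expr::Var) return e->name;
    const char* call = (e->kind == Expr::Union) ? ".union(" : ".intersection(";
    return emit(e->lhs) + call + emit(e->rhs) + ")";
}
```

With this, the tree for a ∪ a ∪ a emits a.union(a).union(a) as is, but
just a once it has gone through simplify. A real simplifier would want
many more identities than this one.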
These string DSLs also let you define your own operators (like I did
above) without needing to abuse existing operators like + and *, define
your own operator precedence rules, define custom syntax without needing
to twist and warp it to conform to host language syntax (like that C++
regex library, which honestly makes me cringe every time I look at its
contorted syntax).
> >Part of my opinion for this comes from C++ regexes done using
> >expression templates. It's cute and clever, but it's madness. For
> >one, any sort of errors coming out of it if a mistake is made are
> >awesomely incomprehensible. For another, there's no clue in the
> >source code when one has slipped into DSL-land, and suddenly *
> >doesn't mean pointer dereference, it means "0 or more".
> >
> >Utter madness.
Yeah, that library, while admittedly very clever, is total madness. It
looks *nothing* like what regexen normally look like, does something
completely unlike what its surface syntax might suggest, and is in
pretty much every way very difficult to understand, and therefore hard
to maintain and prone to bugs. In today's software development world,
where there's too much code to comprehend and too little time to
comprehend it, dissociating syntax from its usual meaning is just asking
for maintenance nightmares.
> Indeed. I had a regex bottleneck in a C++ program so I figured I'd
> just convert it to Boost Xpressive as an easy solution. It took me
> half a day to convert the regular expression into the convoluted
> single line of code with dozens of operators it became. It did run
> faster (phew!) so it was worth it but the code is unrecognizable as
> a regular expression and I have to keep a comment with the original
> regular expression in the code because nobody (myself included)
> should have to spend an ungodly amount of time trying to decipher
> the cryptic source code it became.
>
> If my program were written in D I would have just replaced "regex("
> with "ctRegex!(" and moved on with my day.
Yeah!! Props to std.regex!
T
--
Why can't you just be a nonconformist like everyone else? -- YHL