Three Unlikely Successful Features of D

Tue Mar 20 21:47:00 PDT 2012

On Tue, Mar 20, 2012 at 09:59:27PM -0400, Nick Sabalausky wrote:
> "H. S. Teoh" <hsteoh at quickfur.ath.cx> wrote in message 
> news:mailman.933.1332286692.4860.digitalmars-d at puremagic.com...
> > On Tue, Mar 20, 2012 at 06:58:31PM -0400, Nick Sabalausky wrote:
[...]
> >> - Built-in associative arrays that support nearly any type as a key
> >
> > This is actually quite buggy right now... but that's merely an
> > implementation issue. :-)
> 
> Heck, I just love that I can use a string, or an int, or make my own
> struct work as a key, etc. Actually, over just the last 24 hours I've
> been making a lot of use of AAs with int keys. (AA's make it *so* much
> easier to avoid poor time complexity in a lot of things.)

Yeah, AA's with int keys are like arrays enhanced with O(1)
insertion/removal and sparse storage (if you have very large indices,
e.g.). :-) You can even have (pseudo) linear access if you iterate keys
from 0 to $.
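
A minimal sketch of the sparse-array idea, assuming nothing beyond the built-in AA and Phobos' `sort`. The helper name `sortedKeys` is made up for illustration. Note that the O(1) is average-case, and that AA iteration order is unspecified, so ordered access here goes through a sorted copy of the keys rather than probing every index:

```d
import std.algorithm : sort;
import std.stdio : writeln;

// Hypothetical helper: returns the keys of an int-keyed AA in ascending order.
int[] sortedKeys(string[int] aa)
{
    auto ks = aa.keys;   // fresh array holding a copy of the keys
    sort(ks);            // sorts that copy in place
    return ks;
}

void main()
{
    // Sparse "array": only the indices actually assigned take up storage.
    string[int] sparse;
    sparse[3] = "three";
    sparse[1_000_000] = "a million";
    sparse[42] = "forty-two";

    // Average-case O(1) membership test and removal.
    assert(3 in sparse);
    sparse.remove(3);
    assert(3 !in sparse);

    // Ordered traversal over whatever indices are present.
    foreach (k; sortedKeys(sparse))
        writeln(k, " => ", sparse[k]);
}
```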

> Many languages (like Haxe, for example) will have hashtables, and may
> even have them templated (or otherwise generic) on *value*, but the
> keys will be string-only. Which is still very useful, but it also
> misses out on many other use-cases.

Yep.
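
For the struct-as-key case, something along these lines should work. This is only a sketch: `Point` is a made-up type, and the hash function is deliberately naive. The one rule that matters is that toHash and opEquals agree, i.e. equal keys must produce equal hashes:

```d
struct Point
{
    int x, y;

    // Naive hash, for illustration only. Equal points hash alike.
    size_t toHash() const @safe pure nothrow
    {
        return cast(size_t) x * 31 + cast(size_t) y;
    }

    bool opEquals(const Point rhs) const @safe pure nothrow
    {
        return x == rhs.x && y == rhs.y;
    }
}

void main()
{
    string[Point] labels;
    labels[Point(0, 0)] = "origin";
    labels[Point(3, 4)] = "corner";

    // Lookup goes by the contents of the struct, not by its address.
    assert(labels[Point(3, 4)] == "corner");
    assert(Point(1, 1) !in labels);
}
```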

> > My new AA implementation, for example, already correctly supports
> > AA's with AA keys, which can be arbitrarily nested. So you could
> > have something like int[string[char[byte]]], and it does lookups
> > correctly based on the contents of the AA's you pass in as key.
> >
> 
> Crazy stuff :)
> 
> Actually I've been meaning to ask what the main benefits of your new
> AA implementation are. I know there's the benefit of just simply
> having it be implemented in the library. And you mention using AA's as
> AA keys here. Are there any other, umm, "key" points?

Haha, I love that pun.

I would say the main benefit is having it implemented in the library,
because that allows the implementation to have direct access to key/value
types. I didn't implement any clever new hashing algorithm at all, I
just mainly followed the implementation in aaA.d. But having direct
access to key/value types is a *huge* win.

For example, it lets you accept keys/values that are not strictly the
AA's key/value type, but can be implicitly converted to them. It lets
you return keys and values without needing the ugly typeinfo and void*
casts that are necessary in aaA.d.  This in turn lets you mark many AA
methods as pure, and almost all as @safe or @trusted. It lets you
cleanly interoperate with types that define opAssign (currently aaA.d
does a blind binary copy of data from key/value pointers, which leads to
potential bugs when the data has subobjects.)
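
To make the point concrete, here is a toy sketch of what "library AA with direct access to key/value types" buys you. This is NOT the actual new implementation, nor aaA.d. `TypedMap`, `put` and `find` are hypothetical names, and the fixed bucket count is only there to keep it short. The things to notice are the implicit conversions on the way in, the typed pointer on the way out, and the plain assignment (which would go through opAssign if V defines one) instead of a blind binary copy:

```d
import std.stdio : writeln;

// Toy chained hash map; a sketch only.
struct TypedMap(K, V)
{
    private static struct Entry { K key; V value; }
    private Entry[][16] buckets;   // fixed bucket count keeps the sketch short

    // Accepts anything implicitly convertible to K and V.
    void put(K2, V2)(K2 key, V2 value)
        if (is(K2 : K) && is(V2 : V))
    {
        K k = key;
        V v = value;
        auto b = &buckets[hashOf(k) % buckets.length];
        foreach (ref e; *b)
        {
            if (e.key == k)
            {
                e.value = v;   // ordinary typed assignment
                return;
            }
        }
        *b ~= Entry(k, v);
    }

    // Returns a typed pointer (null if absent); no void* or TypeInfo involved.
    V* find(K2)(K2 key)
        if (is(K2 : K))
    {
        K k = key;
        foreach (ref e; buckets[hashOf(k) % buckets.length])
        {
            if (e.key == k)
                return &e.value;
        }
        return null;
    }
}

void main()
{
    TypedMap!(long, double) m;
    m.put(1, 2);   // int key and int value, converted to long and double
    if (auto p = m.find(1))
        writeln(*p);
}
```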

It also makes it *much* easier to fix many existing AA bugs in the
bugtracker. So far, I have working unittests for the following issues:
3824, 3825, 4337, 4463, 5685, 6210, 7512, 7602, 7632, 7665,
7704. I haven't looked through all AA-related issues yet; this list may
very well grow. :-) To fix these in the current aaA.d implementation can
be rather tricky, and quite possibly requires compiler changes.

Better yet, I thought of a way of making AA's instantiable at
compile-time via CTFE and mixins: this will let you write AA literals
that can be evaluated at compile-time and have them turn into object
code directly without needing runtime initialization.

Another major benefit is that once the AA implementation is decoupled
from the compiler (the compiler will only provide syntactic sugar like
V[K] and literal syntax), we can finally fix AA bugs without getting
roadblocked by compiler limitations or hard-to-repair compiler design
flaws (or dirty hacks in the compiler that were introduced to paper over
the fundamental schizophrenic problem of struct AssociativeArray being
distinct yet not different from the AA implementation in aaA.d). Things
are MUCH easier to fix when you can directly access key/value types
instead of having only TypeInfo's and needing to resort to void*
casting.

[...]
> > C++ ctors are a royal pain in the neck. [...]
> > I ended up using just stub ctors for a lot of my code, and doing the
> > actual initialization after the object is constructed. Which is very
> > bad OO style, I agree, but the pain of working with C++ ctors just
> > pushes me in the wrong direction, y'know?
> >
> 
> Yea, that's what I've been planning on doing with the C++ stuff I have
> coming up. Don't even want to bother with C++'s ctor limitations. Just
> make an init() member and be done with it. Actually, that seems to be
> turning into more and more of a common C++ idiom though, from what
> (little) I've seen.

The fact that C++ is pushing people in that direction is scary. It
breaks one of the principles of good object design, which is that no
sequence of method calls should ever put an object into an inconsistent
state. This no longer holds if init() has to be called first. This is a
sign that something is fundamentally wrong with the language.
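
For contrast, a sketch of the ctor-establishes-the-invariant style in D. `Connection` and its members are invented for the example. The point is just that there's no window in which a half-initialized object exists, so no method ever has to ask "was init() called yet?":

```d
import std.exception : enforce;
import std.format : format;
import std.stdio : writeln;

// Hypothetical class: every instance is fully valid once the ctor returns.
class Connection
{
    private string host;
    private ushort port;

    this(string host, ushort port)
    {
        // Reject bad input here, so no method needs to re-check it later.
        enforce(host.length > 0, "host must not be empty");
        this.host = host;
        this.port = port;
    }

    string endpoint() const
    {
        return format("%s:%s", host, port);
    }
}

void main()
{
    auto c = new Connection("example.org", 80);
    writeln(c.endpoint());
}
```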

> >> [...] (And even finally: I heard somewhere C++ doesn't even have
> >> finally: Is that true?!?)
> >
> > Yes, it's true. I don't know about C++11, but certainly the previous
> > standard has no finally clause, leading to horribly unmaintainable
> > and ugly code like:
> >
> > Resource r = acquireResource();
> > try {
> > doSomethingDangerous();
> > } catch(...) {
> > r.release();
> > }
> > r.release();
> >
> 
> Haxe also lacks finally! Which I always found ridiculous. So yea, I'm 
> intimately familiar with that idiom. I've used it myself far more than I 
> would like.

And actually there's already a bug in the code I wrote: I forgot the
rethrow inside the catch(...). See, that's how bug-prone this stupid
idiom is.

> And even *that* still doesn't work if you don't catch *every*
> exception (and then rethrow the ones you don't care about? Ick!).

Actually, you can catch "..." and it will catch *everything*. And I
believe a single "throw;" will rethrow whatever it is you caught.

> I've seen C++ programmers swear off exceptions because of this, and I
> can't blame them at all.  Exception systems *need* a finally.

Yeah. "catch(...)" sorta works, but it's very ugly. And while being able
to throw *anything* at all is nice (I'm guilty of writing code that
throws char*, for example), not being able to make *any* assumptions at
all about what you caught (e.g., no common exception superclass with
some useful methods, like .msg) makes the catch-all, shall we say,
practically useless in a large enough project?

(Actually, the lack of .msg was what drove me to throw char*. Obviously
checking return codes for every lousy function I call is out of the
question, but so is throwing error codes that come from different
subsystems, since you've no way of telling which error code scheme to
use to look up the error. So I said to myself, why not throw a string
that actually tells you what the error is? Furthermore, if these strings
were predefined in char arrays that had unique pointer addresses, the
value of the pointer itself serves as a kind of "global error number".
So this worked as a kind of a poor man's error code + message exception
that can be freely thrown around without problems --- the reason I shied
away from throwing class objects in the first place was because early
implementations of C++ had problems that sometimes caused pointer bugs
and all kinds of nasty side effects when a class object is thrown. Like
if you forget to define a copy ctor for your exception class, and you
forget to catch by reference instead of by value... With a char*, there
can be no such problem unless the compiler is completely unusable.)
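
For comparison, a sketch of how the same concerns look on the D side, where everything you'd normally catch derives from Exception and carries a .msg, and where finally exists. `ParseError` and `attempt` are made-up names for the example:

```d
import std.stdio : writeln;

// Hypothetical subsystem-specific exception; it inherits .msg from Exception.
class ParseError : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Returns a small log of what happened, so the control flow is visible.
string attempt(bool fail)
{
    string log;
    try
    {
        if (fail)
            throw new ParseError("unexpected token");
        log ~= "ok;";
    }
    catch (Exception e)
    {
        // The common base class guarantees there's a message to report.
        log ~= "caught: " ~ e.msg ~ ";";
    }
    finally
    {
        // Runs on both paths, with no duplicated release code and no rethrow.
        log ~= "cleanup";
    }
    return log;
}

void main()
{
    writeln(attempt(false));
    writeln(attempt(true));
}
```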

[...]
> > For all the warts the current GC has, the fact that D has a GC at
> > all makes things like array slicing possible, and *fast*, which
> > leads to all the other niceties of slicing.
> >
> 
> I used to do indie game dev in C/C++ and I feel downright spoiled now
> with tossing in a "new" whenever appropriate and not have to worry
> about cleanup (and even that wouldn't be *too* bad...in certain
> cases...if there were at least scope guards).

Scope guards rule. Ironically, D's GC mostly alleviates the need for
scope guards. :-) They're still immensely useful when you acquire
resources that must be cleaned up no matter what happens later. D is the
first and only language I know that got resource cleanup done right.
Cleanups belong with the acquisition code, not dangling somewhere 200
lines down at the end of the scope, with who knows how many possible
leaks in between due to goto's, exceptions, returns, and who knows what!
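
A small sketch of what that looks like in practice. `runWithGuards` is a made-up function that just records the order in which things happen. The guards sit right next to the "acquire" step and run in reverse order of declaration when the scope is left:

```d
import std.stdio : writeln;

// Records the sequence of events so the guard ordering is visible.
string[] runWithGuards(bool fail)
{
    string[] events;
    try
    {
        events ~= "acquire";
        scope (exit) events ~= "release";      // always runs
        scope (failure) events ~= "rollback";  // only when an exception escapes
        scope (success) events ~= "commit";    // only on normal exit

        if (fail)
            throw new Exception("boom");
        events ~= "work done";
    }
    catch (Exception e)
    {
        events ~= "caught: " ~ e.msg;
    }
    return events;
}

void main()
{
    writeln(runWithGuards(false));
    writeln(runWithGuards(true));
}
```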

[...]
> > Definitely. Using alias and static if in a recursive template is one
> > of the hallmarks of the awesomeness of D templates.
> >
> 
> I'd say one of the hallmarks of D's metaprogramming is the enormous
> *decrease* in the need for recursive templates in the first place ;)

CTFE even makes it possible to express what many recursive templates
express, in pure imperative style. I mean, you can't get any better than
this:

	int factorial(int n) {
		int result = 1;
		while (n>1) {
			result *= n;
			n--;	// count down; result itself must not be decremented
		}
		return result;
	}
	enum x = factorial(12); // compile-time computation
	int y = factorial(12);	// runtime computation

In C++, you'd have to use recursive templates that are extremely
difficult to read and write beyond the simplest of functions.
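
Just to show the contrast within D itself, here is a sketch of the same computation in both styles. The `Factorial` template is written only for comparison, and the static asserts check that both are evaluated at compile-time and agree:

```d
import std.stdio : writeln;

// Template-recursion style, shown only for contrast.
template Factorial(int n)
{
    static if (n <= 1)
        enum Factorial = 1;
    else
        enum Factorial = n * Factorial!(n - 1);
}

// CTFE style: an ordinary function, usable at compile-time and at runtime.
int factorial(int n)
{
    int result = 1;
    while (n > 1)
    {
        result *= n;
        n--;
    }
    return result;
}

// Both are forced to compile-time evaluation here.
static assert(Factorial!12 == 479_001_600);
static assert(factorial(12) == Factorial!12);

void main()
{
    writeln(factorial(12));
}
```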

> With C++'s templates, it would appear that you have to use recursion
> and helper templates for damn near anything.
[...]

Not to mention the horrible, horrible, syntax that comes with recursive
templates. My previous manager used to tell me that as soon as he sees
nested templates deeper than 2 levels, his eyes start glazing over, and
it all becomes just arcane black magic.

T

-- 
Which is worse: ignorance or apathy? Who knows? Who cares? -- Erich Schubert