More D newb questions.

Me Here p9e883002 at sneakemail.com
Mon May 5 20:20:05 PDT 2008


Walter Bright wrote:

> Me Here wrote:
> > What else could 'a' ~ 'b' mean other than char[] tmp = "ab"?
> 
> Let's generalize the issue to be what does:
> 
> 	T ~ T
> 
> mean for any type T? Your argument is that the result should be T[]:
> 
> 	T ~ T => T[]
> 
> Ok. So imagine now that T is S[]:
> 
> 	S[] ~ S[] => S[][]
> 
> and we've done away with array concatenation. We could make a special case
> just for char values (and presumably all basic types), but then we'll find
> ourselves unable to write generic code.
> 
> 	T ~ T[] => T[]
> 	T[] ~ T => T[]
> 
> has no such ambiguity, and so is supported.

Sorry, but this is silly. 

When the compiler attempts to resolve T[] ~ T[] it succeeds because opCat is
defined for T[]. Likewise for T[] ~ T. For T ~ T[], opCat_r is defined for
T[], so it resolves.

All I'm suggesting is that if opCat is defined for T, then T ~ T can resolve
also.
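
That is already how it works for user-defined types; a minimal sketch
(D1-style operator overloading; the Char wrapper type is mine, purely for
illustration):

	struct Char {
	    char c;
	    // with opCat defined, Char ~ Char resolves; the result is char[]
	    char[] opCat( Char rhs ) {
	        return [ c, rhs.c ];
	    }
	}

	Char x = { 'a' };
	Char y = { 'b' };
	char[] s = x ~ y;	// "ab"

Which is exactly the resolution I'm asking for on the built-in char.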

That doesn't imply the current opCat defined for T[] would suddenly change to
T[] ~ T[] -> T[][]. Why would it? It's nonsensical, and it isn't going to
change unless someone changes it. What you are suggesting is that if I define
opMul for MyClass to do MyClass * MyClass -> MyClass[], then T * T would have
to render T[] for *all* types. But it doesn't work that way. int * int isn't
suddenly going to stop performing multiplication and start rendering an array
of ints. Nor is real * real, unless someone goes in and changes the definition
of opMul for those other types. Overloading an arithmetic operator to perform
non-arithmetic operations is generally frowned upon, but it is still quite
common with, for example, matrix manipulations. And doing so doesn't suddenly
change the way * works for any other type or class.

So why would defining opCat for the basic types suddenly cause (or require)
the opCats already defined for composites to behave differently?
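
To labour the point with a sketch (the Odd type is mine): an oddball opMul on
one type leaves * untouched everywhere else:

	struct Odd {
	    int v;
	    // deliberately odd overload: Odd * Odd -> Odd[]
	    Odd[] opMul( Odd rhs ) {
	        return [ rhs, rhs ];
	    }
	}

	int i = 2 * 3;	// still 6; int's * is unaffected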

Had someone pointed out earlier that my immediate problem could be solved by
using:

	table[ 
	    b1 << 14 | b2 << 12 | b3 << 10 | b4 <<  8 | 
	    b5 <<  6 | b6 <<  4 | b7 <<  2 | b8 
	][] = [ 
	    abcd[ b1 ], abcd[ b2 ], abcd[ b3 ], abcd[ b4 ], 
	    abcd[ b5 ], abcd[ b6 ], abcd[ b7 ], abcd[ b8 ]
	]; (**)

We probably wouldn't be having this conversation. But that doesn't persuade me
that catenation of basic types a) wouldn't answer people's expectations, or b)
would break generics. That said, I'm not at all sure that string handling
/should/ be dealt with using generic array-type manipulations. Insertions,
deletions and in-place mutations are very common operations on char[]s, but
far less so on int[], real[], creal[], etc. And for most object aggregations
for which insert/delete or slice operations would make sense, you'd probably
use linked lists. The overhead of (at least) a 32/64-bit pointer per char
would be a nonsense, so maybe strings should be a special case and not a
template-generated generalisation of the array type.

I feel about this the same way I feel about what I think I should be able to
do with lvalue slices: it's something that I will have to accept because
you're calling the shots, but your reasoning escapes me.

For unequal-sized or overlapping slice assignment, the detection code is a
little messy, and the implementation for the non-trivial case is, well,
non-trivial. But that's exactly why it should be written once, got right by
the expert(s), and just used by everyone else. Moving the code out of the core
into a library may seem to satisfy that requirement, but it still means the
Joe Mortal programmer is left having to write their own code to detect whether
they can use a slice assignment or must call std.string.replaceSlice()*.

The doc examples for opSliceAssign aren't marvellously clear, as they only
deal with the case of assigning a single value to a single-element slice. For
a multi-element slice assignment you need a source reference, pos1 and
length1, and a destination reference, pos2 and length2. Are the checks for the
simple case really so onerous?

	if( length1 == length2
	    && ( src !is dst                     // distinct arrays
	         || pos2 + length2 <= pos1       // or dst region ends before src
	         || pos1 + length1 <= pos2 ) ) { // or src region ends before dst
	    // do the simple case: a straight copy
	} else {

What does that equate to, maybe 8 or 10 opcodes? And half of those would be
register loads you'd have to execute anyway to set up the REP MOVS
instruction. The extra instructions would be a couple of ADDs, TESTs and Jxx
before falling through to the REP MOVS. Half a dozen cycles' penalty per
conforming slice assignment? There's an old saying about spoiling the ship
for a ha'porth of tar.

Again, I apologise. I didn't set out to critique this stuff (again), just to
solve my immediate problem--which I now have.
One good thing (for me) that's arisen from my attempting to understand your
and BCS' responses is that it has forced me to look much deeper into the
overloading facilities. And (at least in theory) I realise that with these,
plus a struct definition and a little in-line assembler, I should be able to
define my own 'string' type that works the way I want it to...

Cheers, b.


(*BTW. Why does char[] replaceSlice(char[] string, char[] slice, char[]
replacement) require both the target string and target slice of that string?)
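
My guess, not gospel: the slice pins down *where* in the string to replace, by
virtue of being a slice of it--its .ptr and .length stand in for a separate
pos/len pair. A sketch:

	import std.string;

	char[] s = "hello world".dup;
	char[] mid = s[ 6 .. 11 ];	// "world", a slice *of s*
	char[] t = replaceSlice( s, mid, "there" );
	// t == "hello there"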
(**Shame about the need to repeat the abcd[ ] bit so much. Perl's array slices
accept multiple individual indexes (as well as ranges), so the equivalent of
the above would be:

	@table[ ... ] = @abcd[ b1, b2, b3, b4, b5, b6, b7, b8 ];

which is a lot nicer to read.)
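
In D the repetition could be hidden behind a typesafe-variadic helper--a
sketch, with the name pick and the variable key (standing in for the shift
expression above) both mine:

	// gather arbitrary elements of src into a new array
	char[] pick( char[] src, int[] indexes ... ) {
	    char[] r = new char[ indexes.length ];
	    foreach( i, x; indexes )
	        r[ i ] = src[ x ];
	    return r;
	}

	table[ key ][] = pick( abcd, b1, b2, b3, b4, b5, b6, b7, b8 );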
