A Discussion of Tuple Syntax

Mon Aug 19 11:43:35 PDT 2013

On Monday, 19 August 2013 at 16:53:06 UTC, Wyatt wrote:
> To be clear, I'm not talking about braces, {}; I'm talking 
> about parentheses, ().  I read over that whole DIP32 thread a 
> couple times, and didn't see any rationale offered for why the 
> likely "cleanest" version "can't be used".  It wasn't even 
> brought up (unless I've missed something subtle).  In the 
> second thread, linked in the OP here, they were glossed over 
> again.  Now, I fully believe there's a very good reason that's 
> been written somewhere, but I _would_ like to know what that 
> is, preferably documented somewhere less ephemeral and 
> difficult to search than the newsgroup (such as in DIP32).  The 
> closest I've seen so far is the pull request where Walter and 
> Andrei expressed that it should be considered further.

I could very well be wrong, but I would bet that one of the
reasons is that (a, b, c) expressions already have well-defined
semantics in D (as does (2, "a", func())): they are comma
expressions, which evaluate each operand in order and yield the
last one. Example:

void main()
{
	import std.stdio;

	// Prints "a"
	writeln((true, false, "a"));
}

Making this a tuple literal would change the semantics of
existing code and break it, which I don't think would go over
well. Another example:

void main()
{
	int a, b;
	(a, b) = (3, 4); // comma expressions on both sides: this is just b = 4
	assert(a == 0 && b == 4);
}

Of course, for the second case, Kenji's proposed syntax used 
"auto (a, b) = ...", which would disambiguate it, but it could 
confuse people as to whether the first syntax is somehow related 
to the second.
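
For contrast, here is a rough sketch of how that assignment can be spelled today with the library type, assuming std.typecons.tuple from Phobos. The variable t is only there for illustration; this is not Kenji's proposed syntax.

```d
import std.typecons : tuple;

void main()
{
	int a, b;
	auto t = tuple(3, 4); // a library Tuple!(int, int), not a comma expression
	a = t[0];
	b = t[1];
	assert(a == 3 && b == 4);
}
```

It works, but it takes three statements to do what a destructuring syntax would do in one.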

> The octothorpe _is_ much better than the t simply in terms of 
> readability, though, even more than q{} or t{}, I have concerns 
> about its ability to be found with an ordinary search engine by 
> an ordinary user.  Have you tried looking for documentation on 
> weird operators with a search engine lately?  They don't 
> exactly take to it well. :/ (cf. Perl's <=>)

I'm not sure how much of a problem that would be. Besides the #!
script line, there's only one other syntactic form that uses # in
D (the #line directive), but you're right, it may be difficult to
search for "d programming #".

> Addressing the other suggestion I saw that cropped up, I 
> personally find the two-character "bananas" to be impressively 
> ugly.  I considered suggesting some permutation on that same 
> idea, but after toying with a few examples I find it ends up 
> looking awful and I think it's honestly annoying to type them 
> in any form.  I even don't like how the unicode version of that 
> one looks; for doubling up, I think ⟦ ⟧ or ⟪ ⟫ are easier on 
> the eyes.

My browser can't even display the second set of characters. D
seems to have generally shied away from using any Unicode
operators, and for good reason: who the hell has Σ on their
keyboard?

> I feel weird admitting this, but if we can't use some manner of 
> bare brace, I think I'd rather have tup(), tup[], tup{} (or 
> even tuple() et al) as a prefix over any single character.

It's not terrible, but it's rather wordy, especially if tuples 
begin to be used a lot in code.

> Can't make it a single underscore? Question mark works best 
> then, IMO.  It isn't as burdened with meanings elsewhere (sure 
> there's ternary and possibly-match in regex, but...have I 
> forgotten something?)

It *could* be an underscore; the only thing is that the 
underscore is a valid variable name, so the above expression 
would actually be binding two variables, which might surprise 
someone who was expecting otherwise. I don't really care all that 
much, but it's something to think about.
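
To illustrate the point, a minimal sketch showing that a lone underscore is an ordinary identifier in D as it stands:

```d
void main()
{
	int _ = 42;   // declares a variable whose name is "_"
	int copy = _; // ...and it can be read back like any other variable
	assert(copy == 42);
}
```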

> #(a, ...) looks to me like it would make a 2-tuple 
> containing a and a tuple of "everything else", because of the 
> ellipsis' use in templated code.  I think this is a little 
> unclear, so instead I'd prefer #(a, ? ...) (or whatever ends up 
> used for the discard character) to make it explicit.

To be clear, what I have in mind is that this would be "a, plus 
(none/one?) or more things that can either be elements or nested 
tuples". Then, in a construction such as #(head, rest...), rest 
would be exactly as you describe: a tuple consisting of 
everything after head. The semantics could get tricky; maybe this
needs more thought.
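
The closest thing in the language right now is a variadic template, so here's a rough sketch of the head/rest idea using that plus std.typecons.tuple. The function name headRest is made up for illustration, and packing the remainder into a nested Tuple is my assumption about how #(head, rest...) might behave, not part of any proposal.

```d
import std.typecons : tuple;

// Binds the first argument to head and everything after it to rest,
// then returns the pair (head, tuple-of-the-rest).
auto headRest(Head, Rest...)(Head head, Rest rest)
{
	return tuple(head, tuple(rest));
}

void main()
{
	auto t = tuple(1, "two", 3.0);
	auto r = headRest(t.expand); // expand passes the fields as separate arguments

	assert(r[0] == 1);
	assert(r[1] == tuple("two", 3.0));
}
```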

> As a bonus, explicit discard means a simple comma omission is 
> less likely to completely change the meaning of the statement.  
> Compare:
> #(a, b, ...)   //bind the first two elements, discard the rest.
> #(a, b ...)    //bind the first element to a and everything 
> else to b
> #(a, b, ? ...) //same as the first
> #(a, b ? ...)  //syntax error
>
> Granted, there's this case:
> #(a, ?, ...)
> ...but that seems like it would be less common just based on 
> how people conventionally order their data structures.

That's true. Something to think about. Maybe combine the question 
mark and ellipsis like so:

#(a, b, ?..)

> Thought: Is there sufficient worth in having different tokens 
> for discarding a single element vs. a range? e.g.
> #(a, ?, c, * ...) //bind first and third elements; discard the 
> rest
> // I'm not attached to the asterisk there.
> // +, #, or @ would also make some amount of sense to me.

Not sure. I need to think about it.

>> - Concatenating tuples with ~. This is nice to have, but not 
>> particularly important.
>>
> What does concatenating a tuple actually do?  That is:
> auto a = #(1,2) ~ 3; //Result: a == #(1,2,3), right?
> auto b = a ~ #(4,5); //Is  b == #(1,2,3,#(4,5)) or is b == 
> #(1,2,3,4,5)?

I think it should work the same as with arrays. So:

auto a = #(1, 2) ~ 3; //Error: 3 is not a tuple
auto a = #(1, 2) ~ #(3); //Result: #(1, 2, 3), just like an array

auto b = a ~ #(4, 5); //Result: #(1, 2, 3, 4, 5). Again, like arrays.

Keeping the same semantics as arrays would be the best way to do
it, since it nicely follows the principle of least astonishment.
If you wanted to explicitly append a tuple and have it nested,
you'd need to do:

auto b = a ~ #(#(4, 5));

Which is messy, but at least it's explicit about what is going on.
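
For comparison, a quick sketch of how ~ behaves on built-in arrays right now. One difference worth noting: arrays also accept a single element on the right-hand side, so matching array semantics exactly would make #(1, 2) ~ 3 legal rather than an error.

```d
void main()
{
	int[] a = [1, 2] ~ [3];   // array ~ array
	assert(a == [1, 2, 3]);

	int[] b = a ~ [4, 5];     // the right-hand side is flattened in
	assert(b == [1, 2, 3, 4, 5]);

	int[] c = [1, 2] ~ 3;     // array ~ element is accepted too
	assert(c == [1, 2, 3]);

	int[][] nested = [a] ~ [[4, 5]]; // nesting has to be spelled out
	assert(nested == [[1, 2, 3], [4, 5]]);
}
```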

> Great! After this, let's fix properties. ;)

Oh boy, no need to start *another* flame war.