D needs a type expression syntax
Quirin Schroll
qs.il.paperinik at gmail.com
Wed May 17 15:44:09 UTC 2023
*Note: The `opt`s made the grammar in quotations harder to read
than necessary and aren’t essential to the quote, so I removed
them.*
On Saturday, 13 May 2023 at 11:13:59 UTC, Nick Treleaven wrote:
>
> Just noticed we have this form for function literals:
> ```
> function RefOrAutoRef Type ParameterWithAttributes
> FunctionLiteralBody2
> ```
> https://dlang.org/spec/expression.html#function_literals
>
> But for function types we have just:
> ```
> TypeCtors BasicType function Parameters FunctionAttributes
> ```
> https://dlang.org/spec/type.html (inlining the *TypeSuffix*
> form)
>
> It seems we could add this form to *Type* for consistency with
> function literals:
> ```
> function RefOrAutoRef Type ParameterWithAttributes
> ```
Technically, that would work. I disagree with “for consistency,”
however. The current consistency is that the order of keyword and
type make it a type or a literal. Optional elements aside, a
function literal looks like this:
```d
function int(int x) @safe => x
```
and its type is:
```d
int function(int x) @safe
```
You see, if the keyword (`function` or `delegate`) is first and
the type is second, it’s a literal (an expression); if the type
is first and the keyword is second, it’s a type. It doesn’t take
long to understand this duality, even if it’s not pointed out
anywhere directly. I find this really beautiful; it’s an
elegant solution and easy to read. It is a misfortune of epic
proportions that C and C++ have a function pointer syntax (and,
for C++, a member function pointer syntax) that is much harder to
come up with _and_ a lot worse to read.
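To make the duality concrete, here is a minimal sketch that uses both forms side by side. The names `IntToInt` and `identity` are made up for illustration.

```d
// Type first, keyword second: this names a function pointer *type*.
alias IntToInt = int function(int x) @safe;

void main()
{
    // Keyword first, type second: this is a function *literal* (an expression).
    IntToInt identity = function int(int x) @safe => x;
    assert(identity(42) == 42);
}
```

The literal’s inferred attributes (`pure`, `nothrow`, `@nogc`) are simply dropped when it is converted to the less specific pointer type.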
The problem D has is that while
```d
function ref int(ref int x) @safe => x
```
is valid syntax for the literal, the corresponding
```d
ref int function(ref int x) @safe;
```
is not a valid type – except in an `alias` declaration, where it
is. This is the heart of the issue.
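As a sketch of the `alias` escape hatch, the following spells the by-reference-returning type once in an `alias` and uses that name afterwards. `RefGetter` and `counter` are hypothetical names, and a module-level variable is used so the literal has something to return by reference.

```d
int counter; // module-level variable the literal returns by reference

// Inside an alias declaration, the `ref`-returning type can be written out.
alias RefGetter = ref int function() @safe;

void main()
{
    // The literal syntax accepts `ref` directly.
    RefGetter get = function ref int() @safe => counter;

    get() = 5;            // assigns through the returned reference
    assert(counter == 5);
}
```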
If I understand you correctly, you’d also allow
```d
function int(int x) @safe
```
as a type, that is, when the function doesn’t return by
reference. (It would be really inconsistent if that wasn’t
allowed.)
The distinguishing factor then is what follows this whole long
sequence of tokens. If it’s a brace, `do` or `=>`, then it’s a
literal (an expression), otherwise it’s a type, meaning that
programmers (and likewise the parser) have to read it to the end
to know if this is a type or an expression.
> So this works:
> `void f(function ref int() g);`
But for the type of a function returning by value, you’d then
have two syntaxes, right?
```d
function int()
int function()
```
I don’t think this would be a great thing.
> […]
> That would be a smaller impact change than parenthesized types.
Depends on how you measure impact or what you consider small:
* It requires a larger grammar change.
* Thus, it probably requires more code in the parser.
* It solves a very specific problem and introduces niche syntax.
* It breaks with an existing principle.
Considering that `ref int function()` already kind of is a type,
namely in `alias` declarations, it doesn’t extend the trajectory
of the current syntax, but goes in an entirely different
direction.
The smaller-impact solution is to make `ref int function()` a
first-class type; practically, this isn’t enough because there
are token sequences that could in principle be read two ways:
`ref` being the storage class of a function pointer parameter
or indicating a function pointer parameter that returns by
reference. These ambiguities are generally disambiguated by “max
munch”. As in a lot of other cases, parentheses can disambiguate
in the alternative direction. To be syntactically able to do so,
types need primary expression syntax, which they even almost have.
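The two readings can be sketched in today’s D, using an `alias` for the second one since it cannot be written inline. All names here (`total`, `getTotal`, `RefReturning`, `resetPointer`, `writeThrough`) are invented for the example.

```d
int total; // module-level variable to hand out by reference

ref int getTotal() @safe { return total; }

// The by-reference-returning type, spelled via an alias.
alias RefReturning = ref int function() @safe;

// Here `ref` is read as the parameter's storage class: `g` is a by-reference
// parameter whose type is `int function()` (returning by value).
void resetPointer(ref int function() g) { g = null; }

// Going through the alias gives the other reading: `g` is a by-value
// parameter whose type returns by reference.
void writeThrough(RefReturning g) { g() = 1; }

void main()
{
    int function() fp = function int() => 0;
    resetPointer(fp);
    assert(fp is null);   // the caller's pointer itself was modified

    writeThrough(&getTotal);
    assert(total == 1);   // the referenced variable was modified
}
```

With parenthesized types, the second signature could presumably be written inline as something like `void writeThrough((ref int function() @safe) g)`; that spelling is the proposal under discussion, not current syntax.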
If syntax like `(const int)`, enabled by full-on type
expression syntax, is deemed a problem – which would be weird
because e.g. `inout(const int)` is already allowed –, to still
solve the `ref`-returning function problem, instead of
[`BasicType`](https://dlang.org/spec/type.html#BasicType) →
[`TypeCtor`](https://dlang.org/spec/type.html#TypeCtor)?
**`(`**[`Type`](https://dlang.org/spec/type.html#Type)**`)`**
we can still do
[`BasicType`](https://dlang.org/spec/type.html#BasicType) →
[`TypeCtor`](https://dlang.org/spec/type.html#TypeCtor)?
**`(``ref`**
[`TypeCtor`](https://dlang.org/spec/type.html#TypeCtor)?
[`BasicType`](https://dlang.org/spec/type.html#BasicType)
**`function`**
[`Parameters`](https://dlang.org/spec/function.html#Parameters)
[`FunctionAttributes`](https://dlang.org/spec/function.html#FunctionAttributes)**`)`**
[`BasicType`](https://dlang.org/spec/type.html#BasicType) →
[`TypeCtor`](https://dlang.org/spec/type.html#TypeCtor)?
**`(``ref`**
[`TypeCtor`](https://dlang.org/spec/type.html#TypeCtor)?
[`BasicType`](https://dlang.org/spec/type.html#BasicType)
**`delegate`**
[`Parameters`](https://dlang.org/spec/function.html#Parameters)
[`MemberFunctionAttributes`](https://dlang.org/spec/function.html#MemberFunctionAttributes)**`)`**
I really don’t like this because the `ref` and the parentheses
then require each other. As I stated earlier, it would be as if
`(A * B) + C` were invalid because the parentheses are redundant.
As an example of the opposite approach, [Pony
Lang](https://tutorial.ponylang.io/expressions/ops.html#precedence)
has no operator precedence whatsoever, i.e. it requires
`(A * B) + C`.