Non-nullable references, again

Fri Jan 2 05:00:38 PST 2009

Benji Smith wrote:
> Daniel Keep wrote:
>> Benji Smith wrote:
>>> Don wrote:
>>>> Denis Koroskin wrote:
>>>>> Foo nonNull = new Foo();
>>>>> Foo? possiblyNull = null;
>>>>
>>>> Wouldn't this cause ambiguity with the "?:" operator?
>>>
>>> At first, I thought you might be right, and that there would be some 
>>> ambiguity calling constructors of nullable classes (especially given 
>>> optional parentheses).
>>>
>>> But for the life of me, I couldn't come up with a truly ambiguous 
>>> example, that couldn't be resolved with an extra token or two of 
>>> lookahead.
>>>
>>> The '?' nullable-type operator is only used in type declarations, 
>>> not in expressions, and the '?:' operator always consumes a few 
>>> trailing expressions.
>>>
>>> Also (at least in C#) the null-coalesce operator (which converts 
>>> nullable objects to either a non-null instance or a default value) 
>>> looks like this:
>>>
>>>   MyClass? myNullableObj = getNullableFromSomewhere();
>>>   MyClass myNonNullObj = myNullableObj ?? DEFAULT_VALUE;
>>>
>>> Since the double-hook is a single token, it's also unambiguous to parse.
>>>
>>> --benji
>>
>> Disclaimer: I'm not an expert on compilers.  Plus, I just got up.  :P
>>
>> The key is that the parser has to know what "MyClass" means before it 
>> can figure out what the "?" is for; that's why it's context-dependent. 
>> D avoids this dependency between compilation stages, because it 
>> complicates the compiler.  When the parser sees "MyClass", it *doesn't 
>> know* that it's a type, so it can't distinguish between a nullable 
>> type and an invalid ?: expression.
>>
>> At least, I think that's how it works; someone feel free to correct me 
>> if it's not.  :P
>>
>>   -- Daniel
> 
> I could be wrong too. I've done a fair bit of this stuff, but I'm no 
> expert either :)
> 
> Nevertheless, I still don't think there's any ambiguity, as long as the 
> parser can perform syntactic lookahead predicates. The grammar would 
> look something like this:
> 
> DECLARATION :=
>   IDENTIFIER         // Type name
>   ( HOOK )?          // Is nullable?
>   IDENTIFIER         // Var name
>   (
>     SEMICOLON        // End of declaration
>     |
>     (
>       OP_ASSIGN      // Assignment operator
>       EXPRESSION     // Assigned value
>     )
>   )
> 
> Whereas the ternary expression grammar would look something like this:
> 
> TERNARY_EXPRESSION :=
>   IDENTIFIER         // Condition
>   HOOK               // Start of '?:' operator
>   EXPRESSION         // Value if true
>   COLON              // End of '?:' operator
>   EXPRESSION         // Value if false
> 
> The only potential ambiguity arises because the "value if true" 
> expression could also just be an identifier. But if the parser can 
> construct syntactic predicates to perform LL(k) lookahead with arbitrary 
> k, then it can just keep consuming tokens until it finds either a 
> SEMICOLON, an OP_ASSIGN, or a COLON (potentially, recursively, if it 
> encounters another identifier and hook within the expression).
> 
> Still, though, once it finds one of those tokens, the syntax has been 
> successfully disambiguated, without resorting to a semantic predicate.
> 
> It requires arbitrary lookahead, but it can be done within a 
> context-free grammar, and all within the syntax-processing portion of 
> the parser.
> 
> Of course, I could be completely wrong too :)
> 
> --benji

case a?.b:c:
   break;

is this

   case ((a?).b):
c:
   break;

or is it

case (a ? .b : c):
   break;
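
To make the problem concrete, here is a rough Python sketch of the
scan-ahead rule quoted above: after IDENTIFIER HOOK, keep consuming
tokens until a SEMICOLON, an OP_ASSIGN, or a COLON turns up. The
names tokenize and classify are made up for illustration, and the
paren-skipping is my own assumption; this is not how any real D
parser works.

```python
import re

def tokenize(text):
    # Identifiers, integers, the two-character tokens "??" and "==",
    # then any other single non-space character.
    return re.findall(r"[A-Za-z_]\w*|\d+|\?\?|==|\S", text)

def classify(text):
    """Guess whether 'IDENT ? ...' starts a declaration or a ternary.

    Hypothetical helper: it only looks at tokens, never at what the
    leading identifier actually names.
    """
    tokens = tokenize(text)
    if (len(tokens) < 2
            or not re.fullmatch(r"[A-Za-z_]\w*", tokens[0])
            or tokens[1] != "?"):
        return "other"
    depth = 0  # assumption: ignore anything inside parentheses
    for tok in tokens[2:]:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        elif depth == 0:
            if tok in (";", "="):
                return "declaration"
            if tok == ":":
                return "ternary"
    return "unknown"

print(classify("Foo? possiblyNull = null;"))  # declaration
print(classify("a ? b : c;"))                 # ternary
print(classify("a?.b:c:"))                    # ternary
```

For the case label the scan answers "ternary" simply because a COLON
is the first separator it meets. It has no way of knowing that the
same COLON could just as well be the one that ends the case label,
which is the other reading above.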