[dox] Fixing the lexical rule for BinaryInteger
H. S. Teoh
hsteoh at quickfur.ath.cx
Fri Aug 16 17:50:24 PDT 2013
On Sat, Aug 17, 2013 at 01:03:35AM +0200, Brian Schott wrote:
> On Friday, 16 August 2013 at 22:43:13 UTC, Andre Artus wrote:
[...]
> >2. Your BinaryInteger and HexadecimalInteger only allow for one of
> >the following (reduced) cases:
> >
> >0b1__ : works
> >0b_1_ : fails
> >0b__1 : fails
>
> It's my opinion that the compiler should reject all of these because
> I think of the underscore as a separator between digits, but I'm
> constantly fighting the "spec, dmd, and idiom all disagree" issue.
[...]
I remember reading this part of the spec on dlang.org, and I wonder if
it was worded the way it is just for simplicity, because to specify
something like "_ must appear between digits" involves some complicated
BNF rules, which maybe seems like overkill for a single literal.
But sometimes it is good to be precise, if we want to enforce "proper"
conventions for underscores:

    <binaryLiteral> ::= "0b" <binaryDigits> <underscoreBinaryDigits>

    <binaryDigits> ::= <binaryDigit> <binaryDigits>
                     | <binaryDigit>

    <underscoreBinaryDigits> ::= ""
                               | "_" <binaryDigits>
                               | "_" <binaryDigits> <underscoreBinaryDigits>

    <binaryDigit> ::= "0"
                    | "1"

This BNF spec forces "_" to appear only between two binary digits, and
never more than a single "_" in a row. You can also make your parser
pick up only <binaryDigit> when performing semantic analysis on binary
literals, so the other stuff is ignored and only serves to enforce syntax.
I'd be surprised if there's any D code out there that doesn't fit this
spec, to be honest.
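
As a rough sketch of the strict grammar (in Python rather than D, purely
for illustration), the rules collapse into one regular expression: "0b", a
run of digits, then any number of groups made of a single "_" followed by
digits. The names STRICT_BINARY and parse_strict_binary are hypothetical,
not taken from dmd or any existing lexer, and the sketch assumes the
literal has already been isolated as a whole token.

```python
import re

# Hypothetical pattern mirroring the strict grammar:
# "0b" <binaryDigits> followed by zero or more ("_" <binaryDigits>) groups.
STRICT_BINARY = re.compile(r"0b[01]+(?:_[01]+)*")


def parse_strict_binary(text):
    """Return the integer value, or None if text is not a strict literal."""
    if STRICT_BINARY.fullmatch(text) is None:
        return None
    # Only the binary digits carry meaning; the underscores are dropped
    # once they have served their purpose of enforcing the syntax.
    digits = text[2:].replace("_", "")
    return int(digits, 2)


if __name__ == "__main__":
    for sample in ["0b1010_0101", "0b1__", "0b_1_", "0b__1"]:
        print(sample, parse_strict_binary(sample))
```

Under this reading, all three of the reduced cases quoted above (0b1__,
0b_1_, 0b__1) are rejected, while 0b1010_0101 is accepted and evaluates
to 165.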
But if you want to accept "strange" literals like 0b__1__, you could do
something like:

    <binaryLiteral> ::= "0b" <underscoreBinaryDigits> <binaryDigit> <underscoreBinaryDigits>

    <underscoreBinaryDigits> ::= "_"
                               | "_" <underscoreBinaryDigits>
                               | <binaryDigit>
                               | <binaryDigit> <underscoreBinaryDigits>
                               | ""

    <binaryDigit> ::= "0"
                    | "1"

The odd form of the rule for <binaryLiteral> is to ensure that there's
at least one binary digit in the string, whereas
<underscoreBinaryDigits> is just a wildcard anything-goes rule that
takes any combination of 0, 1, and _, including the empty string.
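
The permissive grammar can be sketched the same way (again in Python, and
again with hypothetical names): any mix of 0, 1, and _ on either side of
one mandatory digit. This is only an illustration of the rule as written
here, not a description of what dmd actually accepts.

```python
import re

# Hypothetical pattern mirroring the permissive grammar:
# anything-goes run, one required binary digit, anything-goes run.
LENIENT_BINARY = re.compile(r"0b[01_]*[01][01_]*")


def is_lenient_binary(text):
    """True if text is "0b" plus digits/underscores with at least one digit."""
    return LENIENT_BINARY.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["0b__1__", "0b_1_", "0b1__", "0b___", "0b"]:
        print(sample, is_lenient_binary(sample))
```

The required digit in the middle is what rejects 0b___ and a bare 0b,
while 0b__1__ and the other "strange" forms pass.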
T
--
That's not a bug; that's a feature!