std.d.lexer requirements
Jonathan M Davis
jmdavisProg at gmx.com
Thu Aug 2 19:35:07 PDT 2012
On Thursday, August 02, 2012 19:52:35 Jonathan M Davis wrote:
> I suppose that we could make it operate on code units and just let ranges of
> dchar have UTF-32 as their code unit (since dchar is both a code unit and a
> code point), then ranges of dchar will still work but ranges of char and
> wchar will _also_ work. Hmmm. As I said, I'll have to think this through a
> bit.
LOL. It looks like taking this approach results in almost identical code to
what I've been doing. The main difference is that if you're dealing with a
range other than a string, you need to use decode instead of front, which
means that decode is going to need to work with more than just strings
(probably stride too). I'll have to create a pull request for that.
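To illustrate the front-versus-decode difference, here is a minimal sketch. It assumes a current Phobos, where std.utf provides byCodeUnit and decodeFront for ranges of code units; neither is named in this post, and the helper codePoints is made up for the example.

```d
import std.range.primitives : empty, front, popFront;
import std.utf : byCodeUnit, decodeFront;

// Hypothetical helper: decode every code point from a range of code units.
dchar[] codePoints(R)(R units)
{
    dchar[] result;
    while (!units.empty)
        result ~= decodeFront(units); // decodes one code point and pops its code units
    return result;
}

void main()
{
    string s = "a\u00E9";          // 'a' followed by U+00E9
    assert(s.front == 'a');        // on a string, front yields a decoded dchar

    auto units = s.byCodeUnit;     // view the same data as a range of char
    assert(units.front == 'a');    // here front is a single code unit
    assert(units.length == 3);     // U+00E9 takes two UTF-8 code units
    assert(codePoints(units) == "a\u00E9"d);
}
```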
But unless you restrict it to strings and ranges of code units which are
random access, you still have to worry about stuff like using range[0] vs
range.front depending on the type, so my mixin approach is still applicable,
and it makes it very easy to switch what I'm doing, since there are very few
lines that need to be tweaked.
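A rough sketch of how a string mixin can hide the range[0] vs range.front choice follows. The names firstUnit and startsWithDigit are hypothetical and only meant to show the shape of the idea, not the actual lexer code.

```d
import std.range.primitives : empty, front, popFront;
import std.traits : isSomeString;
import std.utf : byCodeUnit;

// Hypothetical helper: source text of the expression that reads the first
// code unit. Strings are indexed directly, since front would decode them.
enum firstUnit(R) = isSomeString!R ? "range[0]" : "range.front";

// The logic is written once; the mixin picks the right access per type.
bool startsWithDigit(R)(R range)
{
    if (range.empty)
        return false;
    immutable c = mixin(firstUnit!R);
    return c >= '0' && c <= '9';
}

void main()
{
    assert(startsWithDigit("42"));            // string: uses range[0]
    assert(startsWithDigit("7up"w));          // wstring: uses range[0]
    assert(startsWithDigit("42".byCodeUnit)); // non-string range: uses range.front
    assert(!startsWithDigit("x1"));
    assert(!startsWithDigit(""));
}
```

Only the helper needs to change if the access strategy changes, which is the sense in which few lines have to be tweaked.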
So, I guess that I'll be taking ranges of char, wchar, and dchar and
treating them all as ranges of code units. So, it'll work with
everything that it worked with before but will now also work with ranges of
char and wchar. There's still a performance hit if you do something like
passing it filter!"true"(source), but there's no way to fix that without
disallowing dchar ranges entirely, which would be unnecessarily restrictive.
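The reason for that performance hit can be checked directly: wrapping a string in filter gives a range whose elements are already decoded dchars, so the decoding cost is paid before the lexer ever sees the data. A small sketch, assuming the variable name source from the example above:

```d
import std.algorithm : filter;
import std.range.primitives : ElementType;

void main()
{
    string source = "int x;";
    auto filtered = filter!"true"(source);

    // As a range, a string has dchar elements (front decodes), and filter
    // passes those decoded elements through.
    static assert(is(ElementType!string == dchar));
    static assert(is(ElementType!(typeof(filtered)) == dchar));
}
```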
Once you allow arbitrary ranges of char rather than requiring strings, the
extra code required to allow ranges of wchar and dchar is trivial. It's stuff
like worrying about range[0] vs range.front which complicates things (even if
front is a code unit rather than a code point), and using string mixins makes
it so that the code with the logic is just as simple as it would be with
strings. So, I think that I can continue almost exactly as I have been and
still achieve what Walter wants. The main issue that I have (beyond finishing
what I haven't gotten to yet) is changing how I handle errors and comments,
since I currently have them as tokens, but that shouldn't be terribly hard to
fix.
- Jonathan M Davis