Request for comments: std.d.lexer

deadalnix deadalnix at gmail.com
Fri Feb 8 23:19:33 PST 2013


On Saturday, 9 February 2013 at 07:15:57 UTC, deadalnix wrote:
> On Friday, 8 February 2013 at 20:54:32 UTC, Jonathan M Davis 
> wrote:
>> On Friday, February 08, 2013 23:50:13 Dmitry Olshansky wrote:
>>> Complication - yes, slight performance cost is what I doubt 
>>> it in RA
>>> case. Seems like a compiler/optimizer issue.
>>
>> The main problem isn't RA ranges. It's quite probable that the 
>> optimizer takes
>> care of any minor additional overhead that's incurred there. 
>> The issue is
>> really pure forward ranges, because you're forced to count the 
>> code units as
>> they're processed (or even save the range as you go along, 
>> given how expensive
>> it would be to go find the index even if you have it). But we 
>> really have no
>> way of knowing how bad it is without hard data. It would 
>> certainly be trivial
>> to end up doing other stuff in the implementation which cost 
>> far more.
>>
>
> For pure InputRanges, that is pretty bad as the decoding of UTF 
> chars basically has to be done 2 times, and each codepoint 
> popped individually, or the lexer has to embed its own 
> homebrew version of std.utf.

Wow, that is complete bullshit xD. I completely messed up what I 
wanted to say.

So, std.utf.decodeFront may or may not pop the UTF character. And 
in case it does not, you end up having to do extra hanky panky 
(and duplicate logic from std.utf to know whether it popped or 
not, which is likely to be very bug prone).
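
To make the concern concrete, here is a minimal sketch of one way to 
avoid depending on whether decodeFront pops. It uses std.utf.decode 
with an explicit index and then slices by hand. The helper name 
decodeAndAdvance is hypothetical (it is not part of Phobos), and the 
sketch only covers strings. It does not solve the pure input range 
case discussed above, where there is nothing to index or slice.

```d
import std.utf : decode;

// Hypothetical helper, not part of Phobos: decode one code point from the
// front of a string and always advance past it.
dchar decodeAndAdvance(ref string s)
{
    size_t index = 0;
    // decode reads the code point at `index` and advances `index`
    // by the number of code units consumed; it leaves `s` untouched.
    immutable dchar c = decode(s, index);
    // Pop the consumed code units explicitly.
    s = s[index .. $];
    return c;
}

void main()
{
    string src = "é!"; // 'é' takes two UTF-8 code units, '!' takes one
    assert(decodeAndAdvance(src) == 'é');
    assert(src == "!");
    assert(decodeAndAdvance(src) == '!');
    assert(src.length == 0);
}
```

The index returned by decode also gives the code unit count for free, 
which is what a lexer would need for tracking token positions.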

